Support missing for model, language, and prompt fields in input_audio_transcription

Hi team,

While working with the Azure OpenAI GPT-4o real-time transcription API, I noticed that the SDK does not yet support the `model`, `language`, and `prompt` fields in the `input_audio_transcription` configuration, despite these being documented here:
📄 https://learn.microsoft.com/en-us/azure/ai-services/openai/realtime-audio-reference

These fields are critical for:
- Selecting newer models like `gpt-4o-transcribe` and `gpt-4o-mini-transcribe`
- Providing a prompt to guide transcription behavior
- Improving accuracy and latency by specifying language (e.g. `"en"`)

📌 I’ve opened a PR to address this gap:  
🔗 https://github.com/Azure-Samples/aoai-realtime-audio-sdk/pull/134

I've tested the change in my application, and it works with the following config:

```ts
input_audio_transcription: {
  model: "gpt-4o-transcribe",
  language: "en",
  prompt: "Expect words related to a product design interview.",
}
```
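For context, here is a minimal sketch of how that config might sit inside a `session.update` message sent over the realtime connection. The type names `InputAudioTranscription` and `SessionUpdateMessage` and the helper `buildSessionUpdate` are hypothetical and only for illustration; the SDK's real type names and the exact set of accepted fields may differ, so please check against the linked reference.

```typescript
// Hypothetical shapes for illustration only; not the SDK's actual type names.
interface InputAudioTranscription {
  model: string;      // e.g. "gpt-4o-transcribe" or "gpt-4o-mini-transcribe"
  language?: string;  // language hint such as "en"
  prompt?: string;    // free-text hint to guide transcription
}

interface SessionUpdateMessage {
  type: "session.update";
  session: {
    input_audio_transcription?: InputAudioTranscription;
  };
}

// Hypothetical helper: wraps the transcription settings in a session.update message.
function buildSessionUpdate(
  transcription: InputAudioTranscription,
): SessionUpdateMessage {
  return {
    type: "session.update",
    session: { input_audio_transcription: transcription },
  };
}

const message = buildSessionUpdate({
  model: "gpt-4o-transcribe",
  language: "en",
  prompt: "Expect words related to a product design interview.",
});

// Serialized form that would be sent over the WebSocket.
const payload: string = JSON.stringify(message);
```

Keeping `language` and `prompt` optional in the type means existing callers that only set `model` would continue to compile.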

Would love for the maintainers to review and merge this so it’s available to others using the SDK.

Thanks!

Support missing for model, language, and prompt fields in input_audio_transcription #136
