Support missing for model, language, and prompt fields in input_audio_transcription #136

@HVbajoria

Hi team,

While working with the Azure OpenAI GPT-4o real-time transcription API, I noticed that the SDK does not yet support the model, language, and prompt fields in the input_audio_transcription configuration, despite these being documented here:
📄 https://learn.microsoft.com/en-us/azure/ai-services/openai/realtime-audio-reference

These fields are critical for:

  • Selecting newer models like gpt-4o-transcribe and gpt-4o-mini-transcribe
  • Providing a prompt to guide transcription behavior
  • Improving accuracy and latency by specifying language (e.g. "en")

📌 I’ve opened a PR to address this gap:
🔗 #134

I've tested the change in my application, and it works with the following config:

input_audio_transcription: {
  model: "gpt-4o-transcribe",
  language: "en",
  prompt: "Expect words related to a product design interview.",
}
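For anyone who wants to see how this config might travel over the wire, here is a minimal TypeScript sketch. It assumes the config is sent inside a `session.update` event, as described in the realtime reference linked above. The `InputAudioTranscription` and `SessionUpdateEvent` interfaces and the `buildSessionUpdate` helper are hypothetical names made up for illustration; they are not the SDK's actual types or the code in the PR.

```typescript
// Hypothetical types for illustration only; the SDK's real definitions may differ.
interface InputAudioTranscription {
  model: string;      // e.g. "gpt-4o-transcribe" or "gpt-4o-mini-transcribe"
  language?: string;  // optional language code, e.g. "en"
  prompt?: string;    // optional text to guide transcription
}

interface SessionUpdateEvent {
  type: "session.update";
  session: {
    input_audio_transcription: InputAudioTranscription;
  };
}

// Hypothetical helper: wraps the transcription config in a session.update event.
function buildSessionUpdate(
  transcription: InputAudioTranscription
): SessionUpdateEvent {
  return {
    type: "session.update",
    session: { input_audio_transcription: { ...transcription } },
  };
}

const sessionUpdate = buildSessionUpdate({
  model: "gpt-4o-transcribe",
  language: "en",
  prompt: "Expect words related to a product design interview.",
});

// Serialized form, ready to send over an already-open realtime connection.
const payload: string = JSON.stringify(sessionUpdate);
```

Keeping `language` and `prompt` optional means existing callers that only pass `model` would not need to change.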

Would love for the maintainers to review and merge this so it’s available to others using the SDK.

Thanks!
