[Feature]: Add support for new languages such as Hebrew / Arabic #626

@Amit-Nadam

Description

Summary

Deepgram currently offers strong multilingual support through models like Nova-2 and Nova-3, but Hebrew and Arabic are not yet supported in the main models (only partially through Whisper batch mode). Adding these languages would significantly expand accessibility for users and developers in the Middle East region who rely on accurate real-time speech recognition.

Problem to solve

Many organizations, research teams, and developers working in the Middle East need high-quality ASR (Automatic Speech Recognition) for Hebrew and Arabic, including regional dialects.
Current alternatives (e.g., Whisper, Google Cloud STT) are not optimized for low-latency, high-accuracy streaming in these languages, which limits their use in academic and linguistic research. Without native model support, Deepgram users must rely on slower, less accurate batch transcription via Whisper and miss out on Deepgram’s strengths in streaming and diarization.

Proposed solution

1. Add Hebrew (language=he) and Arabic (language=ar) support to Deepgram’s main Nova-3 (and future) speech-to-text models.
2. Provide both batch and real-time streaming capabilities, ensuring compatibility with existing Deepgram API parameters.
3. Update the Models & Languages Overview documentation to reflect new language availability and usage examples.

This approach will make Deepgram’s ASR accessible and practical for a broader developer and enterprise audience in the Middle East, especially for real-time transcription and voice analytics.
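
To make the request concrete, here is a rough sketch of how the proposed language codes might be passed as query parameters. It only builds the request URL and makes no network call. The `build_listen_url` helper and the `PROPOSED_LANGUAGES` set are hypothetical names introduced for illustration; the `/v1/listen` endpoint with `model` and `language` parameters is assumed from the existing API, and `he` / `ar` are the codes proposed above, which Nova-3 does not accept today.

```python
from urllib.parse import urlencode

# Assumption: the existing /v1/listen endpoint (HTTPS for batch, WSS for streaming).
BATCH_URL = "https://api.deepgram.com/v1/listen"
STREAMING_URL = "wss://api.deepgram.com/v1/listen"

# Hypothetical: the language codes proposed in this issue, not yet supported by Nova-3.
PROPOSED_LANGUAGES = {"he", "ar"}


def build_listen_url(language, model="nova-3", streaming=False, **params):
    """Build a request URL for the proposed Hebrew/Arabic support.

    Extra keyword arguments (e.g. diarize="true") are appended as-is,
    mirroring how existing API parameters are passed today.
    """
    if language not in PROPOSED_LANGUAGES:
        raise ValueError(f"not one of the proposed languages: {language!r}")
    base = STREAMING_URL if streaming else BATCH_URL
    query = {"model": model, "language": language, **params}
    return f"{base}?{urlencode(query)}"


if __name__ == "__main__":
    # Batch request for Hebrew with diarization enabled.
    print(build_listen_url("he", diarize="true"))
    # Streaming request for Arabic.
    print(build_listen_url("ar", streaming=True))
```

If the feature were implemented this way, switching an existing integration to Hebrew or Arabic would only require changing the `language` value, with all other parameters left untouched.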

Alternatives considered

No response

Scope

Python only

Priority

Important

Extra context / links

To support the addition of Hebrew and Arabic to Deepgram’s models, here are some open-source and research projects that already provide high-quality data and models in these languages:

  1. Ivrit.ai
  2. ArTST

These open datasets and models show strong community and research interest in Hebrew and Arabic speech recognition. Integrating these languages into Deepgram’s Nova model family would leverage existing resources, accelerate adoption across new regions, and align with Deepgram’s global accessibility goals.

Session ID (optional)

No response

Project ID (optional)

No response

Request ID (optional)

No response

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
