Stable-v1.0.0

@khalooei khalooei released this 26 Jul 20:46
· 2 commits to main since this release
936ac1e

📒 Release Notes – Voxtral AI Demo Interface v1.0.0 (Stable)

Repository: Voxtral-AI-Demo-Local-Interface

✨ New Features

  • Initial stable release of the Voxtral AI Demo Interface.
  • Gradio-based UI for local, interactive inference using Voxtral models.
  • Dual Model Support: Compatible with both the small and large Voxtral variants.
  • Audio Upload & Playback: Supports file input and inline audio playback.
  • Multilingual Transcription: Runs high-quality speech-to-text across multiple languages.
  • Integrated Understanding: Enables semantic analysis and long-context audio understanding.

⚙️ Improvements

  • Optimized GPU handling for faster transcription.
  • Responsive UI with metadata inputs and quality control options.
  • Refined UX for seamless testing of audio samples.

🧪 Compatibility

  • ✅ Tested on CUDA-enabled environments with Python 3.10+
  • ✅ Works with both small and large Voxtral variants
  • ✅ Cross-platform (Windows, Linux)
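
The compatibility claims above can be turned into a quick pre-flight check. The following is a minimal sketch and not part of the repository: `check_environment`, `MIN_PYTHON`, and `TESTED_PLATFORMS` are hypothetical names. It only checks the Python version and operating system. It does not verify CUDA availability, which would depend on the GPU library the project actually uses.

```python
import platform
import sys

# Assumptions taken from the compatibility list above (hypothetical constants).
MIN_PYTHON = (3, 10)
TESTED_PLATFORMS = {"Windows", "Linux"}


def check_environment(version_info=None, system=None):
    """Return a list of warnings; an empty list means no issue was found.

    Both arguments are optional overrides so the check can be exercised
    without changing the running interpreter or OS.
    """
    version = tuple(version_info or sys.version_info)[:2]
    system = system or platform.system()

    warnings = []
    if version < MIN_PYTHON:
        warnings.append(
            f"Python {version[0]}.{version[1]} is older than the tested "
            f"minimum {MIN_PYTHON[0]}.{MIN_PYTHON[1]}"
        )
    if system not in TESTED_PLATFORMS:
        warnings.append(f"{system} is not among the tested platforms")
    return warnings


if __name__ == "__main__":
    for message in check_environment():
        print("warning:", message)
```

Running the script before `python app.py` prints one line per mismatch and nothing when the environment matches the tested configuration.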

Installation & Usage
Clone and run locally with minimal setup:

git clone https://github.com/khalooei/Voxtral-AI-Demo-Local-Interface.git
cd Voxtral-AI-Demo-Local-Interface
pip install -r requirements.txt
python app.py

📌 Notes

  • A CUDA-enabled GPU is recommended for best performance.
  • For best results, use high-quality audio inputs (16 kHz sample rate recommended).
  • Future updates will include streaming input and speaker diarization support.
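
Since 16 kHz audio is recommended, it may help to check a file's sample rate before uploading it. The following is a minimal sketch using Python's standard `wave` module, which reads uncompressed WAV files only. The helper names and `RECOMMENDED_RATE_HZ` are hypothetical and not part of the repository.

```python
import wave

# Assumption: the 16 kHz recommendation from the notes above.
RECOMMENDED_RATE_HZ = 16_000


def wav_sample_rate(path):
    """Return the sample rate (in Hz) stored in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_recommended_rate(path, expected_hz=RECOMMENDED_RATE_HZ):
    """Return True if the WAV file uses the expected sample rate."""
    return wav_sample_rate(path) == expected_hz
```

For example, `is_recommended_rate("sample.wav")` (a placeholder path) returns `False` for a 44.1 kHz recording, which suggests resampling it before transcription. Other formats such as MP3 or FLAC would need a different reader.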

🔗 Stay Connected
For issues, feedback, or contributions, visit the GitHub Issues page.