Release Notes: Voxtral AI Demo Interface v1.0.0 (Stable)
Repository: Voxtral-AI-Demo-Local-Interface
New Features
- Initial stable release of the Voxtral AI Demo Interface.
- Gradio-based UI for local, interactive inference using Voxtral models.
- Dual Model Support: Compatible with both the small and large Voxtral model variants.
- Audio Upload & Playback: Supports file input and inline audio playback.
- Multilingual Transcription: Run high-quality speech-to-text across multiple languages.
- Integrated Understanding: Enables semantic analysis and long-context audio understanding.
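
As a rough illustration of the Gradio-based audio upload feature listed above, the following minimal sketch wires an audio input to a text output. It is not the project's actual app.py: `transcribe` and `build_demo` are hypothetical names, and the placeholder function does not call a Voxtral model. Gradio is imported inside `build_demo` so the file can be loaded even where Gradio is not installed.

```python
def transcribe(audio_path):
    """Hypothetical placeholder: a real app would run a Voxtral model here."""
    if audio_path is None:
        return "No audio provided."
    return f"[placeholder transcript for {audio_path}]"


def build_demo():
    """Build a Gradio interface with an audio input and a text output."""
    import gradio as gr  # imported lazily; requires `pip install gradio`

    return gr.Interface(
        fn=transcribe,  # called with the uploaded file's path
        inputs=gr.Audio(type="filepath", label="Audio file"),
        outputs=gr.Textbox(label="Transcript"),
        title="Voxtral demo (sketch)",
    )
```

Calling `build_demo().launch()` would start a local web server for the interface; swapping the body of `transcribe` for real model inference is where an actual application would differ.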
Improvements
- Optimized GPU handling for faster transcription.
- Responsive UI with metadata inputs and quality control options.
- Refined UX for seamless testing of audio samples.
Compatibility
- Tested on CUDA-enabled environments with Python 3.10+
- Works with both small and large Voxtral variants
- Cross-platform (Windows, Linux)
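
The compatibility requirements above can be checked before launching the app. The sketch below is one possible way to do so; it assumes the project runs on PyTorch (the notes do not say which framework is used), and the helper names `python_ok` and `cuda_available` are hypothetical.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version named in the compatibility list


def python_ok(version=None):
    """Return True if the given (or current) Python version is 3.10 or newer."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def cuda_available():
    """Return True/False as reported by PyTorch, or None if PyTorch is missing.

    Assumption: the app uses PyTorch for GPU inference.
    """
    try:
        import torch
    except ImportError:
        return None
    return torch.cuda.is_available()


if __name__ == "__main__":
    print("Python 3.10+:", python_ok())
    print("CUDA available:", cuda_available())
```

A result of `None` from `cuda_available()` means PyTorch could not be imported, so GPU support could not be determined.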
Installation & Usage
Clone and run locally with minimal setup:
git clone https://github.com/khalooei/Voxtral-AI-Demo-Local-Interface.git
cd Voxtral-AI-Demo-Local-Interface
pip install -r requirements.txt
python app.py
Notes
- Requires a compatible GPU for optimal performance.
- For best results, use high-quality audio inputs (16 kHz sample rate recommended).
- Future updates will include streaming input and speaker diarization support.
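
To see whether a file already matches the recommended 16 kHz sample rate, a small check using only the Python standard library might look like the sketch below. It handles uncompressed WAV files only, and the function names are hypothetical; other formats would need an audio library.

```python
import wave

RECOMMENDED_RATE_HZ = 16_000  # sample rate recommended in the notes


def wav_sample_rate(path):
    """Return the sample rate (in Hz) stored in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_recommended_rate(path):
    """Return True if the WAV file is sampled at the recommended 16 kHz."""
    return wav_sample_rate(path) == RECOMMENDED_RATE_HZ
```

Files at a different rate may still work, but resampling them to 16 kHz beforehand keeps the input in line with the recommendation.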
Stay Connected
For issues, feedback, or contributions, visit the GitHub Issues page.