A cross-model Voice Activity Detection (VAD) tool for real-time and file-based analysis, built with PyQt6. It runs both the WebRTC VAD and Silero VAD models and is designed for visualizing, evaluating, and comparing their output.
- 🎙️ Live microphone-based VAD
- 📂 File-based VAD analysis
- 📊 Comparison mode: WebRTC vs Silero metrics side by side
- 📈 Generates:
  - Spectrograms
  - Waveforms
  - Confusion matrices
  - Metric charts (Accuracy, Precision, Recall, F1-score, etc.)
- 🧠 Silero model for neural-network-based, frame-by-frame speech classification
- 🌐 WebRTC model integrated for fast binary VAD
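
The metrics listed above all derive from a frame-level confusion matrix. The following is a minimal sketch of that calculation, assuming each model's output and the ground truth are available as equal-length sequences of 0/1 frame labels. The names `VadMetrics` and `score_frames` are hypothetical and are not taken from `evaluation.py`.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class VadMetrics:
    """Frame-level confusion-matrix counts (hypothetical helper)."""
    tp: int  # speech frames predicted as speech
    fp: int  # non-speech frames predicted as speech
    tn: int  # non-speech frames predicted as non-speech
    fn: int  # speech frames predicted as non-speech

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total if total else 0.0

    @property
    def precision(self) -> float:
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    @property
    def recall(self) -> float:
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    @property
    def f1(self) -> float:
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if (p + r) else 0.0


def score_frames(reference: Sequence[int], predicted: Sequence[int]) -> VadMetrics:
    """Compare 0/1 frame labels and count the four confusion-matrix cells."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref and pred:
            tp += 1
        elif not ref and pred:
            fp += 1
        elif ref and not pred:
            fn += 1
        else:
            tn += 1
    return VadMetrics(tp=tp, fp=fp, tn=tn, fn=fn)


# Example: six frames, one missed speech frame and one false alarm.
metrics = score_frames([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
print(metrics, metrics.accuracy, metrics.f1)
```

Running the same function on the WebRTC and Silero predictions against one reference labeling gives directly comparable numbers for a side-by-side view.
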
- Python 3.9+
- PyQt6
- matplotlib, numpy
- simpleaudio
- wave (Python standard library)
- WebRTC VAD wrapper
- Silero VAD
- Clone the repo
    git clone https://github.com/edyamza/Voice-Activity-Detection-WebRTC-Silero.git
    cd Voice-Activity-Detection-WebRTC-Silero
- Install dependencies
    pip install -r requirements.txt
- Run the app
    python vad_guide.py
├── vad_guide.py # Main GUI application
├── vad_rec.py # WebRTC file-based VAD
├── vad_rec_silero.py # Silero VAD interface
├── vad_live.py # Live audio processing
├── evaluation.py # Metrics & graph generation
├── output/ # Saved plots and images
└── requirements.txt
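
File-based WebRTC VAD generally works on short, fixed-length frames of 16-bit mono PCM (10, 20, or 30 ms at 8, 16, 32, or 48 kHz). As a rough illustration of that preprocessing step, here is a sketch using only the standard-library `wave` module. The function name `read_frames` is hypothetical, and this is not necessarily how `vad_rec.py` reads its input.

```python
import io
import wave
from typing import Iterator


def read_frames(wav_file, frame_ms: int = 30) -> Iterator[bytes]:
    """Yield fixed-length PCM frames from a 16-bit mono WAV file.

    `wav_file` may be a path or a binary file-like object.
    """
    if frame_ms not in (10, 20, 30):
        raise ValueError("frame_ms must be 10, 20, or 30")
    with wave.open(wav_file, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        rate = wav.getframerate()
        if rate not in (8000, 16000, 32000, 48000):
            raise ValueError(f"unsupported sample rate: {rate}")
        samples_per_frame = rate * frame_ms // 1000
        bytes_per_frame = samples_per_frame * 2  # 2 bytes per 16-bit sample
        while True:
            chunk = wav.readframes(samples_per_frame)
            if len(chunk) < bytes_per_frame:
                break  # drop the trailing partial frame
            yield chunk


# Demo: one second of silence at 16 kHz, built in memory.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(16000)
    out.writeframes(b"\x00\x00" * 16000)
buffer.seek(0)

frames = list(read_frames(buffer))
print(len(frames), "frames of", len(frames[0]), "bytes")
```

Each yielded frame can then be handed to a VAD model for a per-frame speech/non-speech decision; the trailing partial frame is dropped because frame-based detectors expect an exact frame length.
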
This project is licensed under the MIT License.
Eduard Amza — GitHub
Feel free to ⭐ the project or contribute!