🗣️ Voice Activity Detection with WebRTC & Silero

A cross-model Voice Activity Detection (VAD) tool, built with PyQt6, that runs real-time and file-based analysis with both the WebRTC VAD and Silero VAD models. It is designed for visualizing, evaluating, and comparing the two.


🚀 Features

  • 🎙️ Live microphone-based VAD
  • 📂 File-based VAD analysis
  • 📊 Comparison mode: WebRTC vs Silero metrics side by side
  • 📈 Generates:
    • Spectrograms
    • Waveforms
    • Confusion matrices
    • Metric charts (Accuracy, Precision, Recall, F1-score, etc.)
  • 🧠 Silero neural-network model for frame-by-frame speech classification
  • 🌐 WebRTC model integrated for fast binary VAD
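
The metrics listed above (Accuracy, Precision, Recall, F1-score) all follow from a frame-level confusion matrix. The block below is a minimal sketch of that calculation, assuming each model's output and the reference annotation are equal-length sequences of 0/1 frame labels. The function names `confusion_counts` and `frame_metrics` are hypothetical and are not taken from `evaluation.py`.

```python
def confusion_counts(reference, predicted):
    """Count TP, FP, TN, FN for two equal-length sequences of 0/1 frame labels."""
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if pred and ref:
            tp += 1  # speech correctly detected
        elif pred and not ref:
            fp += 1  # non-speech flagged as speech
        elif not pred and ref:
            fn += 1  # speech missed
        else:
            tn += 1  # non-speech correctly rejected
    return tp, fp, tn, fn


def frame_metrics(reference, predicted):
    """Return accuracy, precision, recall and F1 as a dict (0.0 when undefined)."""
    tp, fp, tn, fn = confusion_counts(reference, predicted)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up labels, purely for illustration.
    reference = [1, 1, 0, 0, 1, 0]
    predicted = [1, 0, 0, 1, 1, 0]
    print(frame_metrics(reference, predicted))
```

Running the same function once per model on the same reference labels gives the side-by-side numbers that a comparison chart needs.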

🧰 Tech Stack

  • Python 3.9+
  • PyQt6
  • matplotlib, numpy
  • simpleaudio
  • wave
  • WebRTC VAD wrapper
  • Silero VAD
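
As a rough illustration of how `wave` and a WebRTC VAD wrapper fit together: WebRTC VAD classifies short fixed-length frames (10, 20, or 30 ms) of 16-bit mono PCM. The sketch below splits a WAV file into such frames using only the standard library and passes each one to a classifier callable. The names `read_frames` and `label_frames` are hypothetical, and the assumption that the wrapper is the `webrtcvad` package (whose `Vad(mode).is_speech(frame, sample_rate)` method could serve as the callable) is not confirmed by this repository.

```python
import wave


def read_frames(wav_file, frame_ms=30):
    """Yield (sample_rate, frame_bytes) pairs from a 16-bit mono WAV file.

    `wav_file` may be a path or a binary file-like object.
    A trailing partial frame is dropped.
    """
    with wave.open(wav_file, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        sample_rate = wf.getframerate()
        samples_per_frame = sample_rate * frame_ms // 1000
        bytes_per_frame = samples_per_frame * 2  # 2 bytes per 16-bit sample
        while True:
            chunk = wf.readframes(samples_per_frame)
            if len(chunk) < bytes_per_frame:
                break
            yield sample_rate, chunk


def label_frames(frames, is_speech):
    """Apply `is_speech(frame_bytes, sample_rate)` to each frame; return 0/1 labels."""
    return [int(bool(is_speech(frame, rate))) for rate, frame in frames]


# Possible usage, assuming the `webrtcvad` package is installed and
# "speech.wav" (a placeholder name) is 16 kHz, 16-bit, mono:
#
#   import webrtcvad
#   vad = webrtcvad.Vad(2)  # aggressiveness from 0 to 3
#   labels = label_frames(read_frames("speech.wav"), vad.is_speech)
```

Keeping the classifier as a plain callable means the same framing code can be exercised without any VAD library installed.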

💻 Installation

  1. Clone the repo
     git clone https://github.com/edyamza/Voice-Activity-Detection-WebRTC-Silero.git
     cd Voice-Activity-Detection-WebRTC-Silero
  2. Install dependencies
     pip install -r requirements.txt
  3. Run the app
     python vad_guide.py

🖼️ GUI Preview

🎛️ Main Interface

📊 Confusion Matrices – WebRTC vs Silero

📈 Metric Comparison Bar Chart

🔊 Spectrogram + Waveform View


📁 Project Structure

├── vad_guide.py            # Main GUI application
├── vad_rec.py              # WebRTC file-based VAD
├── vad_rec_silero.py       # Silero VAD interface
├── vad_live.py             # Live audio processing
├── evaluation.py           # Metrics & graph generation
├── output/                 # Saved plots and images
└── requirements.txt
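
Comparing the two models frame by frame requires both outputs on the same time grid. The sketch below assumes the Silero side produces speech segments as a list of dicts with `start` and `end` positions in samples (the form returned by Silero's `get_speech_timestamps` utility by default) and converts them to per-frame 0/1 labels. The function name `segments_to_frame_labels` and the midpoint rule are illustrative choices, not taken from `vad_rec_silero.py`.

```python
def segments_to_frame_labels(segments, total_samples, samples_per_frame):
    """Convert sample-indexed speech segments to per-frame 0/1 labels.

    A frame is labelled speech (1) when its midpoint sample falls inside
    any segment; `end` is treated as exclusive. Trailing samples that do
    not fill a whole frame are ignored.
    """
    n_frames = total_samples // samples_per_frame
    labels = [0] * n_frames
    for i in range(n_frames):
        midpoint = i * samples_per_frame + samples_per_frame // 2
        for seg in segments:
            if seg["start"] <= midpoint < seg["end"]:
                labels[i] = 1
                break
    return labels


if __name__ == "__main__":
    # Made-up segment: speech from sample 480 up to sample 1440,
    # in a 2400-sample clip cut into 480-sample frames (30 ms at 16 kHz).
    segments = [{"start": 480, "end": 1440}]
    print(segments_to_frame_labels(segments, 2400, 480))
```

Other rules, such as requiring a minimum overlap fraction per frame, are equally plausible. The midpoint rule is used here only because it is simple to reason about.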

📄 License

This project is licensed under the MIT License.


✨ Author

Eduard Amza (GitHub)


🧠 Inspired by


Feel free to ⭐ the project or contribute!

About

This is a Python project that compares the metrics of two pre-trained VAD models: WebRTC and Silero.
