Add C API for ten-vad #2379

csukuangfj · 2025-07-12T03:57:28Z

Summary by CodeRabbit

New Features
- Added support for the ten-vad voice activity detection model alongside existing silero-vad support in C API examples and configuration.
- Users can now select between ten-vad and silero-vad models for voice activity detection in example applications.
Bug Fixes
- Improved error handling for missing model files in example applications.
Documentation
- Updated comments and usage instructions to reflect ten-vad support and model download options.
Chores
- Enhanced automated tests to cover ten-vad integration with various speech recognition models.

coderabbitai · 2025-07-12T03:57:34Z

Walkthrough

Support for the "ten-vad" voice activity detection model was added across the C API, C++ API, and example programs. Workflow tests were updated to include "ten-vad" scenarios. Configuration structs and logic now handle both "silero-vad" and "ten-vad" models, with conditional runtime selection and parameter initialization based on available model files.

Changes

File(s)	Change Summary
.github/workflows/c-api.yaml	Added workflow jobs for "ten-vad" with Whisper, Moonshine, and sense-voice; renamed existing "silero-vad" jobs.
c-api-examples/vad-moonshine-c-api.c, c-api-examples/vad-sense-voice-c-api.c, c-api-examples/vad-whisper-c-api.c	Example programs now support both "silero-vad" and "ten-vad" models, with dynamic selection and config updates.
sherpa-onnx/c-api/c-api.h	Added `SherpaOnnxTenVadModelConfig` struct, updated main config struct, and declared `SherpaOnnxFileExists`.
sherpa-onnx/c-api/c-api.cc	Initialized `vad_config.ten_vad` fields in `GetVadModelConfig`.
sherpa-onnx/c-api/cxx-api.h	Added `TenVadModelConfig` struct, updated `VadModelConfig`, and declared `FileExists` function.
sherpa-onnx/c-api/cxx-api.cc	Supported `ten_vad` config in `VoiceActivityDetector::Create`; implemented `FileExists` utility.
sherpa-onnx/csrc/ten-vad-model.cc	Changed constant from `1e-10` to `1e-10f` for float precision in `LogMel`.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant ExampleApp
    participant FileSystem
    participant SherpaOnnxAPI

    User->>ExampleApp: Run with input WAV and VAD models
    ExampleApp->>FileSystem: Check for silero_vad.onnx
    alt silero-vad exists
        ExampleApp->>SherpaOnnxAPI: Initialize with silero-vad config
    else ten-vad.onnx exists
        ExampleApp->>SherpaOnnxAPI: Initialize with ten-vad config
    else
        ExampleApp->>User: Print error and exit
    end
    ExampleApp->>SherpaOnnxAPI: Process audio using selected VAD
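The selection order in the diagram can be sketched in C as follows. This is only an illustration: `file_exists`, `VadKind`, and `select_vad` are hypothetical names made up for this sketch, and the existence check uses plain `fopen` instead of the `SherpaOnnxFileExists` function this PR declares.

```c
#include <stdio.h>

/* Hypothetical stand-in for a file-existence check (not the PR's function). */
static int file_exists(const char *path) {
  FILE *fp = fopen(path, "rb");
  if (fp == NULL) {
    return 0;
  }
  fclose(fp);
  return 1;
}

/* Hypothetical enum naming the three outcomes shown in the diagram. */
typedef enum { VAD_NONE = 0, VAD_SILERO = 1, VAD_TEN = 2 } VadKind;

/* silero-vad is checked first; ten-vad is the fallback.
   VAD_NONE means the caller should print an error and exit. */
static VadKind select_vad(const char *silero_path, const char *ten_path) {
  if (file_exists(silero_path)) {
    return VAD_SILERO;
  }
  if (file_exists(ten_path)) {
    return VAD_TEN;
  }
  return VAD_NONE;
}
```

A caller would pass the two model paths (for example `silero_vad.onnx` and `ten-vad.onnx`) and branch on the returned value.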

Possibly related PRs

k2-fsa/sherpa-onnx#2377: Implements support for the "ten-vad" VAD model, including config structs, model selection, and example updates—directly related at the code level.

Poem

In the land of code where models dwell,
A new VAD hops in—ten-vad as well!
Now silero and ten-vad both can play,
Detecting speech in a clever way.
With configs set and tests anew,
This rabbit cheers, “Great job, crew!”
🐇✨

Copilot

Pull Request Overview

This PR adds comprehensive C API support for the ten-vad voice activity detection model, extending the existing VAD functionality beyond the current silero-vad implementation. The changes enable developers to use ten-vad as an alternative VAD model through both C and C++ APIs.

- Adds TenVadModelConfig structures and configuration handling for ten-vad integration
- Updates example applications to support both silero-vad and ten-vad with automatic model detection
- Extends CI/CD workflows to test ten-vad functionality alongside existing silero-vad tests

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

File	Description
sherpa-onnx/csrc/ten-vad-model.cc	Fixes float literal precision in LogMel function
sherpa-onnx/c-api/cxx-api.h	Adds TenVadModelConfig struct and FileExists function declaration
sherpa-onnx/c-api/cxx-api.cc	Implements ten-vad configuration mapping and FileExists wrapper
sherpa-onnx/c-api/c-api.h	Adds C struct definitions for ten-vad and reorganizes function declarations
sherpa-onnx/c-api/c-api.cc	Implements ten-vad configuration parsing with default values
c-api-examples/vad-whisper-c-api.c	Updates example to support both VAD models with automatic detection
c-api-examples/vad-sense-voice-c-api.c	Updates example to support both VAD models with automatic detection
c-api-examples/vad-moonshine-c-api.c	Updates example to support both VAD models with automatic detection
.github/workflows/c-api.yaml	Adds CI test jobs for ten-vad integration with all example applications

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

c-api-examples/vad-sense-voice-c-api.c (1)

107-121: Consider refactoring common VAD configuration logic.

While the current implementation is correct, the identical VAD configuration logic across all three example files could benefit from a shared utility function to reduce code duplication and improve maintainability.

Consider creating a helper function like:

/* Fill in the config of whichever VAD model was selected. */
void configure_vad_model(SherpaOnnxVadModelConfig *config,
                         const char *vad_filename,
                         int32_t use_silero_vad,
                         int32_t use_ten_vad) {
  if (use_silero_vad) {
    config->silero_vad.model = vad_filename;
    config->silero_vad.threshold = 0.25f;
    config->silero_vad.min_silence_duration = 0.5f;
    config->silero_vad.min_speech_duration = 0.5f;
    config->silero_vad.max_speech_duration = 10;
    config->silero_vad.window_size = 512;
  } else if (use_ten_vad) {
    config->ten_vad.model = vad_filename;
    config->ten_vad.threshold = 0.25f;
    config->ten_vad.min_silence_duration = 0.5f;
    config->ten_vad.min_speech_duration = 0.5f;
    config->ten_vad.max_speech_duration = 10;
    config->ten_vad.window_size = 256;
  }
}

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between da9f303 and e362da4.

📒 Files selected for processing (9)

.github/workflows/c-api.yaml (5 hunks)
c-api-examples/vad-moonshine-c-api.c (4 hunks)
c-api-examples/vad-sense-voice-c-api.c (4 hunks)
c-api-examples/vad-whisper-c-api.c (4 hunks)
sherpa-onnx/c-api/c-api.cc (1 hunks)
sherpa-onnx/c-api/c-api.h (2 hunks)
sherpa-onnx/c-api/cxx-api.cc (2 hunks)
sherpa-onnx/c-api/cxx-api.h (2 hunks)
sherpa-onnx/csrc/ten-vad-model.cc (1 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (3)

c-api-examples/vad-whisper-c-api.c (2)
- sherpa-onnx/c-api/c-api.h (1): SherpaOnnxFileExists (75-75)
- sherpa-onnx/c-api/c-api.cc (2): SherpaOnnxFileExists (2031-2033), SherpaOnnxFileExists (2031-2031)

c-api-examples/vad-moonshine-c-api.c (2)
- sherpa-onnx/c-api/c-api.h (1): SherpaOnnxFileExists (75-75)
- sherpa-onnx/c-api/c-api.cc (2): SherpaOnnxFileExists (2031-2033), SherpaOnnxFileExists (2031-2031)

c-api-examples/vad-sense-voice-c-api.c (2)
- sherpa-onnx/c-api/c-api.h (1): SherpaOnnxFileExists (75-75)
- sherpa-onnx/c-api/c-api.cc (2): SherpaOnnxFileExists (2031-2033), SherpaOnnxFileExists (2031-2031)

🔇 Additional comments (28)

sherpa-onnx/csrc/ten-vad-model.cc (1)

324-324: LGTM! Good type consistency improvement.

Explicitly marking the literal as 1e-10f ensures type consistency since logf() takes float arguments and the rest of the expression uses float literals.
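To illustrate why the suffix matters, here is a small sketch that is not taken from `ten-vad-model.cc`: in C, adding an unsuffixed literal such as `1e-10` to a `float` promotes the whole sum to `double`, while `1e-10f` keeps it in `float`. The macro and function names below are hypothetical.

```c
/* Evaluates to 1 when the expression has type double, 0 otherwise (C11 _Generic). */
#define IS_DOUBLE(expr) _Generic((expr), double: 1, default: 0)

/* float + double literal: the float operand is converted to double. */
static int sum_type_with_double_literal(float x) {
  return IS_DOUBLE(x + 1e-10);
}

/* float + float literal: the sum stays a float, matching what logf() takes. */
static int sum_type_with_float_literal(float x) {
  return IS_DOUBLE(x + 1e-10f);
}
```

With the `f` suffix, an expression like `logf(x + 1e-10f)` involves no intermediate `double` value.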

sherpa-onnx/c-api/c-api.cc (1)

1036-1049: LGTM! Consistent implementation following established patterns.

The ten-vad configuration initialization properly mirrors the existing silero-vad pattern, using the same macro approach for default value handling. The different window_size default (256 vs 512) appears intentional for the ten-vad model characteristics.
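A default-filling macro of this kind might look like the sketch below. The macro name `DEMO_OR`, the struct `DemoTenVadDefaults`, and every default value other than the window size of 256 are assumptions made for illustration; the actual macro and defaults in `c-api.cc` are not shown in this thread.

```c
/* Hypothetical macro: use `value` when it is non-zero, otherwise `fallback`. */
#define DEMO_OR(value, fallback) ((value) ? (value) : (fallback))

/* Hypothetical stand-in for the ten-vad part of the VAD config. */
typedef struct DemoTenVadDefaults {
  const char *model;
  float threshold;
  float min_silence_duration;
  float min_speech_duration;
  int window_size;
  float max_speech_duration;
} DemoTenVadDefaults;

/* Copies the user's config, replacing zero-valued fields with defaults.
   Only the 256 for window_size comes from the review; the rest are placeholders. */
static DemoTenVadDefaults demo_fill_defaults(DemoTenVadDefaults in) {
  DemoTenVadDefaults out;
  out.model = DEMO_OR(in.model, "");
  out.threshold = DEMO_OR(in.threshold, 0.5f);
  out.min_silence_duration = DEMO_OR(in.min_silence_duration, 0.5f);
  out.min_speech_duration = DEMO_OR(in.min_speech_duration, 0.25f);
  out.window_size = DEMO_OR(in.window_size, 256);
  out.max_speech_duration = DEMO_OR(in.max_speech_duration, 20.0f);
  return out;
}
```

One consequence of this pattern is that a caller cannot explicitly request a value of zero, since zero is treated as "not set".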

sherpa-onnx/c-api/cxx-api.h (3)

555-562: LGTM! Well-structured configuration struct.

The TenVadModelConfig struct follows the established pattern of SileroVadModelConfig with consistent field types, naming, and appropriate default values that align with the C API implementation.

566-566: LGTM! Consistent API extension.

Adding the ten_vad member to VadModelConfig properly extends the API to support the new VAD model while maintaining consistency with the existing silero_vad structure.

655-655: LGTM! Useful utility function addition.

The FileExists function declaration provides a clean C++ API wrapper that will be helpful for the runtime model detection functionality described in the AI summary.

c-api-examples/vad-whisper-c-api.c (5)

11-16: LGTM! Clear documentation for VAD model downloads.

The added comments provide clear instructions for downloading both supported VAD models, improving user experience.

34-37: Good addition of input validation.

The file existence check for the input WAV file prevents runtime failures and provides clear error messaging.

39-54: Well-implemented VAD model selection logic.

The runtime detection and selection between silero-vad and ten-vad models is correctly implemented with proper priority (silero-vad first) and clear error handling when neither model is available.

104-118: Correct conditional VAD configuration.

The model-specific parameter configuration is properly implemented with appropriate window sizes (512 for silero-vad, 256 for ten-vad) and consistent threshold/duration settings across both models.

134-135: Appropriate window size selection.

The conditional window size retrieval based on the selected VAD model ensures the processing loop uses the correct parameters for the chosen model.
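As a rough sketch of why the loop needs the model-specific window size, the function below walks a sample buffer in fixed-size windows and hands each one to a callback. All names here are hypothetical, and dropping a trailing partial window is an assumption of this sketch, not something stated about the examples.

```c
#include <stddef.h>

/* Hypothetical callback type: receives one window of samples. */
typedef void (*WindowFn)(const float *window, int n, void *user);

/* Example callback that just counts how many samples it has seen. */
static void count_samples(const float *window, int n, void *user) {
  (void)window;
  *(size_t *)user += (size_t)n;
}

/* Feeds `samples` to `fn` in chunks of `window_size`.
   Returns the number of full windows; a trailing partial window is skipped. */
static size_t feed_in_windows(const float *samples, size_t num_samples,
                              int window_size, WindowFn fn, void *user) {
  size_t count = 0;
  size_t step;
  size_t i;
  if (window_size <= 0) {
    return 0;
  }
  step = (size_t)window_size;
  for (i = 0; i + step <= num_samples; i += step) {
    fn(samples + i, window_size, user);
    ++count;
  }
  return count;
}
```

Passing 512 for silero-vad or 256 for ten-vad changes only the step size; the rest of the loop is identical.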

sherpa-onnx/c-api/cxx-api.cc (2)

658-663: Correct ten_vad configuration implementation.

The ten_vad configuration fields are properly copied from the C++ config to the C struct, following the same pattern as the existing silero_vad configuration. All necessary fields (model, threshold, timing parameters, window_size) are included.

768-770: Simple and correct utility function.

The FileExists function is properly implemented as a thin wrapper around the C API function, providing a clean C++ interface for file existence checking.

c-api-examples/vad-moonshine-c-api.c (4)

9-14: Consistent documentation across examples.

The VAD model download instructions are identical to the whisper example, maintaining consistency across different example programs.

31-51: Consistent implementation of VAD model selection.

The file existence checks and VAD model selection logic exactly match the whisper example, ensuring uniform behavior across all example programs.

104-118: Identical VAD configuration pattern.

The conditional VAD configuration follows the same pattern as other examples, with appropriate model-specific parameters (window sizes, thresholds) consistently applied.

134-135: Consistent window size handling.

The window size selection logic matches the pattern used in other examples, ensuring uniform behavior across all VAD-enabled example programs.

c-api-examples/vad-sense-voice-c-api.c (3)

9-14: Excellent consistency across all examples.

The VAD model documentation is identical across all three example programs, providing a uniform user experience regardless of which ASR model is being used.

31-51: Perfect implementation consistency.

The VAD model selection logic is identical across all examples, demonstrating excellent code consistency and maintainability.

137-138: Consistent window size implementation.

The final example maintains the same window size selection pattern, completing the consistent implementation across all VAD-enabled examples.

sherpa-onnx/c-api/c-api.h (3)

74-76: LGTM: Well-designed utility function.

The SherpaOnnxFileExists function follows established API conventions with clear documentation and appropriate return type semantics. The placement after version functions is logical.

851-874: LGTM: Consistent struct design for ten-vad model.

The SherpaOnnxTenVadModelConfig struct appropriately mirrors the existing silero VAD configuration, providing type safety while maintaining consistency in field names and documentation. This duplication is standard practice for C APIs.

882-882: LGTM: Proper extension of VAD model configuration.

The addition of the ten_vad member to SherpaOnnxVadModelConfig correctly extends the API to support the new VAD model while maintaining backward compatibility.
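The shape described above can be sketched with stand-in types. The `Sketch*` structs below are hypothetical and only reuse the field names that appear in this review; they are not the definitions from `c-api.h`. The sketch assumes callers zero-initialize the config, so code that only sets `silero_vad` leaves `ten_vad.model` empty and keeps behaving as before.

```c
#include <stddef.h>

/* Hypothetical stand-ins mirroring the two per-model configs. */
typedef struct SketchSileroVad {
  const char *model;
  float threshold;
  float min_silence_duration;
  float min_speech_duration;
  int window_size;
  float max_speech_duration;
} SketchSileroVad;

typedef struct SketchTenVad {
  const char *model;
  float threshold;
  float min_silence_duration;
  float min_speech_duration;
  int window_size;
  float max_speech_duration;
} SketchTenVad;

/* The parent config carries one member per supported model. */
typedef struct SketchVadConfig {
  SketchSileroVad silero_vad;
  SketchTenVad ten_vad;
} SketchVadConfig;

static int has_model(const char *path) {
  return path != NULL && path[0] != '\0';
}

/* Returns the window size of whichever model is configured, or 0 if none. */
static int sketch_window_size(const SketchVadConfig *config) {
  if (has_model(config->silero_vad.model)) {
    return config->silero_vad.window_size;
  }
  if (has_model(config->ten_vad.model)) {
    return config->ten_vad.window_size;
  }
  return 0;
}
```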

.github/workflows/c-api.yaml (6)

379-379: LGTM: Improved test naming clarity.

Renaming the test to specify "silero-vad" improves clarity now that multiple VAD models are supported.

406-432: LGTM: Comprehensive test coverage for ten-vad model.

The new test job appropriately mirrors the silero-vad test structure while using the ten-vad model. This ensures consistent test coverage across both VAD implementations.

433-433: LGTM: Consistent test naming improvement.

The test name clarification maintains consistency with other VAD test naming changes.

460-486: LGTM: Consistent test coverage extension.

The ten-vad + Moonshine test maintains the established pattern and provides necessary coverage for the new VAD model with a different ASR backend.

523-523: LGTM: Maintains naming consistency.

The test name update follows the established pattern for clarity across all VAD tests.

562-600: LGTM: Complete test coverage for ten-vad integration.

The final ten-vad test with sense-voice completes comprehensive test coverage across all ASR models. The inclusion of diagnostic commands is helpful for troubleshooting.

Add C API for ten-vad

e362da4

csukuangfj requested a review from Copilot July 12, 2025 03:57

Copilot AI reviewed Jul 12, 2025

Comment on sherpa-onnx/c-api/c-api.h (resolved)

coderabbitai bot reviewed Jul 12, 2025


csukuangfj merged commit ceb1bc5 into k2-fsa:master Jul 12, 2025
118 of 228 checks passed

csukuangfj deleted the c-api-ten-vad branch July 12, 2025 04:04

This was referenced Jul 12, 2025

Add JavaScript (WebAssembly) API for ten-vad #2382

Merged

Add Dart API for ten-vad #2386

Merged

Add C# API for ten-vad #2385

Merged

Add Swift API for ten-vad #2387

Merged

Add Java/Kotlin API and Android support for ten-vad #2389

Merged

csukuangfj mentioned this pull request Jul 12, 2025

Request to open-source the model TEN-framework/ten-vad#15

Closed
