-
Notifications
You must be signed in to change notification settings - Fork 1k
Add Pascal/Go/C#/Dart API for NeMo Canary ASR models #2367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThis update introduces NeMo Canary model support for offline speech recognition across Dart, Go, C#, and Pascal APIs. New example programs, configuration structures, and workflow/test scripts are added for each language. The core libraries and bindings are extended to support a new "canary" model configuration, including dynamic reconfiguration of recognizer settings at runtime. Changes
Sequence Diagram(s)Example: High-Level Flow for Multi-language NeMo Canary Offline ASRsequenceDiagram
participant User
participant Script (run.sh/run-nemo-canary.sh)
participant Example Program (Dart/Go/C#/Pascal)
participant SherpaOnnx Library
User->>Script: Start script
Script->>Script: Download/check NeMo Canary model
Script->>Example Program: Run example with config (srcLang, tgtLang, wav)
Example Program->>SherpaOnnx Library: Initialize recognizer with config
Example Program->>SherpaOnnx Library: Feed waveform, decode
SherpaOnnx Library-->>Example Program: Return transcription
Example Program->>Script: Output result
Script->>Example Program: (Optional) Re-run with different tgtLang
Example Program->>SherpaOnnx Library: Update config (setConfig)
Example Program->>SherpaOnnx Library: Feed waveform, decode
SherpaOnnx Library-->>Example Program: Return new transcription
Example Program->>Script: Output new result
Script->>Script: Cleanup model files
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Adds support for the new NeMo Canary ASR model across multiple language bindings by introducing a Canary model configuration and setter method on the offline recognizer.
- Define
Canarymodel config in Pascal, Go, C#, Dart and update converter functions - Add
SetConfigAPI for offline recognizer (Pascal, Go, C#, Dart) - Extend CI/workflows and example scripts to test non-streaming Canary decoding
Reviewed Changes
Copilot reviewed 26 out of 27 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| sherpa-onnx/pascal-api/sherpa_onnx.pas | Added Pascal record for Canary config and SetConfig |
| sherpa-onnx/csrc/online-transducer-nemo-model.cc | Allow empty normalize_type |
| scripts/go/sherpa_onnx.go | Added Go struct and marshalling for Canary config |
| scripts/go/_internal/non-streaming-canary-decode-files/* | Example scaffold for internal Canary decode tests |
| scripts/go/.../non-streaming-canary-decode-files/* | Go example and CI script for Canary decode |
| scripts/dotnet/*.cs | Added C# OfflineCanaryModelConfig and SetConfig |
| flutter/sherpa_onnx/lib/src/{sherpa_onnx_bindings.dart, offline_recognizer.dart} | Added Dart FFI and high-level Canary config/setConfig |
| † CI workflows (.github) | Added CI steps to build/test Canary examples |
Comments suppressed due to low confidence (1)
sherpa-onnx/pascal-api/sherpa_onnx.pas:1701
- There's a typo in the function name: "Recongizer" should be spelled "Recognizer" for consistency.
function ConvertOfflineRecongizerConfig(Config: TSherpaOnnxOfflineRecognizerConfig): SherpaOnnxOfflineRecognizerConfig;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 9
🔭 Outside diff range comments (5)
scripts/go/_internal/non-streaming-canary-decode-files/run.sh (2)
1-14: Script lacks she-bang and contains an un-guarded path line ⇒ will error out.
../../../../go-api-examples/non-streaming-canary-decode-files/run.shon line 1 is executed as a shell command instead of being sourced/commented.
Moreover, the file has no#!/usr/bin/env bash, so behaviour depends on whichever shell runs it.-../../../../go-api-examples/non-streaming-canary-decode-files/run.sh +#!/usr/bin/env bash +set -euo pipefail + +# Delegate to the original script. `exec` keeps the same PID and exits with +# its status. +exec "$(dirname "$0")/../../../../go-api-examples/non-streaming-canary-decode-files/run.sh"If you really need the local copy of the commands, drop the path line entirely.
3-14: Duplicate logic – keep a single source of truth.Lines 3-14 replicate almost the same model-download / build steps as the upstream script referenced on line 1. When the upstream script changes this copy will silently diverge. Prefer
exec-ing or sourcing the original script instead.scripts/go/_internal/non-streaming-canary-decode-files/main.go (2)
1-4: File won’t compile – missingpackagedeclaration and stray path.The first line is a bare path and there is no
package mainstatement, sogo buildfails with expected ‘package’, found “../../…”.-../../../../go-api-examples/non-streaming-canary-decode-files/main.go +package main + +// (If this wrapper is only meant to re-export the real file, consider deleting +// everything below and using `//go:build ignore` to avoid compilation.)
61-63: Error fromos.Openis silently ignored.file, _ := os.Open(filename)Swallowing the error may lead to a nil pointer dereference later. Capture and handle it:
-file, _ := os.Open(filename) +file, err := os.Open(filename) +if err != nil { + log.Fatalf("Failed to open %s: %v", filename, err) +}sherpa-onnx/pascal-api/sherpa_onnx.pas (1)
1701-1770: Fix the typo in the function name.The function name contains a typo: "Recongizer" should be "Recognizer".
-function ConvertOfflineRecongizerConfig(Config: TSherpaOnnxOfflineRecognizerConfig): SherpaOnnxOfflineRecognizerConfig; +function ConvertOfflineRecognizerConfig(Config: TSherpaOnnxOfflineRecognizerConfig): SherpaOnnxOfflineRecognizerConfig;
🧹 Nitpick comments (10)
scripts/go/_internal/non-streaming-canary-decode-files/main.go (2)
65-69: Log message hides the root cause.
log.Fatalf("Failed to read wave format")discardserr. Include it for easier debugging:- log.Fatalf("Failed to read wave format") + log.Fatalf("Failed to read wave format: %v", err)
100-110: Inefficient per-sample allocation in inner loop.Creating a new
bytes.Readeron every sample causes ~80 × allocations per second of audio. Decode the whole buffer in a tight loop without allocs, e.g.:for i := 0; i < numSamples; i++ { s16 := int16(binary.LittleEndian.Uint16(inSamples[i*2:])) outSamples[i] = float32(s16) / 32768 }dotnet-examples/non-streaming-canary-decode-files/non-streaming-canary-decode-files.csproj (1)
4-9: Consider Pascal-case root namespace.
non_streaming_canary_decode_filesbreaks the usual .NET naming guidelines.
NonStreamingCanaryDecodeFileswould read better and avoid the underscore..github/workflows/pascal.yaml (1)
159-162: Add explicit failure-propagation when invokingrun-nemo-canary.sh.
./run-nemo-canary.shis executed directly.
If that script ever forgetsset -e(or gets committed without the exec bit), the workflow would silently continue and mask failures.- ./run-nemo-canary.sh + bash -eo pipefail ./run-nemo-canary.shThis guarantees the CI job aborts on any unexpected error inside the script.
.github/workflows/test-go.yaml (1)
111-112: Avoid repetitivecpstatements – keep the list DRY.You now have 15 identical
cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll …lines.
Consider looping to reduce copy-paste maintenance:-for dir in add-punctuation audio-tagging keyword-spotting-from-file non-streaming-canary-decode-files \ - non-streaming-decode-files non-streaming-speaker-diarization non-streaming-tts \ - speaker-identification speech-enhancement-gtcrn streaming-decode-files \ - streaming-hlg-decoding vad vad-asr-paraformer vad-asr-whisper \ - vad-speaker-identification vad-spoken-language-identification; do - cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/$dir/ -done +for dir in $(printf "%s\n" ../scripts/go/_internal/*/ | sed 's|.*/||'); do + cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll "../scripts/go/_internal/${dir}/" +doneLess duplication → easier future updates.
.github/workflows/test-go-package.yaml (1)
79-86: Keep Go example build steps consistent.Other examples run
go mod tidy && go buildbefore./run.sh; this step skips
those, relying on the script to build. For consistency and faster failure
detection:- cd go-api-examples/non-streaming-canary-decode-files - ./run.sh + cd go-api-examples/non-streaming-canary-decode-files + go mod tidy + go build + ./run.shMinor, but keeps the workflow uniform.
dotnet-examples/non-streaming-canary-decode-files/run.sh (1)
5-11: Pin build configuration and guard against partial downloads.
tar xvfauto-detects compression on GNU tar but fails on BSD tar/macOS
without-j.dotnet rundefaults to Debug; Release artefacts are smaller & faster.- tar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2 + tar xjvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2 ... -dotnet run +dotnet run --configuration Release --no-restoreAlso consider verifying checksum (e.g.
sha256sum) before extraction for extra safety.scripts/dotnet/OfflineRecognizer.cs (1)
17-20: Consider adding error handling and parameter validation.The SetConfig method implementation is correct but lacks error handling. Consider adding null checks and exception handling for potential native call failures.
public void SetConfig(OfflineRecognizerConfig config) { + if (_handle.Handle == IntPtr.Zero) + throw new ObjectDisposedException(nameof(OfflineRecognizer)); + SherpaOnnxOfflineRecognizerSetConfig(_handle.Handle, ref config); }scripts/dotnet/OfflineCanaryModelConfig.cs (1)
7-32: Well-designed struct for interop with appropriate marshaling attributes.The struct definition is excellent for P/Invoke scenarios:
- Sequential layout ensures proper memory alignment
- String marshaling attributes are correctly applied
- Default values are sensible (English language, punctuation enabled)
- Field ordering matches the native counterpart
Consider adding XML documentation comments to describe the purpose of each field.
dart-api-examples/non-streaming-asr/bin/nemo-canary.dart (1)
64-70: Consider using a more robust configuration update approach.The JSON manipulation for updating the target language is clever but fragile. Consider using a more type-safe approach.
- var json = config.toJson(); - ((json['model'] as Map<String, dynamic>)!['canary'] - as Map<String, dynamic>)!['tgtLang'] = 'en'; - config = sherpa_onnx.OfflineRecognizerConfig.fromJson(json); + // More robust approach using constructor + final newCanary = sherpa_onnx.OfflineCanaryModelConfig( + encoder: canary.encoder, + decoder: canary.decoder, + srcLang: canary.srcLang, + tgtLang: 'en', + ); + final newModelConfig = sherpa_onnx.OfflineModelConfig( + canary: newCanary, + tokens: modelConfig.tokens, + debug: modelConfig.debug, + numThreads: modelConfig.numThreads, + ); + config = sherpa_onnx.OfflineRecognizerConfig(model: newModelConfig);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (27)
.github/scripts/test-dot-net.sh(1 hunks).github/workflows/pascal.yaml(1 hunks).github/workflows/test-go-package.yaml(1 hunks).github/workflows/test-go.yaml(2 hunks)dart-api-examples/non-streaming-asr/bin/nemo-canary.dart(1 hunks)dart-api-examples/non-streaming-asr/run-nemo-canary.sh(1 hunks)dotnet-examples/non-streaming-canary-decode-files/Program.cs(1 hunks)dotnet-examples/non-streaming-canary-decode-files/non-streaming-canary-decode-files.csproj(1 hunks)dotnet-examples/non-streaming-canary-decode-files/run.sh(1 hunks)dotnet-examples/sherpa-onnx.sln(2 hunks)flutter/sherpa_onnx/lib/src/offline_recognizer.dart(10 hunks)flutter/sherpa_onnx/lib/src/sherpa_onnx_bindings.dart(5 hunks)go-api-examples/non-streaming-canary-decode-files/go.mod(1 hunks)go-api-examples/non-streaming-canary-decode-files/main.go(1 hunks)go-api-examples/non-streaming-canary-decode-files/run.sh(1 hunks)pascal-api-examples/non-streaming-asr/.gitignore(1 hunks)pascal-api-examples/non-streaming-asr/nemo_canary.pas(1 hunks)pascal-api-examples/non-streaming-asr/run-nemo-canary.sh(1 hunks)scripts/dotnet/OfflineCanaryModelConfig.cs(1 hunks)scripts/dotnet/OfflineModelConfig.cs(2 hunks)scripts/dotnet/OfflineRecognizer.cs(2 hunks)scripts/go/_internal/non-streaming-canary-decode-files/go.mod(1 hunks)scripts/go/_internal/non-streaming-canary-decode-files/main.go(1 hunks)scripts/go/_internal/non-streaming-canary-decode-files/run.sh(1 hunks)scripts/go/sherpa_onnx.go(4 hunks)sherpa-onnx/csrc/online-transducer-nemo-model.cc(1 hunks)sherpa-onnx/pascal-api/sherpa_onnx.pas(12 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
scripts/dotnet/OfflineModelConfig.cs (3)
scripts/dotnet/OfflineCanaryModelConfig.cs (1)
OfflineCanaryModelConfig(10-17)scripts/go/sherpa_onnx.go (1)
OfflineCanaryModelConfig(417-423)sherpa-onnx/c-api/cxx-api.h (1)
OfflineCanaryModelConfig(226-232)
go-api-examples/non-streaming-canary-decode-files/main.go (1)
scripts/go/sherpa_onnx.go (5)
OfflineRecognizerConfig(485-501)NewOfflineRecognizer(787-799)DeleteOfflineRecognizer(780-783)NewOfflineStream(817-821)DeleteOfflineStream(810-813)
scripts/go/sherpa_onnx.go (1)
sherpa-onnx/c-api/cxx-api.h (1)
OfflineCanaryModelConfig(226-232)
🪛 Shellcheck (0.10.0)
.github/scripts/test-dot-net.sh
[warning] 9-9: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
🔇 Additional comments (42)
pascal-api-examples/non-streaming-asr/.gitignore (1)
13-13: Addition looks correct – nothing to flag.Ignoring the generated
nemo_canaryartefacts is consistent with the rest of the rules.dotnet-examples/non-streaming-canary-decode-files/non-streaming-canary-decode-files.csproj (1)
11-13: Project reference path seems correct – no issues.scripts/go/_internal/non-streaming-canary-decode-files/go.mod (1)
1-5: Local replace directive OK for CI, but breaks external consumers.
replace … => ../forces a relative path that only exists inside this repo.
If someone imports this example outside the mono-repo the build will fail.
Consider pushing the forked module or guard the replace with build tags..github/workflows/test-go.yaml (1)
152-164: Missing library availability check on non-Windows runners.The NeMo Canary test runs unconditionally, but only Windows runners receive
cpof the built.so/.dylib/.dllinto
scripts/go/_internal/non-streaming-canary-decode-files/.Ensure Linux/macOS have the shared libs in
LD_LIBRARY_PATH
(or replicate the copy logic) to avoid runtime linker errors.sherpa-onnx/csrc/online-transducer-nemo-model.cc (1)
326-327: Graceful handling of missingnormalize_typemetadata looks good.Switching to
SHERPA_ONNX_READ_META_DATA_STR_ALLOW_EMPTYprevents hard-failures
on older models that omit this field while retaining strictness for other
mandatory entries.scripts/dotnet/OfflineRecognizer.cs (1)
73-74: P/Invoke declaration follows existing patterns.The native function declaration is consistent with other P/Invoke methods in the class.
scripts/dotnet/OfflineModelConfig.cs (2)
31-31: Canary field initialization follows established patterns.The initialization of the Canary field in the constructor is consistent with other model configurations.
66-66: Field declaration is properly positioned and typed.The Canary field declaration follows the same pattern as other model configuration fields in the struct.
go-api-examples/non-streaming-canary-decode-files/go.mod (1)
1-17: Go module definition is well-structured and dependencies are appropriate.The module definition follows Go conventions and includes the necessary dependencies for the speech recognition example. The platform-specific indirect dependencies ensure cross-platform compatibility.
go-api-examples/non-streaming-canary-decode-files/run.sh (1)
1-14: LGTM! Script follows standard patterns.The script correctly handles model downloading, dependency management, and execution. The use of
set -exensures proper error handling and debugging output.dotnet-examples/non-streaming-canary-decode-files/Program.cs (2)
15-22: Good configuration setup for NeMo Canary model.The configuration correctly initializes all required paths and language settings for the Canary model.
32-40: Excellent demonstration of dynamic language switching.This code effectively shows how to use the
SetConfigmethod to change the target language at runtime, which is a key feature of the NeMo Canary model.dotnet-examples/sherpa-onnx.sln (2)
42-43: Proper project addition to solution.The new project entry follows the standard format and includes the correct GUID reference.
122-125: Complete build configuration setup.The project configurations are properly added for both Debug and Release builds, maintaining consistency with other projects in the solution.
scripts/go/sherpa_onnx.go (3)
417-423: Well-structured Canary model configuration.The
OfflineCanaryModelConfigstruct correctly defines all required fields for the NeMo Canary model, including encoder, decoder, language settings, and punctuation control.
559-563: Proper C string allocation for Canary config.The implementation correctly allocates C strings for all Canary model configuration fields, following the established pattern used for other model configurations.
693-711: Excellent memory management for Canary config.The cleanup code properly frees all allocated C strings and sets pointers to nil, preventing memory leaks. The implementation follows the same defensive pattern used throughout the codebase.
dart-api-examples/non-streaming-asr/run-nemo-canary.sh (2)
13-22: Comprehensive English source language testing.The script thoroughly tests translation from English to multiple target languages (English, German, Spanish, French), demonstrating the model's multilingual capabilities.
24-33: Good coverage of German source language scenarios.The additional test cases with German as the source language ensure the model works correctly in both directions, providing robust validation of the Canary model's multilingual ASR functionality.
pascal-api-examples/non-streaming-asr/nemo_canary.pas (2)
100-107: LGTM - Proper resource cleanup implementation.The resource cleanup section is well-implemented with proper comments explaining when it's necessary. The use of
FreeAndNilis appropriate for Pascal.
82-84: Dynamic reconfiguration via SetConfig is consistent across examples
Verified that Go, Dart and C# samples all updateTgtLang, callSetConfigon the same recognizer and reuse the existing stream before Decode. The Pascal code mirrors this pattern and requires no changes.dart-api-examples/non-streaming-asr/bin/nemo-canary.dart (1)
21-29: LGTM - Comprehensive argument validation.The command-line argument validation is thorough and provides clear error messages with usage instructions.
pascal-api-examples/non-streaming-asr/run-nemo-canary.sh (2)
10-25: LGTM - Proper build system integration.The conditional build logic correctly checks for existing libraries across different platforms (macOS, Linux, Windows) and only rebuilds when necessary.
33-37: LGTM - Correct Pascal compilation flags.The Free Pascal compilation command uses appropriate flags for shared library support and correct include paths.
go-api-examples/non-streaming-canary-decode-files/main.go (2)
17-25: LGTM - Well-structured configuration setup.The configuration initialization is clear and well-organized with appropriate default values.
96-113: LGTM - Robust audio sample conversion.The
samplesInt16ToFloatfunction properly handles byte order conversion and normalization with appropriate error handling.flutter/sherpa_onnx/lib/src/sherpa_onnx_bindings.dart (5)
283-291: LGTM - Proper FFI struct definition.The
SherpaOnnxOfflineCanaryModelConfigstruct is correctly defined with appropriate FFI annotations and follows the established patterns in the codebase.
351-351: LGTM - Consistent integration into model config.The addition of the
canaryfield toSherpaOnnxOfflineModelConfigfollows the established pattern for other model configurations.
890-896: LGTM - Proper function binding definitions.The
OfflineRecognizerSetConfigtypedef definitions correctly establish the FFI function signatures for both native and Dart sides.
1363-1363: LGTM - Consistent binding integration.The
offlineRecognizerSetConfigstatic function pointer follows the established pattern for other FFI bindings.
1764-1767: LGTM - Proper dynamic library binding.The binding initialization correctly loads the
SherpaOnnxOfflineRecognizerSetConfigfunction from the native library and follows the established pattern.sherpa-onnx/pascal-api/sherpa_onnx.pas (5)
302-310: LGTM!The new
TSherpaOnnxOfflineCanaryModelConfigrecord is well-structured with appropriate fields for the NeMo Canary model configuration.
1590-1601: LGTM!The
ToStringmethod properly formats all fields including the booleanUsePncfield.
1667-1675: LGTM!The
ToStringmethod is correctly updated to include the new Canary field.
1976-1981: LGTM!The default values are appropriate for the Canary model configuration.
1748-1752: LGTM!The Canary fields are properly converted with correct type conversions, including the boolean to integer conversion for
UsePnc.flutter/sherpa_onnx/lib/src/offline_recognizer.dart (6)
166-202: LGTM!The
OfflineCanaryModelConfigclass is well-implemented with proper JSON serialization, default values, and string representation.
351-351: LGTM!The integration of the
canaryfield intoOfflineModelConfigis complete and consistent with other model configurations.Also applies to: 404-407, 421-421, 435-435, 456-456
598-606: LGTM!The refactored factory constructor properly separates config conversion and cleanup into static methods, improving code organization.
608-615: Verify the design decision not to update the internal config.The method doesn't update
this.configafter calling the native function. This could lead to confusion if users expect theconfigproperty to reflect the latest configuration.Is this design decision intentional? Consider documenting this behavior in the method's documentation to avoid confusion.
678-682: LGTM!The Canary fields are properly converted to native UTF8 strings with correct boolean-to-integer conversion for the FFI interface.
715-755: LGTM!The
freeConfigmethod properly frees all allocated memory including the new Canary fields, preventing memory leaks.
Summary by CodeRabbit