Skip to content

Conversation

@csukuangfj
Copy link
Collaborator

@csukuangfj csukuangfj commented Jul 10, 2025

Summary by CodeRabbit

  • New Features
    • Added support for NeMo Canary offline speech recognition models across Dart, Go, Pascal, and .NET examples.
    • Introduced new example programs and scripts demonstrating non-streaming ASR with NeMo Canary models in Dart, Go, Pascal, and C#.
    • Enabled dynamic configuration and language switching for offline recognizers in supported languages.
    • Added runtime configuration update capability to offline recognizers in Dart, Pascal, .NET, and Go APIs.
  • Chores
    • Updated test scripts and CI workflows to include NeMo Canary model tests and automated setup for new examples.
    • Adjusted .gitignore and project files to support new examples and dependencies.
  • Bug Fixes
    • Improved model metadata handling to allow empty normalization type fields.

@csukuangfj csukuangfj requested a review from Copilot July 10, 2025 06:39
@coderabbitai
Copy link

coderabbitai bot commented Jul 10, 2025

Walkthrough

This update introduces NeMo Canary model support for offline speech recognition across Dart, Go, C#, and Pascal APIs. New example programs, configuration structures, and workflow/test scripts are added for each language. The core libraries and bindings are extended to support a new "canary" model configuration, including dynamic reconfiguration of recognizer settings at runtime.

Changes

File(s) Change Summary
dart-api-examples/non-streaming-asr/bin/nemo-canary.dart,
dart-api-examples/non-streaming-asr/run-nemo-canary.sh
Added Dart example and script for NeMo Canary offline ASR with multi-language support and model download logic.
dotnet-examples/non-streaming-canary-decode-files/Program.cs,
dotnet-examples/non-streaming-canary-decode-files/non-streaming-canary-decode-files.csproj,
dotnet-examples/non-streaming-canary-decode-files/run.sh
Added C# example, project file, and script for NeMo Canary offline ASR, including model download and multi-language decoding.
go-api-examples/non-streaming-canary-decode-files/main.go,
go-api-examples/non-streaming-canary-decode-files/go.mod,
go-api-examples/non-streaming-canary-decode-files/run.sh
Added Go example, module file, and script for NeMo Canary offline ASR with model download and language switching.
pascal-api-examples/non-streaming-asr/nemo_canary.pas,
pascal-api-examples/non-streaming-asr/run-nemo-canary.sh,
pascal-api-examples/non-streaming-asr/.gitignore
Added Pascal example, script, and gitignore rule for NeMo Canary offline ASR.
flutter/sherpa_onnx/lib/src/offline_recognizer.dart,
flutter/sherpa_onnx/lib/src/sherpa_onnx_bindings.dart
Extended Dart FFI bindings and recognizer to support Canary config and dynamic reconfiguration.
scripts/dotnet/OfflineCanaryModelConfig.cs,
scripts/dotnet/OfflineModelConfig.cs,
scripts/dotnet/OfflineRecognizer.cs
Added C# Canary config struct, integrated into model config, and enabled runtime config updates.
scripts/go/sherpa_onnx.go Added Go Canary config struct, integrated into model config, and updated C string allocation/free logic.
scripts/go/_internal/non-streaming-canary-decode-files/main.go,
scripts/go/_internal/non-streaming-canary-decode-files/go.mod,
scripts/go/_internal/non-streaming-canary-decode-files/run.sh
Added Go internal test harness and script for Canary example.
sherpa-onnx/pascal-api/sherpa_onnx.pas Added Pascal Canary config record, updated model config, and enabled runtime config updates.
sherpa-onnx/csrc/online-transducer-nemo-model.cc Allowed empty "normalize_type" metadata in encoder model initialization.
dotnet-examples/sherpa-onnx.sln Added new project for non-streaming-canary-decode-files to the .NET solution.
.github/scripts/test-dot-net.sh Added test step for non-streaming-canary-decode-files in .NET CI script.
.github/workflows/pascal.yaml Added Nemo Canary test execution and cleanup to Pascal workflow.
.github/workflows/test-go-package.yaml Added Nemo Canary ASR test step for Go package workflow (non-Windows).
.github/workflows/test-go.yaml Added Nemo Canary ASR test step for Go workflow (Windows), including DLL copy and cleanup.

Sequence Diagram(s)

Example: High-Level Flow for Multi-language NeMo Canary Offline ASR

sequenceDiagram
    participant User
    participant Script (run.sh/run-nemo-canary.sh)
    participant Example Program (Dart/Go/C#/Pascal)
    participant SherpaOnnx Library

    User->>Script: Start script
    Script->>Script: Download/check NeMo Canary model
    Script->>Example Program: Run example with config (srcLang, tgtLang, wav)
    Example Program->>SherpaOnnx Library: Initialize recognizer with config
    Example Program->>SherpaOnnx Library: Feed waveform, decode
    SherpaOnnx Library-->>Example Program: Return transcription
    Example Program->>Script: Output result
    Script->>Example Program: (Optional) Re-run with different tgtLang
    Example Program->>SherpaOnnx Library: Update config (setConfig)
    Example Program->>SherpaOnnx Library: Feed waveform, decode
    SherpaOnnx Library-->>Example Program: Return new transcription
    Example Program->>Script: Output new result
    Script->>Script: Cleanup model files
Loading

Poem

Oh, what a hop through code and lore,
Canary models now at every door!
Dart, Go, C#, and Pascal too—
Multilingual bunnies, we welcome you.
With scripts and configs, tests anew,
🐰✨ The Sherpa hops—now dreams come true!


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 730db98 and 1eb1e46.

📒 Files selected for processing (1)
  • sherpa-onnx/pascal-api/sherpa_onnx.pas (12 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • sherpa-onnx/pascal-api/sherpa_onnx.pas

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds support for the new NeMo Canary ASR model across multiple language bindings by introducing a Canary model configuration and setter method on the offline recognizer.

  • Define Canary model config in Pascal, Go, C#, Dart and update converter functions
  • Add SetConfig API for offline recognizer (Pascal, Go, C#, Dart)
  • Extend CI/workflows and example scripts to test non-streaming Canary decoding

Reviewed Changes

Copilot reviewed 26 out of 27 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
sherpa-onnx/pascal-api/sherpa_onnx.pas Added Pascal record for Canary config and SetConfig
sherpa-onnx/csrc/online-transducer-nemo-model.cc Allow empty normalize_type
scripts/go/sherpa_onnx.go Added Go struct and marshalling for Canary config
scripts/go/_internal/non-streaming-canary-decode-files/* Example scaffold for internal Canary decode tests
scripts/go/.../non-streaming-canary-decode-files/* Go example and CI script for Canary decode
scripts/dotnet/*.cs Added C# OfflineCanaryModelConfig and SetConfig
flutter/sherpa_onnx/lib/src/{sherpa_onnx_bindings.dart, offline_recognizer.dart} Added Dart FFI and high-level Canary config/setConfig
† CI workflows (.github) Added CI steps to build/test Canary examples
Comments suppressed due to low confidence (1)

sherpa-onnx/pascal-api/sherpa_onnx.pas:1701

  • There's a typo in the function name: "Recongizer" should be spelled "Recognizer" for consistency.
function ConvertOfflineRecongizerConfig(Config: TSherpaOnnxOfflineRecognizerConfig): SherpaOnnxOfflineRecognizerConfig;

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🔭 Outside diff range comments (5)
scripts/go/_internal/non-streaming-canary-decode-files/run.sh (2)

1-14: Script lacks she-bang and contains an un-guarded path line ⇒ will error out.

../../../../go-api-examples/non-streaming-canary-decode-files/run.sh on line 1 is executed as a shell command instead of being sourced/commented.
Moreover, the file has no #!/usr/bin/env bash, so behaviour depends on whichever shell runs it.

-../../../../go-api-examples/non-streaming-canary-decode-files/run.sh
+#!/usr/bin/env bash
+set -euo pipefail
+
+# Delegate to the original script. `exec` keeps the same PID and exits with
+# its status.
+exec "$(dirname "$0")/../../../../go-api-examples/non-streaming-canary-decode-files/run.sh"

If you really need the local copy of the commands, drop the path line entirely.


3-14: Duplicate logic – keep a single source of truth.

Lines 3-14 replicate almost the same model-download / build steps as the upstream script referenced on line 1. When the upstream script changes this copy will silently diverge. Prefer exec-ing or sourcing the original script instead.

scripts/go/_internal/non-streaming-canary-decode-files/main.go (2)

1-4: File won’t compile – missing package declaration and stray path.

The first line is a bare path and there is no
package main statement, so go build fails with expected ‘package’, found “../../…”.

-../../../../go-api-examples/non-streaming-canary-decode-files/main.go
+package main
+
+// (If this wrapper is only meant to re-export the real file, consider deleting
+// everything below and using `//go:build ignore` to avoid compilation.)

61-63: Error from os.Open is silently ignored.

file, _ := os.Open(filename)

Swallowing the error may lead to a nil pointer dereference later. Capture and handle it:

-file, _ := os.Open(filename)
+file, err := os.Open(filename)
+if err != nil {
+    log.Fatalf("Failed to open %s: %v", filename, err)
+}
sherpa-onnx/pascal-api/sherpa_onnx.pas (1)

1701-1770: Fix the typo in the function name.

The function name contains a typo: "Recongizer" should be "Recognizer".

-function ConvertOfflineRecongizerConfig(Config: TSherpaOnnxOfflineRecognizerConfig): SherpaOnnxOfflineRecognizerConfig;
+function ConvertOfflineRecognizerConfig(Config: TSherpaOnnxOfflineRecognizerConfig): SherpaOnnxOfflineRecognizerConfig;
🧹 Nitpick comments (10)
scripts/go/_internal/non-streaming-canary-decode-files/main.go (2)

65-69: Log message hides the root cause.

log.Fatalf("Failed to read wave format") discards err. Include it for easier debugging:

-    log.Fatalf("Failed to read wave format")
+    log.Fatalf("Failed to read wave format: %v", err)

100-110: Inefficient per-sample allocation in inner loop.

Creating a new bytes.Reader on every sample causes ~80 × allocations per second of audio. Decode the whole buffer in a tight loop without allocs, e.g.:

for i := 0; i < numSamples; i++ {
    s16 := int16(binary.LittleEndian.Uint16(inSamples[i*2:]))
    outSamples[i] = float32(s16) / 32768
}
dotnet-examples/non-streaming-canary-decode-files/non-streaming-canary-decode-files.csproj (1)

4-9: Consider Pascal-case root namespace.

non_streaming_canary_decode_files breaks the usual .NET naming guidelines.
NonStreamingCanaryDecodeFiles would read better and avoid the underscore.

.github/workflows/pascal.yaml (1)

159-162: Add explicit failure-propagation when invoking run-nemo-canary.sh.

./run-nemo-canary.sh is executed directly.
If that script ever forgets set -e (or gets committed without the exec bit), the workflow would silently continue and mask failures.

-          ./run-nemo-canary.sh
+          bash -eo pipefail ./run-nemo-canary.sh

This guarantees the CI job aborts on any unexpected error inside the script.

.github/workflows/test-go.yaml (1)

111-112: Avoid repetitive cp statements – keep the list DRY.

You now have 15 identical cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll … lines.
Consider looping to reduce copy-paste maintenance:

-for dir in add-punctuation audio-tagging keyword-spotting-from-file non-streaming-canary-decode-files \
-           non-streaming-decode-files non-streaming-speaker-diarization non-streaming-tts \
-           speaker-identification speech-enhancement-gtcrn streaming-decode-files \
-           streaming-hlg-decoding vad vad-asr-paraformer vad-asr-whisper \
-           vad-speaker-identification vad-spoken-language-identification; do
-  cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/$dir/
-done
+for dir in $(printf "%s\n" ../scripts/go/_internal/*/ | sed 's|.*/||'); do
+  cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll "../scripts/go/_internal/${dir}/"
+done

Less duplication → easier future updates.

.github/workflows/test-go-package.yaml (1)

79-86: Keep Go example build steps consistent.

Other examples run go mod tidy && go build before ./run.sh; this step skips
those, relying on the script to build. For consistency and faster failure
detection:

-          cd go-api-examples/non-streaming-canary-decode-files
-          ./run.sh
+          cd go-api-examples/non-streaming-canary-decode-files
+          go mod tidy
+          go build
+          ./run.sh

Minor, but keeps the workflow uniform.

dotnet-examples/non-streaming-canary-decode-files/run.sh (1)

5-11: Pin build configuration and guard against partial downloads.

  1. tar xvf auto-detects compression on GNU tar but fails on BSD tar/macOS
    without -j.
  2. dotnet run defaults to Debug; Release artefacts are smaller & faster.
-  tar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2
+  tar xjvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2
...
-dotnet run
+dotnet run --configuration Release --no-restore

Also consider verifying checksum (e.g. sha256sum) before extraction for extra safety.

scripts/dotnet/OfflineRecognizer.cs (1)

17-20: Consider adding error handling and parameter validation.

The SetConfig method implementation is correct but lacks error handling. Consider adding null checks and exception handling for potential native call failures.

 public void SetConfig(OfflineRecognizerConfig config)
 {
+    if (_handle.Handle == IntPtr.Zero)
+        throw new ObjectDisposedException(nameof(OfflineRecognizer));
+    
     SherpaOnnxOfflineRecognizerSetConfig(_handle.Handle, ref config);
 }
scripts/dotnet/OfflineCanaryModelConfig.cs (1)

7-32: Well-designed struct for interop with appropriate marshaling attributes.

The struct definition is excellent for P/Invoke scenarios:

  • Sequential layout ensures proper memory alignment
  • String marshaling attributes are correctly applied
  • Default values are sensible (English language, punctuation enabled)
  • Field ordering matches the native counterpart

Consider adding XML documentation comments to describe the purpose of each field.

dart-api-examples/non-streaming-asr/bin/nemo-canary.dart (1)

64-70: Consider using a more robust configuration update approach.

The JSON manipulation for updating the target language is clever but fragile. Consider using a more type-safe approach.

-    var json = config.toJson();
-    ((json['model'] as Map<String, dynamic>)!['canary']
-        as Map<String, dynamic>)!['tgtLang'] = 'en';
-    config = sherpa_onnx.OfflineRecognizerConfig.fromJson(json);
+    // More robust approach using constructor
+    final newCanary = sherpa_onnx.OfflineCanaryModelConfig(
+      encoder: canary.encoder,
+      decoder: canary.decoder,
+      srcLang: canary.srcLang,
+      tgtLang: 'en',
+    );
+    final newModelConfig = sherpa_onnx.OfflineModelConfig(
+      canary: newCanary,
+      tokens: modelConfig.tokens,
+      debug: modelConfig.debug,
+      numThreads: modelConfig.numThreads,
+    );
+    config = sherpa_onnx.OfflineRecognizerConfig(model: newModelConfig);
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e2b2d5e and 730db98.

📒 Files selected for processing (27)
  • .github/scripts/test-dot-net.sh (1 hunks)
  • .github/workflows/pascal.yaml (1 hunks)
  • .github/workflows/test-go-package.yaml (1 hunks)
  • .github/workflows/test-go.yaml (2 hunks)
  • dart-api-examples/non-streaming-asr/bin/nemo-canary.dart (1 hunks)
  • dart-api-examples/non-streaming-asr/run-nemo-canary.sh (1 hunks)
  • dotnet-examples/non-streaming-canary-decode-files/Program.cs (1 hunks)
  • dotnet-examples/non-streaming-canary-decode-files/non-streaming-canary-decode-files.csproj (1 hunks)
  • dotnet-examples/non-streaming-canary-decode-files/run.sh (1 hunks)
  • dotnet-examples/sherpa-onnx.sln (2 hunks)
  • flutter/sherpa_onnx/lib/src/offline_recognizer.dart (10 hunks)
  • flutter/sherpa_onnx/lib/src/sherpa_onnx_bindings.dart (5 hunks)
  • go-api-examples/non-streaming-canary-decode-files/go.mod (1 hunks)
  • go-api-examples/non-streaming-canary-decode-files/main.go (1 hunks)
  • go-api-examples/non-streaming-canary-decode-files/run.sh (1 hunks)
  • pascal-api-examples/non-streaming-asr/.gitignore (1 hunks)
  • pascal-api-examples/non-streaming-asr/nemo_canary.pas (1 hunks)
  • pascal-api-examples/non-streaming-asr/run-nemo-canary.sh (1 hunks)
  • scripts/dotnet/OfflineCanaryModelConfig.cs (1 hunks)
  • scripts/dotnet/OfflineModelConfig.cs (2 hunks)
  • scripts/dotnet/OfflineRecognizer.cs (2 hunks)
  • scripts/go/_internal/non-streaming-canary-decode-files/go.mod (1 hunks)
  • scripts/go/_internal/non-streaming-canary-decode-files/main.go (1 hunks)
  • scripts/go/_internal/non-streaming-canary-decode-files/run.sh (1 hunks)
  • scripts/go/sherpa_onnx.go (4 hunks)
  • sherpa-onnx/csrc/online-transducer-nemo-model.cc (1 hunks)
  • sherpa-onnx/pascal-api/sherpa_onnx.pas (12 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
scripts/dotnet/OfflineModelConfig.cs (3)
scripts/dotnet/OfflineCanaryModelConfig.cs (1)
  • OfflineCanaryModelConfig (10-17)
scripts/go/sherpa_onnx.go (1)
  • OfflineCanaryModelConfig (417-423)
sherpa-onnx/c-api/cxx-api.h (1)
  • OfflineCanaryModelConfig (226-232)
go-api-examples/non-streaming-canary-decode-files/main.go (1)
scripts/go/sherpa_onnx.go (5)
  • OfflineRecognizerConfig (485-501)
  • NewOfflineRecognizer (787-799)
  • DeleteOfflineRecognizer (780-783)
  • NewOfflineStream (817-821)
  • DeleteOfflineStream (810-813)
scripts/go/sherpa_onnx.go (1)
sherpa-onnx/c-api/cxx-api.h (1)
  • OfflineCanaryModelConfig (226-232)
🪛 Shellcheck (0.10.0)
.github/scripts/test-dot-net.sh

[warning] 9-9: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)

🔇 Additional comments (42)
pascal-api-examples/non-streaming-asr/.gitignore (1)

13-13: Addition looks correct – nothing to flag.

Ignoring the generated nemo_canary artefacts is consistent with the rest of the rules.

dotnet-examples/non-streaming-canary-decode-files/non-streaming-canary-decode-files.csproj (1)

11-13: Project reference path seems correct – no issues.

scripts/go/_internal/non-streaming-canary-decode-files/go.mod (1)

1-5: Local replace directive OK for CI, but breaks external consumers.

replace … => ../ forces a relative path that only exists inside this repo.
If someone imports this example outside the mono-repo the build will fail.
Consider pushing the forked module or guard the replace with build tags.

.github/workflows/test-go.yaml (1)

152-164: Missing library availability check on non-Windows runners.

The NeMo Canary test runs unconditionally, but only Windows runners receive
cp of the built .so/.dylib/.dll into
scripts/go/_internal/non-streaming-canary-decode-files/.

Ensure Linux/macOS have the shared libs in LD_LIBRARY_PATH
(or replicate the copy logic) to avoid runtime linker errors.

sherpa-onnx/csrc/online-transducer-nemo-model.cc (1)

326-327: Graceful handling of missing normalize_type metadata looks good.

Switching to SHERPA_ONNX_READ_META_DATA_STR_ALLOW_EMPTY prevents hard-failures
on older models that omit this field while retaining strictness for other
mandatory entries.

scripts/dotnet/OfflineRecognizer.cs (1)

73-74: P/Invoke declaration follows existing patterns.

The native function declaration is consistent with other P/Invoke methods in the class.

scripts/dotnet/OfflineModelConfig.cs (2)

31-31: Canary field initialization follows established patterns.

The initialization of the Canary field in the constructor is consistent with other model configurations.


66-66: Field declaration is properly positioned and typed.

The Canary field declaration follows the same pattern as other model configuration fields in the struct.

go-api-examples/non-streaming-canary-decode-files/go.mod (1)

1-17: Go module definition is well-structured and dependencies are appropriate.

The module definition follows Go conventions and includes the necessary dependencies for the speech recognition example. The platform-specific indirect dependencies ensure cross-platform compatibility.

go-api-examples/non-streaming-canary-decode-files/run.sh (1)

1-14: LGTM! Script follows standard patterns.

The script correctly handles model downloading, dependency management, and execution. The use of set -ex ensures proper error handling and debugging output.

dotnet-examples/non-streaming-canary-decode-files/Program.cs (2)

15-22: Good configuration setup for NeMo Canary model.

The configuration correctly initializes all required paths and language settings for the Canary model.


32-40: Excellent demonstration of dynamic language switching.

This code effectively shows how to use the SetConfig method to change the target language at runtime, which is a key feature of the NeMo Canary model.

dotnet-examples/sherpa-onnx.sln (2)

42-43: Proper project addition to solution.

The new project entry follows the standard format and includes the correct GUID reference.


122-125: Complete build configuration setup.

The project configurations are properly added for both Debug and Release builds, maintaining consistency with other projects in the solution.

scripts/go/sherpa_onnx.go (3)

417-423: Well-structured Canary model configuration.

The OfflineCanaryModelConfig struct correctly defines all required fields for the NeMo Canary model, including encoder, decoder, language settings, and punctuation control.


559-563: Proper C string allocation for Canary config.

The implementation correctly allocates C strings for all Canary model configuration fields, following the established pattern used for other model configurations.


693-711: Excellent memory management for Canary config.

The cleanup code properly frees all allocated C strings and sets pointers to nil, preventing memory leaks. The implementation follows the same defensive pattern used throughout the codebase.

dart-api-examples/non-streaming-asr/run-nemo-canary.sh (2)

13-22: Comprehensive English source language testing.

The script thoroughly tests translation from English to multiple target languages (English, German, Spanish, French), demonstrating the model's multilingual capabilities.


24-33: Good coverage of German source language scenarios.

The additional test cases with German as the source language ensure the model works correctly in both directions, providing robust validation of the Canary model's multilingual ASR functionality.

pascal-api-examples/non-streaming-asr/nemo_canary.pas (2)

100-107: LGTM - Proper resource cleanup implementation.

The resource cleanup section is well-implemented with proper comments explaining when it's necessary. The use of FreeAndNil is appropriate for Pascal.


82-84: Dynamic reconfiguration via SetConfig is consistent across examples
Verified that Go, Dart and C# samples all update TgtLang, call SetConfig on the same recognizer and reuse the existing stream before Decode. The Pascal code mirrors this pattern and requires no changes.

dart-api-examples/non-streaming-asr/bin/nemo-canary.dart (1)

21-29: LGTM - Comprehensive argument validation.

The command-line argument validation is thorough and provides clear error messages with usage instructions.

pascal-api-examples/non-streaming-asr/run-nemo-canary.sh (2)

10-25: LGTM - Proper build system integration.

The conditional build logic correctly checks for existing libraries across different platforms (macOS, Linux, Windows) and only rebuilds when necessary.


33-37: LGTM - Correct Pascal compilation flags.

The Free Pascal compilation command uses appropriate flags for shared library support and correct include paths.

go-api-examples/non-streaming-canary-decode-files/main.go (2)

17-25: LGTM - Well-structured configuration setup.

The configuration initialization is clear and well-organized with appropriate default values.


96-113: LGTM - Robust audio sample conversion.

The samplesInt16ToFloat function properly handles byte order conversion and normalization with appropriate error handling.

flutter/sherpa_onnx/lib/src/sherpa_onnx_bindings.dart (5)

283-291: LGTM - Proper FFI struct definition.

The SherpaOnnxOfflineCanaryModelConfig struct is correctly defined with appropriate FFI annotations and follows the established patterns in the codebase.


351-351: LGTM - Consistent integration into model config.

The addition of the canary field to SherpaOnnxOfflineModelConfig follows the established pattern for other model configurations.


890-896: LGTM - Proper function binding definitions.

The OfflineRecognizerSetConfig typedef definitions correctly establish the FFI function signatures for both native and Dart sides.


1363-1363: LGTM - Consistent binding integration.

The offlineRecognizerSetConfig static function pointer follows the established pattern for other FFI bindings.


1764-1767: LGTM - Proper dynamic library binding.

The binding initialization correctly loads the SherpaOnnxOfflineRecognizerSetConfig function from the native library and follows the established pattern.

sherpa-onnx/pascal-api/sherpa_onnx.pas (5)

302-310: LGTM!

The new TSherpaOnnxOfflineCanaryModelConfig record is well-structured with appropriate fields for the NeMo Canary model configuration.


1590-1601: LGTM!

The ToString method properly formats all fields including the boolean UsePnc field.


1667-1675: LGTM!

The ToString method is correctly updated to include the new Canary field.


1976-1981: LGTM!

The default values are appropriate for the Canary model configuration.


1748-1752: LGTM!

The Canary fields are properly converted with correct type conversions, including the boolean to integer conversion for UsePnc.

flutter/sherpa_onnx/lib/src/offline_recognizer.dart (6)

166-202: LGTM!

The OfflineCanaryModelConfig class is well-implemented with proper JSON serialization, default values, and string representation.


351-351: LGTM!

The integration of the canary field into OfflineModelConfig is complete and consistent with other model configurations.

Also applies to: 404-407, 421-421, 435-435, 456-456


598-606: LGTM!

The refactored factory constructor properly separates config conversion and cleanup into static methods, improving code organization.


608-615: Verify the design decision not to update the internal config.

The method doesn't update this.config after calling the native function. This could lead to confusion if users expect the config property to reflect the latest configuration.

Is this design decision intentional? Consider documenting this behavior in the method's documentation to avoid confusion.


678-682: LGTM!

The Canary fields are properly converted to native UTF8 strings with correct boolean-to-integer conversion for the FFI interface.


715-755: LGTM!

The freeConfig method properly frees all allocated memory including the new Canary fields, preventing memory leaks.

@csukuangfj csukuangfj merged commit fd9a687 into k2-fsa:master Jul 10, 2025
31 of 231 checks passed
@csukuangfj csukuangfj deleted the canary-api branch July 10, 2025 06:53
@coderabbitai coderabbitai bot mentioned this pull request Aug 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant