Skip to content

Conversation

csukuangfj
Copy link
Collaborator

@csukuangfj csukuangfj commented Aug 7, 2025

To avoid the following case reported by one of our users.

e379d3b0e2e5b3c0bbdccf34987fe27a

Summary by CodeRabbit

  • New Features

    • Improved decoding process to consider the number of feature frames for more accurate results.
  • Bug Fixes

    • Adjusted decoding loop limits to better align with the input length, enhancing performance and reliability.

Copy link

coderabbitai bot commented Aug 7, 2025

Walkthrough

The Decode method in the ASR decoder classes and their implementations was updated to accept an additional integer parameter representing the number of feature frames. Corresponding method signatures and invocations were updated, and the decoding loop now uses this parameter to determine the maximum number of tokens processed.

Changes

Cohort / File(s) Change Summary
Decoder Interface Update
sherpa-onnx/csrc/offline-fire-red-asr-decoder.h, sherpa-onnx/csrc/offline-fire-red-asr-greedy-search-decoder.h
Added int32_t num_feature_frames parameter to the Decode method signatures in both base and derived classes.
Decoder Implementation
sherpa-onnx/csrc/offline-fire-red-asr-greedy-search-decoder.cc
Modified Decode implementation to use num_feature_frames for calculating the decoding loop bound.
Recognizer Integration
sherpa-onnx/csrc/offline-recognizer-fire-red-asr-impl.h
Updated call to Decode to pass the new num_frames argument.

Sequence Diagram(s)

sequenceDiagram
    participant RecognizerImpl
    participant Decoder
    participant GreedySearchDecoder

    RecognizerImpl->>Decoder: Decode(cross_k, cross_v, num_frames)
    alt GreedySearchDecoder implementation
        Decoder->>GreedySearchDecoder: Decode(cross_k, cross_v, num_feature_frames)
        GreedySearchDecoder->>GreedySearchDecoder: Calculate num_possible_tokens
        GreedySearchDecoder->>GreedySearchDecoder: Run decoding loop (up to num_possible_tokens)
        GreedySearchDecoder-->>Decoder: Return results
    end
    Decoder-->>RecognizerImpl: Return results
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

In the meadow of code, a new frame appears,
Decoders now count, as the logic steers.
With tokens per second, and bounds set anew,
The rabbits rejoice in a feature-frame view!
🐇✨
Each hop in the loop is a leap toward the sun,
This patch is complete—reviewing is fun!

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
sherpa-onnx/csrc/offline-fire-red-asr-decoder.h (1)

34-35: Update documentation to include the new parameter.

The method signature correctly adds the num_feature_frames parameter, but the documentation comment (lines 24-32) should be updated to describe this new parameter.

Apply this diff to update the documentation:

   /** Run beam search given the output from the FireRedAsr encoder model.
    *
    * @param n_layer_cross_k       A 4-D tensor of shape
    *                              (num_decoder_layers, N, T, d_model).
    * @param n_layer_cross_v       A 4-D tensor of shape
    *                              (num_decoder_layers, N, T, d_model).
+   * @param num_feature_frames    The number of feature frames in the input.
    *
    * @return Return a vector of size `N` containing the decoded results.
    */
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 07d6957 and 98ea5fb.

📒 Files selected for processing (4)
  • sherpa-onnx/csrc/offline-fire-red-asr-decoder.h (1 hunks)
  • sherpa-onnx/csrc/offline-fire-red-asr-greedy-search-decoder.cc (2 hunks)
  • sherpa-onnx/csrc/offline-fire-red-asr-greedy-search-decoder.h (1 hunks)
  • sherpa-onnx/csrc/offline-recognizer-fire-red-asr-impl.h (1 hunks)
🔇 Additional comments (2)
sherpa-onnx/csrc/offline-fire-red-asr-greedy-search-decoder.h (1)

21-22: LGTM!

The method signature correctly overrides the base class interface with the new num_feature_frames parameter. The override specifier ensures compile-time verification of the signature match.

sherpa-onnx/csrc/offline-fire-red-asr-greedy-search-decoder.cc (1)

19-20: LGTM!

The method signature correctly implements the updated interface with the new num_feature_frames parameter.

Comment on lines +57 to +62
// assume at most 6 tokens per second
int32_t num_possible_tokens = num_feature_frames / 100 * 6;
num_possible_tokens =
std::min<int32_t>(num_possible_tokens, meta_data.max_len / 2);

for (int32_t i = 0; i < num_possible_tokens; ++i) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve the token calculation logic with better documentation and validation.

The dynamic token limit calculation has several areas for improvement:

  1. The hardcoded constants (100 Hz frame rate, 6 tokens/second) should be documented or made configurable
  2. Missing validation for num_feature_frames
  3. Potential integer overflow for very large frame counts

Apply this diff to improve the implementation:

-  // assume at most 6 tokens per second
-  int32_t num_possible_tokens = num_feature_frames / 100 * 6;
-  num_possible_tokens =
-      std::min<int32_t>(num_possible_tokens, meta_data.max_len / 2);
+  // Calculate maximum tokens based on feature frames
+  // Assumptions: 100 Hz feature frame rate (10ms), max 6 tokens per second
+  int32_t num_possible_tokens = 0;
+  if (num_feature_frames > 0) {
+    // Use int64_t to prevent overflow, then cast back
+    int64_t tokens_estimate = static_cast<int64_t>(num_feature_frames) * 6 / 100;
+    num_possible_tokens = static_cast<int32_t>(
+        std::min(tokens_estimate, static_cast<int64_t>(meta_data.max_len / 2)));
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// assume at most 6 tokens per second
int32_t num_possible_tokens = num_feature_frames / 100 * 6;
num_possible_tokens =
std::min<int32_t>(num_possible_tokens, meta_data.max_len / 2);
for (int32_t i = 0; i < num_possible_tokens; ++i) {
// Calculate maximum tokens based on feature frames
// Assumptions: 100 Hz feature frame rate (10ms), max 6 tokens per second
int32_t num_possible_tokens = 0;
if (num_feature_frames > 0) {
// Use int64_t to prevent overflow, then cast back
int64_t tokens_estimate =
static_cast<int64_t>(num_feature_frames) * 6 / 100;
num_possible_tokens = static_cast<int32_t>(
std::min(tokens_estimate,
static_cast<int64_t>(meta_data.max_len / 2)));
}
for (int32_t i = 0; i < num_possible_tokens; ++i) {
🤖 Prompt for AI Agents
In sherpa-onnx/csrc/offline-fire-red-asr-greedy-search-decoder.cc around lines
57 to 62, improve the token calculation by adding comments explaining the
constants used (100 Hz frame rate and 6 tokens per second) or making them
configurable variables. Add validation to ensure num_feature_frames is positive
before using it in calculations. Also, implement checks or use safe arithmetic
to prevent integer overflow when computing num_possible_tokens for very large
num_feature_frames values.

Comment on lines +122 to +123
auto results = decoder_->Decode(std::move(cross_kv.first),
std::move(cross_kv.second), num_frames);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix potential type mismatch between num_frames and expected parameter type.

The Decode method expects an int32_t num_feature_frames parameter, but num_frames is declared as int64_t on line 109. This could cause issues on systems where these types differ in size.

Apply this diff to fix the type mismatch:

-    int64_t num_frames = f.size() / feat_dim;
+    int32_t num_frames = static_cast<int32_t>(f.size() / feat_dim);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
auto results = decoder_->Decode(std::move(cross_kv.first),
std::move(cross_kv.second), num_frames);
// Compute number of feature frames
int32_t num_frames = static_cast<int32_t>(f.size() / feat_dim);
auto results = decoder_->Decode(std::move(cross_kv.first),
std::move(cross_kv.second), num_frames);
🤖 Prompt for AI Agents
In sherpa-onnx/csrc/offline-recognizer-fire-red-asr-impl.h around lines 109 and
122-123, the variable num_frames is declared as int64_t but passed to Decode
which expects int32_t. Change the declaration of num_frames from int64_t to
int32_t to match the expected parameter type and avoid potential type mismatch
issues.

@csukuangfj csukuangfj merged commit 08aaa89 into k2-fsa:master Aug 7, 2025
178 of 228 checks passed
@csukuangfj csukuangfj deleted the fix-fire-red-asr branch August 7, 2025 09:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant