@Aruxxxi Aruxxxi commented Aug 19, 2025

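Description

Adds a C++ API wrapper class for punctuation, exposing a C++ interface.

Changes

  • Added the OfflinePunctuation class declaration to cxx-api.h
  • Implemented the OfflinePunctuation class methods in cxx-api.cc
  • Provided the Create, Destroy, and AddPunctuation methods
  • Follows the existing C++ API design patterns

Type of change

  • New feature
  • Bug fix
  • Documentation update

Testing

  • Tested
  • Needs testing

Checklist

  • Code follows the project conventions
  • Added the necessary tests
  • Updated the relevant documentation
  • Follows the existing C++ API design patterns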

Summary by CodeRabbit

  • New Features
    • Introduced offline punctuation in the C++ API, enabling users to add punctuation to raw text without an internet connection.
    • Provides a simple creation and lifecycle for a punctuation processor, plus a method to process text and return punctuated output.
    • Includes configurable options such as model selection, number of threads, debug mode, and execution provider (e.g., CPU).
    • Enhances text post-processing workflows for transcription and NLP applications.

coderabbitai bot commented Aug 19, 2025

Walkthrough

Adds a C++ wrapper (sherpa_onnx::cxx) for offline punctuation, introducing config structs and an OfflinePunctuation class with a factory Create, AddPunctuation operation, and Destroy, bridging to existing C APIs and handling allocation/freeing of C strings.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| C++ API header additions: sherpa-onnx/c-api/cxx-api.h | Adds OfflinePunctuationModelConfig, OfflinePunctuationConfig, and the OfflinePunctuation class (Create, AddPunctuation, Destroy; private ctor from C pointer). |
| C++ API implementation: sherpa-onnx/c-api/cxx-api.cc | Implements the wrapper: builds SherpaOnnxOfflinePunctuationConfig and calls SherpaOnnxCreateOfflinePunctuation; defines the ctor from C pointer; AddPunctuation invokes SherpaOfflinePunctuationAddPunct and frees the result; Destroy forwards to SherpaOnnxDestroyOfflinePunctuation. |

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant CXX as cxx::OfflinePunctuation
    participant CAPI as C API (sherpa-onnx)

    rect rgb(240,245,255)
    note over App,CXX: Construction
    App->>CXX: OfflinePunctuation::Create(config)
    CXX->>CAPI: SherpaOnnxCreateOfflinePunctuation(c_config)
    CAPI-->>CXX: SherpaOnnxOfflinePunctuation*
    CXX-->>App: OfflinePunctuation (wrapper)
    end

    rect rgb(240,255,240)
    note over App,CXX: Inference
    App->>CXX: AddPunctuation(text)
    CXX->>CAPI: SherpaOfflinePunctuationAddPunct(ptr, text)
    CAPI-->>CXX: char* punctuated_text
    CXX->>CAPI: SherpaOfflinePunctuationFreeText(punctuated_text)
    CXX-->>App: std::string result
    end

    rect rgb(255,245,240)
    note over App,CXX: Teardown
    App->>CXX: Destroy(ptr)
    CXX->>CAPI: SherpaOnnxDestroyOfflinePunctuation(ptr)
    end
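
The three phases in the diagram (construction, inference, teardown) can be sketched as a move-only RAII wrapper around a C handle. This is a minimal illustration, not the code from this PR: every `Demo*` name below is a hypothetical stand-in for the C API, and the stub simply appends a period to the input instead of running a model.

```cpp
#include <cstring>
#include <string>
#include <utility>

// ---- Hypothetical stand-ins for a C API (NOT the real sherpa-onnx functions) ----
struct DemoPunct { int unused; };
static int g_live_handles = 0;  // counts handles that were created but not destroyed

const DemoPunct *DemoCreate() {
  ++g_live_handles;
  return new DemoPunct{0};
}

void DemoDestroy(const DemoPunct *p) {
  --g_live_handles;
  delete p;
}

// Returns a heap-allocated C string that the caller must release with DemoFreeText.
const char *DemoAddPunct(const DemoPunct * /*p*/, const char *text) {
  std::string out = std::string(text) + ".";
  char *buf = new char[out.size() + 1];
  std::memcpy(buf, out.c_str(), out.size() + 1);
  return buf;
}

void DemoFreeText(const char *s) { delete[] s; }

// ---- Move-only wrapper: owns the C handle and destroys it exactly once ----
class DemoOfflinePunctuation {
 public:
  // Construction phase: the factory obtains the C handle.
  static DemoOfflinePunctuation Create() {
    return DemoOfflinePunctuation(DemoCreate());
  }

  DemoOfflinePunctuation(const DemoOfflinePunctuation &) = delete;
  DemoOfflinePunctuation &operator=(const DemoOfflinePunctuation &) = delete;

  DemoOfflinePunctuation(DemoOfflinePunctuation &&other) noexcept
      : p_(std::exchange(other.p_, nullptr)) {}

  DemoOfflinePunctuation &operator=(DemoOfflinePunctuation &&other) noexcept {
    if (this != &other) {
      Reset();
      p_ = std::exchange(other.p_, nullptr);
    }
    return *this;
  }

  // Teardown phase: the destructor forwards to the C destroy function.
  ~DemoOfflinePunctuation() { Reset(); }

  // Inference phase: copy the C string into std::string, then free the C buffer.
  std::string AddPunctuation(const std::string &text) const {
    const char *raw = DemoAddPunct(p_, text.c_str());
    std::string ans(raw);  // the demo stub never returns null
    DemoFreeText(raw);
    return ans;
  }

 private:
  explicit DemoOfflinePunctuation(const DemoPunct *p) : p_(p) {}

  void Reset() {
    if (p_ != nullptr) {
      DemoDestroy(p_);
      p_ = nullptr;
    }
  }

  const DemoPunct *p_ = nullptr;
};
```

Because the moved-from object holds a null pointer, moving the wrapper transfers ownership without a double destroy.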

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

Hopping through code with gentle elation,
I dot every pause with crisp punctuation;
From C to C++ I bridge and I bind,
Commas and periods neatly aligned.
A thump of the paw—strings freed, flow clean—
Carrots for code that’s tidy and lean. 🥕✨

@Aruxxxi Aruxxxi closed this Aug 19, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
sherpa-onnx/c-api/cxx-api.h (1)

680-689: Export macro consistency: annotate config structs with SHERPA_ONNX_API

Most Offline config structs in this header are exported (e.g., OfflineTransducerModelConfig). For consistency and to avoid any potential Windows DLL export surprises, annotate the new punctuation config structs with SHERPA_ONNX_API.

Apply:

-struct OfflinePunctuationModelConfig {
+struct SHERPA_ONNX_API OfflinePunctuationModelConfig {
   std::string ct_transformer;
   int32_t num_threads = 1;
   bool debug = false;
   std::string provider = "cpu";
 };
 
-struct OfflinePunctuationConfig {
+struct SHERPA_ONNX_API OfflinePunctuationConfig {
   OfflinePunctuationModelConfig model;
 };
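
For readers unfamiliar with export macros, here is a rough sketch of the usual pattern. The definition of SHERPA_ONNX_API is not shown in this thread, so `DEMO_API`, `DEMO_BUILD_SHARED`, `DEMO_USE_SHARED`, and the `Demo*` structs are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical export macro following the common dllexport/dllimport pattern.
// The real SHERPA_ONNX_API definition may differ.
#if defined(_WIN32)
#  if defined(DEMO_BUILD_SHARED)
#    define DEMO_API __declspec(dllexport)  // building the DLL
#  elif defined(DEMO_USE_SHARED)
#    define DEMO_API __declspec(dllimport)  // consuming the DLL
#  else
#    define DEMO_API                        // static build: nothing to annotate
#  endif
#elif defined(__GNUC__)
#  define DEMO_API __attribute__((visibility("default")))
#else
#  define DEMO_API
#endif

// The annotation sits between the struct keyword and the type name.
struct DEMO_API DemoPunctuationModelConfig {
  std::string ct_transformer;
  int32_t num_threads = 1;
  bool debug = false;
  std::string provider = "cpu";
};

struct DEMO_API DemoPunctuationConfig {
  DemoPunctuationModelConfig model;
};
```

The annotation does not change the struct's members or defaults; it only affects symbol visibility when the library is built as a shared object.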
sherpa-onnx/c-api/cxx-api.cc (2)

842-844: Use the parameter in Destroy() for consistency

Other wrappers pass the provided pointer to the C destroy function. Here, p_ is used instead of the parameter. It works (MoveOnly passes p_), but using the parameter improves clarity and consistency.

 void OfflinePunctuation::Destroy(const SherpaOnnxOfflinePunctuation *p) const {
-  SherpaOnnxDestroyOfflinePunctuation(p_);
+  SherpaOnnxDestroyOfflinePunctuation(p);
 }

846-851: Defensive check: handle potential null from C API before constructing std::string

While the current C implementation always allocates and returns a non-null char*, guarding against null avoids UB from constructing std::string(nullptr) and aligns with patterns elsewhere that tolerate null returns.

 std::string OfflinePunctuation::AddPunctuation(const std::string &text) const {
   const char *result = SherpaOfflinePunctuationAddPunct(p_, text.c_str());
+  if (!result) {
+    return std::string{};
+  }
   std::string ans(result);
   SherpaOfflinePunctuationFreeText(result);
   return ans;
 }
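
One possible way to package the null guard is a small helper that takes ownership of the returned C string. This is only a sketch: `DemoMakeText`, `DemoFreeText`, and `TakeCString` are hypothetical names, and the `std::unique_ptr` guard is an added variation (not part of the suggestion above) so that the buffer is freed even if constructing the `std::string` throws.

```cpp
#include <cstring>
#include <memory>
#include <string>

// ---- Hypothetical stand-ins for a C API that returns heap-allocated text ----
static int g_free_calls = 0;  // counts how often the free function ran

// Returns nullptr on failure (simulated here by a null input).
const char *DemoMakeText(const char *text) {
  if (text == nullptr) return nullptr;
  std::string out = std::string(text) + ".";
  char *buf = new char[out.size() + 1];
  std::memcpy(buf, out.c_str(), out.size() + 1);
  return buf;
}

void DemoFreeText(const char *s) {
  ++g_free_calls;
  delete[] s;
}

// Deleter so std::unique_ptr releases the buffer through the C free function.
struct DemoTextDeleter {
  void operator()(const char *s) const { DemoFreeText(s); }
};

// Copies a possibly-null C string into std::string and frees the C buffer.
// A null input yields an empty string and the free function is not called,
// because std::unique_ptr skips the deleter for null pointers.
std::string TakeCString(const char *raw) {
  std::unique_ptr<const char, DemoTextDeleter> guard(raw);
  return raw != nullptr ? std::string(raw) : std::string();
}
```

The early-return version in the diff above behaves the same way for the null case; the guard only changes what happens on an exception.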
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e536e6e and b0cc496.

📒 Files selected for processing (2)
  • sherpa-onnx/c-api/cxx-api.cc (1 hunks)
  • sherpa-onnx/c-api/cxx-api.h (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
sherpa-onnx/c-api/cxx-api.h (2)
sherpa-onnx/c-api/cxx-api.cc (19)
  • OfflinePunctuation (839-840)
  • Create (49-111)
  • Create (49-50)
  • Create (287-293)
  • Create (287-288)
  • Create (393-441)
  • Create (393-393)
  • Create (491-533)
  • Create (491-491)
  • Create (606-620)
  • Create (606-607)
  • Create (647-650)
  • Create (647-647)
  • Create (686-712)
  • Create (686-687)
  • Create (776-783)
  • Create (776-779)
  • AddPunctuation (846-851)
  • AddPunctuation (846-846)
sherpa-onnx/csrc/offline-punctuation.cc (1)
  • sherpa_onnx (15-53)
sherpa-onnx/c-api/cxx-api.cc (2)
sherpa-onnx/c-api/c-api.cc (8)
  • SherpaOnnxCreateOfflinePunctuation (1898-1922)
  • SherpaOnnxCreateOfflinePunctuation (1898-1899)
  • SherpaOnnxDestroyOfflinePunctuation (1924-1927)
  • SherpaOnnxDestroyOfflinePunctuation (1924-1925)
  • SherpaOfflinePunctuationAddPunct (1929-1938)
  • SherpaOfflinePunctuationAddPunct (1929-1930)
  • SherpaOfflinePunctuationFreeText (1940-1940)
  • SherpaOfflinePunctuationFreeText (1940-1940)
scripts/go/sherpa_onnx.go (1)
  • OfflinePunctuation (1761-1763)
🔇 Additional comments (2)
sherpa-onnx/c-api/cxx-api.h (1)

691-701: API shape looks good and aligns with existing wrappers

The MoveOnly-based OfflinePunctuation wrapper (Create/Destroy/AddPunctuation) matches established patterns in this header. No issues from an API design standpoint.

sherpa-onnx/c-api/cxx-api.cc (1)

828-837: Create(): good zero-init and faithful config mapping

Zero-initializing SherpaOnnxOfflinePunctuationConfig and forwarding fields (ct_transformer/num_threads/debug/provider) matches the C API’s expectations. LGTM.
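
The zero-init-then-forward pattern praised here looks roughly like the sketch below. The `DemoC*` structs are hypothetical mirrors of a C config; only the field names (ct_transformer, num_threads, debug, provider) come from this thread, and the exact C layout and field types are assumptions.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// ---- Hypothetical C-style config (layout and types are assumptions) ----
struct DemoCModelConfig {
  const char *ct_transformer;
  int32_t num_threads;
  int32_t debug;
  const char *provider;
};

struct DemoCConfig {
  DemoCModelConfig model;
};

// ---- C++ side config with defaults, as in the header diff in this review ----
struct DemoModelConfig {
  std::string ct_transformer;
  int32_t num_threads = 1;
  bool debug = false;
  std::string provider = "cpu";
};

struct DemoConfig {
  DemoModelConfig model;
};

// Zero-initializes the C struct, then forwards each field.
// The returned struct borrows the string buffers owned by `config`,
// so `config` must outlive any use of the result.
DemoCConfig ToCConfig(const DemoConfig &config) {
  DemoCConfig c;
  std::memset(&c, 0, sizeof(c));  // fields not set below stay zero/null

  c.model.ct_transformer = config.model.ct_transformer.c_str();
  c.model.num_threads = config.model.num_threads;
  c.model.debug = config.model.debug ? 1 : 0;
  c.model.provider = config.model.provider.c_str();
  return c;
}
```

Zeroing first means that any field added to the C struct later defaults to zero or null rather than holding indeterminate data.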

@Aruxxxi Aruxxxi deleted the feature/add-punctuation-cpp-api branch August 19, 2025 07:02