Fix punctuations in kokoro tts. #2458
Caution: Review failed. The pull request is closed.
Walkthrough
The changes adjust text preprocessing and tokenization logic in two source files. In the multi-language lexicon, input text is no longer globally lowercased; instead, lowercasing is applied at the word level for non-Chinese words. In the piper phonemize lexicon, a helper function for Unicode-to-UTF-8 conversion is added, and a space token is now inserted after each period phoneme.
Sequence Diagram(s)
sequenceDiagram
participant User
participant MultiLangLexicon
participant Phonemizer
User->>MultiLangLexicon: Provide input text
MultiLangLexicon->>MultiLangLexicon: For non-Chinese words, lowercase each word
MultiLangLexicon->>Phonemizer: Pass processed words
Phonemizer-->>MultiLangLexicon: Return phonemes
MultiLangLexicon-->>User: Return token IDs
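The word-level lowercasing shown in the diagram above can be sketched roughly as follows. This is a minimal Python illustration, not the project's actual C++ code: the helper names (`is_chinese`, `split_words`, `normalize_words`) are hypothetical, and treating only the CJK Unified Ideographs block as "Chinese" is an assumption made for brevity.

```python
import re

# Assumption: a word counts as "Chinese" if it contains any character
# from the CJK Unified Ideographs block (U+4E00..U+9FFF).
_CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")

# Hypothetical word splitter: runs of CJK characters and runs of
# non-space, non-CJK characters are each treated as one word.
_WORD = re.compile(r"[\u4e00-\u9fff]+|[^\s\u4e00-\u9fff]+")


def is_chinese(word: str) -> bool:
    """Return True if the word contains at least one CJK ideograph."""
    return bool(_CJK_CHAR.search(word))


def split_words(text: str) -> list[str]:
    """Split mixed-language text into Chinese and non-Chinese words."""
    return _WORD.findall(text)


def normalize_words(text: str) -> list[str]:
    """Lowercase each non-Chinese word; leave Chinese words untouched.

    The input text as a whole is never lowercased, only individual words.
    """
    return [w if is_chinese(w) else w.lower() for w in split_words(text)]
```

For example, `normalize_words("Hello 世界 WORLD.")` lowercases the two English words and passes the Chinese word through unchanged.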
sequenceDiagram
participant PiperPhonemizeLexicon
participant TokenSequence
PiperPhonemizeLexicon->>TokenSequence: For each phoneme
alt phoneme == '.'
TokenSequence->>TokenSequence: Append period token
TokenSequence->>TokenSequence: Append space token
else
TokenSequence->>TokenSequence: Append phoneme token
end
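The token-appending logic in the second diagram might look like the sketch below. It is written in Python for illustration only; the function name `phonemes_to_ids`, the `token2id` mapping, and the choice to skip unknown phonemes are all assumptions and do not reflect the project's actual API.

```python
def phonemes_to_ids(phonemes: list[str], token2id: dict[str, int]) -> list[int]:
    """Map phonemes to token IDs, adding a space token after each period.

    The extra space token after "." is what gives the model a pause
    between sentences.
    """
    ids: list[int] = []
    for p in phonemes:
        if p not in token2id:
            # Assumption: unknown phonemes are silently skipped.
            continue
        ids.append(token2id[p])
        # After a period, also append the space token if the
        # vocabulary defines one.
        if p == "." and " " in token2id:
            ids.append(token2id[" "])
    return ids
```

With a toy vocabulary such as `{"h": 1, "i": 2, ".": 3, " ": 4}`, the phoneme sequence for `"hi.hi."` gains a space ID after each period ID.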
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
Previously, when the input text contained multiple sentences, the synthesized speech had no pause at the end of each sentence.
This PR fixes that.
Summary by CodeRabbit
Bug Fixes
New Features