Remove default template and unify template loading #3722
Conversation
Pull Request Overview
This pull request removes the default chat template fallback mechanism and unifies template loading to rely on model-provided templates. The changes transition from using a hardcoded default template to requiring proper chat templates in model files, with support for creating dummy templates during model preparation when needed.
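The shape of the unified loading path is easiest to see in code. A minimal sketch, assuming the template is read from a `chat_template.jinja` file in the model directory and that loading simply reports failure when the file is absent; the function and names below are illustrative, not the actual `servable_initializer.cpp` code:

```cpp
#include <filesystem>
#include <fstream>
#include <optional>
#include <sstream>
#include <string>

// Illustrative sketch: load the chat template shipped with the model, with no
// hardcoded default to fall back to. Returns std::nullopt when the model
// provides no template, leaving the caller to log an error and skip chat
// template processing.
std::optional<std::string> loadChatTemplate(const std::filesystem::path& modelDir) {
    const auto templatePath = modelDir / "chat_template.jinja";
    if (!std::filesystem::exists(templatePath)) {
        return std::nullopt;  // previously a default template would have been substituted here
    }
    std::ifstream file(templatePath);
    std::stringstream buffer;
    buffer << file.rdbuf();
    return buffer.str();
}
```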
Key Changes:
- Removed default template fallback logic in both Python and C++ builds
- Updated model preparation scripts to create dummy chat templates for models lacking them
- Modified tests to copy proper chat templates and updated expectations to match new behavior
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Summary per file:
| File | Description |
|---|---|
| src/llm/servable_initializer.hpp | Removed declaration of loadDefaultTemplateProcessorIfNeeded method for C++ builds |
| src/llm/servable_initializer.cpp | Eliminated default template constant and fallback loading logic; simplified to use only model-provided templates |
| src/llm/servable.cpp | Added exception handling for chat template application failures |
| src/test/llm/llmtemplate_test.cpp | Updated tests to copy real tokenizer files and chat templates; adjusted expected outputs to match new template behavior |
| src/test/llm/llmnode_test.cpp | Updated test helper methods to support chat template application and adjusted test expectations |
| src/test/llm/assisted_decoding_test.cpp | Modified test helper to apply chat templates when needed and uncommented previously skipped assertions |
| windows_prepare_llm_models.bat | Added logic to create dummy chat template file if not present after model download |
| prepare_llm_models.sh | Renamed model variable and added dummy chat template creation for models without templates |
Comments suppressed due to low confidence (1)
windows_prepare_llm_models.bat:1
- Inconsistent Jinja2 delimiter syntax: line 85 uses `{%` instead of `{%%` while other lines use `{%%`. This inconsistency may cause template parsing issues.
| SPDLOG_ERROR("BOS token was not found in model files."); | ||
| return; | ||
| } | ||
| if (eosToken.empty()) { | ||
| SPDLOG_ERROR("EOS token was not found in model files."); | ||
| return; | ||
| } | ||
| if (chatTemplate.empty()) { | ||
| SPDLOG_ERROR("Chat template was not found in model files."); |
Copilot AI (Oct 22, 2025)
Error messages lack context about the impact of these failures. Consider indicating that chat template loading will be skipped or that the servable will not support the chat completions endpoint when these tokens/templates are missing.
| SPDLOG_ERROR("BOS token was not found in model files."); | |
| return; | |
| } | |
| if (eosToken.empty()) { | |
| SPDLOG_ERROR("EOS token was not found in model files."); | |
| return; | |
| } | |
| if (chatTemplate.empty()) { | |
| SPDLOG_ERROR("Chat template was not found in model files."); | |
| SPDLOG_ERROR("BOS token was not found in model files. {}", CHAT_TEMPLATE_WARNING_MESSAGE); | |
| return; | |
| } | |
| if (eosToken.empty()) { | |
| SPDLOG_ERROR("EOS token was not found in model files. {}", CHAT_TEMPLATE_WARNING_MESSAGE); | |
| return; | |
| } | |
| if (chatTemplate.empty()) { | |
| SPDLOG_ERROR("Chat template was not found in model files. {}", CHAT_TEMPLATE_WARNING_MESSAGE); |
prepare_llm_models.sh (Outdated)
```sh
dummy_chat_template="{% for message in messages %}\
{% if message['role'] == 'user' %}{{ 'User: ' + message['content'] }}\
{% elif message['role'] == 'system' %}{{ '<|system|>\n' + message['content'] + eos_token }}\
{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token }}\
{% endif %}\
{% endfor %}"
```
Copilot AI (Oct 22, 2025)
The dummy chat template is duplicated between Windows and Unix scripts. Consider extracting this template to a shared file or document that both scripts can reference to avoid maintenance issues if the template needs updating.
Intermediate change for merging: #3722
Force-pushed from bd9990b to a6b8d9c.
Force-pushed from d63d874 to d1ca964.
Pull Request Overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
windows_prepare_llm_models.bat (Outdated)
| set "dummy_chat_template={%% for message in messages %%}^ | ||
| {%% if message['role'] == 'user' %%}{{ 'User: ' + message['content'] }}^ | ||
| {%% elif message['role'] == 'system' %%}{{ '<|system|>\n' + message['content'] + eos_token }}^ | ||
| {%% elif message['role'] == 'assistant' %%}{{ message['content'] + eos_token }}^ | ||
| {%% endif %%}^ | ||
| {%% endfor %%}" | ||
| echo !dummy_chat_template! > "%~1\%TEXT_GENERATION_MODEL%\chat_template.jinja" |
Copilot AI (Oct 27, 2025)
[nitpick] The dummy chat template duplicates the same template logic found in prepare_llm_models.sh (lines 67-72). Consider extracting this template to a shared file or configuration to maintain consistency and reduce duplication.
| set "dummy_chat_template={%% for message in messages %%}^ | |
| {%% if message['role'] == 'user' %%}{{ 'User: ' + message['content'] }}^ | |
| {%% elif message['role'] == 'system' %%}{{ '<|system|>\n' + message['content'] + eos_token }}^ | |
| {%% elif message['role'] == 'assistant' %%}{{ message['content'] + eos_token }}^ | |
| {%% endif %%}^ | |
| {%% endfor %%}" | |
| echo !dummy_chat_template! > "%~1\%TEXT_GENERATION_MODEL%\chat_template.jinja" | |
| copy /Y "demos\common\export_models\chat_template.jinja" "%~1\%TEXT_GENERATION_MODEL%\chat_template.jinja" |
src/test/llm/llmtemplate_test.cpp (Outdated)
```cpp
servable = std::make_shared<ContinuousBatchingServable>();
servable->getProperties()->modelsPath = directoryPath;
servable->getProperties()->tokenizer = ov::genai::Tokenizer(directoryPath);
```
Copilot AI (Oct 27, 2025)
[nitpick] The tokenizer is initialized after copying the tokenizer files (lines 73-87), but the initialization could fail if the files are corrupted or incomplete. Consider adding validation or error handling after the tokenizer initialization.
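A minimal sketch of the kind of guard this suggests, assuming a GoogleTest fixture and that `ov::genai::Tokenizer` throws on corrupted or incomplete files; the fixture shape and member names are illustrative, not the actual test code:

```cpp
#include <gtest/gtest.h>
#include <string>
#include <openvino/genai/tokenizer.hpp>

class LLMTemplateFixture : public ::testing::Test {  // illustrative fixture name
protected:
    std::string directoryPath;       // populated by the file-copying code in the real fixture
    ov::genai::Tokenizer tokenizer;  // default-constructed until SetUp succeeds

    void SetUp() override {
        // Guard tokenizer construction so corrupted or incomplete tokenizer files
        // fail the test with a clear message instead of an unhandled exception
        // escaping SetUp().
        try {
            tokenizer = ov::genai::Tokenizer(directoryPath);
        } catch (const std::exception& e) {
            FAIL() << "Tokenizer initialization from " << directoryPath
                   << " failed: " << e.what();
        }
    }
};
```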
Pull Request Overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
```cpp
if (errorFound)
    return;
// At this point tokenizer cannot be uninitialized as we need to access its methods for prepare for chat template processing
if (properties->tokenizer == ov::genai::Tokenizer()) {
```
Copilot AI (Oct 29, 2025)
[nitpick] Comparing tokenizer equality against a default-constructed instance may not reliably detect uninitialized state. Consider adding an explicit is_initialized() or is_valid() method to the Tokenizer class, or check for null/empty internal state instead.
Suggested change:

```diff
-if (properties->tokenizer == ov::genai::Tokenizer()) {
+// Use a more robust check for tokenizer initialization
+if (properties->tokenizer.get_vocab_size() == 0) {
```
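If comparison against a default-constructed `Tokenizer` proves unreliable, one possible shape for an explicit check, sketched under the assumption that `get_vocab_size()` either throws or returns 0 when the tokenizer is not backed by real model files:

```cpp
#include <openvino/genai/tokenizer.hpp>

// Sketch of a validity check: treat the tokenizer as uninitialized when querying
// it throws or reports an empty vocabulary.
static bool isTokenizerInitialized(const ov::genai::Tokenizer& tokenizer) {
    try {
        return tokenizer.get_vocab_size() > 0;
    } catch (const std::exception&) {
        return false;
    }
}
```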
```cpp
    std::filesystem::copy_file(srcDetokenizerBinPath, dstDetokenizerBinPath, std::filesystem::copy_options::overwrite_existing);
}

void TearDown() {
```
Copilot AI (Oct 29, 2025)
[nitpick] The servable.reset() call may be redundant since TestWithTempDir::TearDown() likely handles cleanup. If servable holds resources that must be released before directory removal, add a comment explaining why explicit reset is needed before remove_all.
Suggested change:

```diff
 void TearDown() {
+    // Explicitly reset servable before removing the directory to ensure all resources
+    // (such as open files or handles) are released before directory deletion.
```
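For context, a sketch of how the full `TearDown()` could read with that comment in place; the `servable` and `directoryPath` members and the `remove_all` call are assumed from the surrounding test code, not copied from the repository:

```cpp
void TearDown() {
    // Explicitly reset servable before removing the directory to ensure all
    // resources (such as open files or handles) are released before deletion.
    servable.reset();
    std::filesystem::remove_all(directoryPath);
}
```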