Conversation

@mzegla (Collaborator) commented Oct 22, 2025

No description provided.

@mzegla mzegla marked this pull request as ready for review October 22, 2025 14:39
@mzegla mzegla requested a review from Copilot October 22, 2025 14:39
Copilot AI (Contributor) left a comment

Pull Request Overview

This pull request removes the default chat template fallback mechanism so that template loading relies solely on model-provided templates. Instead of falling back to a hardcoded default, the server now requires a proper chat template in the model files; the model preparation scripts create a dummy template when a model ships without one.
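
For context, a minimal sketch of how such a dummy-template fallback might look in a preparation script. This is illustrative only: the MODEL_DIR variable is an assumption, and the Jinja body mirrors the template quoted later in this review.

# Sketch only; real scripts use their own variable names and paths.
if [ ! -f "${MODEL_DIR}/chat_template.jinja" ]; then
    # Write a minimal chat template so models without one can still serve chat completions.
    cat > "${MODEL_DIR}/chat_template.jinja" <<'EOF'
{% for message in messages %}{% if message['role'] == 'user' %}{{ 'User: ' + message['content'] }}{% elif message['role'] == 'system' %}{{ '<|system|>\n' + message['content'] + eos_token }}{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token }}{% endif %}{% endfor %}
EOF
fi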

Key Changes:

  • Removed default template fallback logic in both Python and C++ builds
  • Updated model preparation scripts to create dummy chat templates for models lacking them
  • Modified tests to copy proper chat templates and updated expectations to match new behavior

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Summary per file:

  • src/llm/servable_initializer.hpp: Removed the declaration of the loadDefaultTemplateProcessorIfNeeded method for C++ builds
  • src/llm/servable_initializer.cpp: Eliminated the default template constant and fallback loading logic; simplified to use only model-provided templates
  • src/llm/servable.cpp: Added exception handling for chat template application failures
  • src/test/llm/llmtemplate_test.cpp: Updated tests to copy real tokenizer files and chat templates; adjusted expected outputs to match the new template behavior
  • src/test/llm/llmnode_test.cpp: Updated test helper methods to support chat template application and adjusted test expectations
  • src/test/llm/assisted_decoding_test.cpp: Modified the test helper to apply chat templates when needed and uncommented previously skipped assertions
  • windows_prepare_llm_models.bat: Added logic to create a dummy chat template file if one is not present after model download
  • prepare_llm_models.sh: Renamed a model variable and added dummy chat template creation for models without templates
Comments suppressed due to low confidence (1)

windows_prepare_llm_models.bat:1

  • Inconsistent Jinja2 delimiter syntax: line 85 uses {% while the other lines use {%% (in a batch script, % must be doubled to %%), so the generated template may end up with mismatched delimiters and fail to parse.
::


Comment on lines +137 to +145
    SPDLOG_ERROR("BOS token was not found in model files.");
    return;
}
if (eosToken.empty()) {
    SPDLOG_ERROR("EOS token was not found in model files.");
    return;
}
if (chatTemplate.empty()) {
    SPDLOG_ERROR("Chat template was not found in model files.");

Copilot AI Oct 22, 2025


Error messages lack context about the impact of these failures. Consider indicating that chat template loading will be skipped, or that the servable will not support the chat completions endpoint, when these tokens/templates are missing.

Suggested change
    SPDLOG_ERROR("BOS token was not found in model files.");
    return;
}
if (eosToken.empty()) {
    SPDLOG_ERROR("EOS token was not found in model files.");
    return;
}
if (chatTemplate.empty()) {
    SPDLOG_ERROR("Chat template was not found in model files.");

    SPDLOG_ERROR("BOS token was not found in model files. {}", CHAT_TEMPLATE_WARNING_MESSAGE);
    return;
}
if (eosToken.empty()) {
    SPDLOG_ERROR("EOS token was not found in model files. {}", CHAT_TEMPLATE_WARNING_MESSAGE);
    return;
}
if (chatTemplate.empty()) {
    SPDLOG_ERROR("Chat template was not found in model files. {}", CHAT_TEMPLATE_WARNING_MESSAGE);

Comment on lines 77 to 72
dummy_chat_template="{% for message in messages %}\
{% if message['role'] == 'user' %}{{ 'User: ' + message['content'] }}\
{% elif message['role'] == 'system' %}{{ '<|system|>\n' + message['content'] + eos_token }}\
{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token }}\
{% endif %}\
{% endfor %}"

Copilot AI Oct 22, 2025


The dummy chat template is duplicated between the Windows and Unix scripts. Consider extracting it to a shared file or document that both scripts can reference, to avoid maintenance issues if the template needs updating.
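
A minimal sketch of that extraction on the Unix side, assuming the shared template lives at demos/common/export_models/chat_template.jinja (the path used in a later suggestion on this PR); the target-path handling and TEXT_GENERATION_MODEL variable are assumptions borrowed from the batch script, and the Windows side would mirror this with copy /Y:

# In prepare_llm_models.sh, instead of embedding the Jinja text inline:
if [ ! -f "$1/${TEXT_GENERATION_MODEL}/chat_template.jinja" ]; then
    cp "demos/common/export_models/chat_template.jinja" "$1/${TEXT_GENERATION_MODEL}/chat_template.jinja"
fi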

mzegla added a commit that referenced this pull request Oct 23, 2025
Intermediate change for merging:
#3722
@mzegla mzegla force-pushed the remove_default_template branch from bd9990b to a6b8d9c on October 24, 2025 11:45
@mzegla mzegla force-pushed the remove_default_template branch from d63d874 to d1ca964 on October 27, 2025 14:28
Copilot AI (Contributor) left a comment


Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.



Comment on lines 74 to 80
set "dummy_chat_template={%% for message in messages %%}^
{%% if message['role'] == 'user' %%}{{ 'User: ' + message['content'] }}^
{%% elif message['role'] == 'system' %%}{{ '<|system|>\n' + message['content'] + eos_token }}^
{%% elif message['role'] == 'assistant' %%}{{ message['content'] + eos_token }}^
{%% endif %%}^
{%% endfor %%}"
echo !dummy_chat_template! > "%~1\%TEXT_GENERATION_MODEL%\chat_template.jinja"

Copilot AI Oct 27, 2025


[nitpick] The dummy chat template duplicates the template logic found in prepare_llm_models.sh (lines 67-72). Consider extracting it to a shared file or configuration to maintain consistency and reduce duplication.

Suggested change
set "dummy_chat_template={%% for message in messages %%}^
{%% if message['role'] == 'user' %%}{{ 'User: ' + message['content'] }}^
{%% elif message['role'] == 'system' %%}{{ '<|system|>\n' + message['content'] + eos_token }}^
{%% elif message['role'] == 'assistant' %%}{{ message['content'] + eos_token }}^
{%% endif %%}^
{%% endfor %%}"
echo !dummy_chat_template! > "%~1\%TEXT_GENERATION_MODEL%\chat_template.jinja"
copy /Y "demos\common\export_models\chat_template.jinja" "%~1\%TEXT_GENERATION_MODEL%\chat_template.jinja"


servable = std::make_shared<ContinuousBatchingServable>();
servable->getProperties()->modelsPath = directoryPath;
servable->getProperties()->tokenizer = ov::genai::Tokenizer(directoryPath);

Copilot AI Oct 27, 2025


[nitpick] The tokenizer is initialized after copying the tokenizer files (lines 73-87), but the initialization could fail if the files are corrupted or incomplete. Consider adding validation or error handling after the tokenizer initialization.
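
One way that validation might look, sketched against the excerpt above. This is a hedged sketch: it assumes GoogleTest's FAIL() is available (these are test helpers), and it assumes the ov::genai::Tokenizer constructor throws on unreadable or incomplete files rather than returning an invalid object.

servable = std::make_shared<ContinuousBatchingServable>();
servable->getProperties()->modelsPath = directoryPath;
try {
    // Assumption: the Tokenizer constructor throws when files are corrupted or missing.
    servable->getProperties()->tokenizer = ov::genai::Tokenizer(directoryPath);
} catch (const std::exception& e) {
    FAIL() << "Tokenizer initialization failed: " << e.what();
}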

@mzegla mzegla requested a review from Copilot October 29, 2025 12:36
Copilot AI (Contributor) left a comment


Pull Request Overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.



if (errorFound)
    return;
// At this point the tokenizer cannot be uninitialized, as we need its methods to prepare for chat template processing
if (properties->tokenizer == ov::genai::Tokenizer()) {

Copilot AI Oct 29, 2025


[nitpick] Comparing tokenizer equality against a default-constructed instance may not reliably detect uninitialized state. Consider adding an explicit is_initialized() or is_valid() method to the Tokenizer class, or check for null/empty internal state instead.

Suggested change
if (properties->tokenizer == ov::genai::Tokenizer()) {
// Use a more robust check for tokenizer initialization
if (properties->tokenizer.get_vocab_size() == 0) {

std::filesystem::copy_file(srcDetokenizerBinPath, dstDetokenizerBinPath, std::filesystem::copy_options::overwrite_existing);
}

void TearDown() {

Copilot AI Oct 29, 2025


[nitpick] The servable.reset() call may be redundant since TestWithTempDir::TearDown() likely handles cleanup. If servable holds resources that must be released before directory removal, add a comment explaining why explicit reset is needed before remove_all.

Suggested change
void TearDown() {
void TearDown() {
// Explicitly reset servable before removing the directory to ensure all resources
// (such as open files or handles) are released before directory deletion.
