-
Notifications
You must be signed in to change notification settings - Fork 1.9k
fix: Improve datetime replacement #5425
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Summary
This PR fixes datetime placeholder leakage in the DR (Deep Research) system by centralizing [[CURRENT_DATETIME]]
replacement functionality. The changes address a bug where literal [[CURRENT_DATETIME]]
strings were appearing in LLM prompts instead of being replaced with actual timestamps.
The fix introduces three main components:
-
Centralized datetime replacement in PromptTemplate: A new
_postprocess()
method inbackend/onyx/prompts/prompt_template.py
automatically replaces[[CURRENT_DATETIME]]
placeholders with formatted timestamps (format: "Wednesday January 15, 2025 14:30") wheneverbuild()
is called. -
DR system integration: The DR clarification node (
backend/onyx/agents/agent_search/dr/nodes/dr_a0_clarification.py
) now wraps both persona system prompts and default DR system prompts withPromptTemplate
to ensure datetime replacement occurs. -
Utility function addition: A dedicated
replace_current_datetime_marker()
function was added tobackend/onyx/agents/agent_search/dr/utils.py
for DR-specific datetime replacement needs.
The changes also include comprehensive test infrastructure (backend/tests/external_dependency_unit/answer/conftest.py
) with fixtures for mocking external dependencies, and an integration test (test_current_datetime_replacement.py
) that validates the end-to-end datetime replacement functionality by verifying LLM responses contain properly formatted timestamps.
This fix ensures time-aware responses across the DR system by preventing placeholder leakage and providing consistent datetime formatting throughout the chat pipeline.
Confidence score: 2/5
- This PR introduces potential issues due to inconsistent datetime formatting across the codebase and possible duplication of existing functionality
- Score reflects concerns about architectural inconsistency and the lack of timezone awareness in the new datetime handling
- Pay close attention to
backend/onyx/prompts/prompt_template.py
andbackend/onyx/agents/agent_search/dr/utils.py
for datetime formatting inconsistencies
6 files reviewed, 7 comments
backend/tests/external_dependency_unit/answer/test_current_datetime_replacement.py
Outdated
Show resolved
Hide resolved
acceptable_hours = { | ||
now.hour, | ||
(now - timedelta(minutes=5)).hour, | ||
(now + timedelta(minutes=5)).hour, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Hour calculation with timedelta could create duplicate hours in the set, and the logic doesn't handle hour wraparound (e.g., 23 + 1 = 24 which is invalid). Consider using modulo arithmetic.
now = datetime.now() | ||
formatted = f"{now.strftime('%A')} {now.strftime('%B %d, %Y %H:%M')}" | ||
text = text.replace("[[CURRENT_DATETIME]]", formatted) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: datetime.now() is not timezone-aware. Consider using datetime.now(timezone.utc) for consistency with other parts of the codebase that use UTC timestamps
# Replace datetime marker with a readable current datetime for LLMs | ||
if "[[CURRENT_DATETIME]]" in text: | ||
now = datetime.now() | ||
formatted = f"{now.strftime('%A')} {now.strftime('%B %d, %Y %H:%M')}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: The formatting here differs from get_current_llm_day_time() in prompt_utils.py which formats as 'The current day and time is {day} {date}'. Consider using the existing function for consistency
def _postprocess(self, text: str) -> str: | ||
"""Apply global replacements such as [[CURRENT_DATETIME]].""" | ||
if not text: | ||
return text | ||
# Replace datetime marker with a readable current datetime for LLMs | ||
if "[[CURRENT_DATETIME]]" in text: | ||
now = datetime.now() | ||
formatted = f"{now.strftime('%A')} {now.strftime('%B %d, %Y %H:%M')}" | ||
text = text.replace("[[CURRENT_DATETIME]]", formatted) | ||
return text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: This duplicates datetime replacement logic already present in handle_onyx_date_awareness() in prompt_utils.py. Consider consolidating the implementations
564647b
to
99c00ca
Compare
…etime_replacement.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6 issues found across 6 files
Prompt for AI agents (all 6 issues)
Understand the root cause of the following 6 issues and fix them.
<file name="backend/onyx/prompts/prompt_template.py">
<violation number="1" location="backend/onyx/prompts/prompt_template.py:52">
Avoid hardcoding the '[[CURRENT_DATETIME]]' token; centralize via a shared constant to prevent drift and keep formatting behavior consistent.</violation>
<violation number="2" location="backend/onyx/prompts/prompt_template.py:54">
Duplicate LLM datetime formatting; reuse get_current_llm_day_time(..., full_sentence=False, include_day_of_week=True) to keep a single source of truth.</violation>
</file>
<file name="backend/tests/external_dependency_unit/answer/conftest.py">
<violation number="1" location="backend/tests/external_dependency_unit/answer/conftest.py:32">
Avoid printing raw exception text; it can leak sensitive information. Print only safe metadata (e.g., exception type).</violation>
<violation number="2" location="backend/tests/external_dependency_unit/answer/conftest.py:44">
len(json.get("texts", [])) can raise TypeError when texts is None; guard with or [] to ensure a list.</violation>
<violation number="3" location="backend/tests/external_dependency_unit/answer/conftest.py:73">
Incorrect patch target for query_vespa; mock likely won’t apply. Patch chunk_retrieval.query_vespa instead to prevent real Vespa calls.</violation>
</file>
<file name="backend/tests/external_dependency_unit/answer/test_stream_chat_message_objects.py">
<violation number="1" location="backend/tests/external_dependency_unit/answer/test_stream_chat_message_objects.py:81">
Comment claims allowed_tool_ids is set to [], but the request does not set it, causing misleading documentation. Update the comment or pass allowed_tool_ids=[].</violation>
</file>
React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai
to give feedback, ask questions, or re-run the review.
): | ||
resp = MagicMock() | ||
if "encoder/bi-encoder-embed" in url: | ||
num_texts = len(json.get("texts", [])) if json else 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(json.get("texts", [])) can raise TypeError when texts is None; guard with or [] to ensure a list.
Prompt for AI agents
Address the following comment on backend/tests/external_dependency_unit/answer/conftest.py at line 44:
<comment>len(json.get("texts", [])) can raise TypeError when texts is None; guard with or [] to ensure a list.</comment>
<file context>
@@ -0,0 +1,80 @@
+ ):
+ resp = MagicMock()
+ if "encoder/bi-encoder-embed" in url:
+ num_texts = len(json.get("texts", [])) if json else 1
+ resp.status_code = 200
+ resp.json.return_value = {"embeddings": [[0.0] * 768] * num_texts}
</file context>
num_texts = len(json.get("texts", [])) if json else 1 | |
num_texts = len(json.get("texts") or []) if json else 1 |
@pytest.fixture | ||
def mock_vespa_query(): | ||
"""Stub Vespa query to a safe empty response to avoid CI flakiness.""" | ||
with patch("onyx.document_index.vespa.index.query_vespa", return_value=[]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect patch target for query_vespa; mock likely won’t apply. Patch chunk_retrieval.query_vespa instead to prevent real Vespa calls.
Prompt for AI agents
Address the following comment on backend/tests/external_dependency_unit/answer/conftest.py at line 73:
<comment>Incorrect patch target for query_vespa; mock likely won’t apply. Patch chunk_retrieval.query_vespa instead to prevent real Vespa calls.</comment>
<file context>
@@ -0,0 +1,80 @@
+@pytest.fixture
+def mock_vespa_query():
+ """Stub Vespa query to a safe empty response to avoid CI flakiness."""
+ with patch("onyx.document_index.vespa.index.query_vespa", return_value=[]):
+ yield
+
</file context>
backend/tests/external_dependency_unit/answer/test_stream_chat_message_objects.py
Outdated
Show resolved
Hide resolved
from tests.external_dependency_unit.conftest import create_test_user | ||
|
||
|
||
def test_stream_chat_current_date_response( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting! I've always avoided tests that hit external APIs (seems we're hitting an LLM api) but as a smoke test it's probably pretty reasonable
I guess integration tests also hit external APIs
Description
[Provide a brief description of the changes in this PR]
How Has This Been Tested?
[Describe the tests you ran to verify your changes]
Backporting (check the box to trigger backport action)
Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.
Summary by cubic
Replaces [[CURRENT_DATETIME]] in DR system prompts with the current timestamp (YYYY-MM-DD HH:MM:SS) and applies it to both persona and default prompts in the clarifier. This prevents the placeholder from leaking and ensures time-aware responses.