Skip to content

Conversation

Weves
Copy link
Contributor

@Weves Weves commented Sep 16, 2025

Description

[Provide a brief description of the changes in this PR]

How Has This Been Tested?

[Describe the tests you ran to verify your changes]

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

Summary by cubic

Replaces [[CURRENT_DATETIME]] in DR system prompts with the current timestamp (YYYY-MM-DD HH:MM:SS) and applies it to both persona and default prompts in the clarifier. This prevents the placeholder from leaking and ensures time-aware responses.

Copy link

vercel bot commented Sep 16, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
internal-search Canceled Canceled Sep 16, 2025 9:16pm

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR fixes datetime placeholder leakage in the DR (Deep Research) system by centralizing [[CURRENT_DATETIME]] replacement functionality. The changes address a bug where literal [[CURRENT_DATETIME]] strings were appearing in LLM prompts instead of being replaced with actual timestamps.

The fix introduces three main components:

  1. Centralized datetime replacement in PromptTemplate: A new _postprocess() method in backend/onyx/prompts/prompt_template.py automatically replaces [[CURRENT_DATETIME]] placeholders with formatted timestamps (format: "Wednesday January 15, 2025 14:30") whenever build() is called.

  2. DR system integration: The DR clarification node (backend/onyx/agents/agent_search/dr/nodes/dr_a0_clarification.py) now wraps both persona system prompts and default DR system prompts with PromptTemplate to ensure datetime replacement occurs.

  3. Utility function addition: A dedicated replace_current_datetime_marker() function was added to backend/onyx/agents/agent_search/dr/utils.py for DR-specific datetime replacement needs.

The changes also include comprehensive test infrastructure (backend/tests/external_dependency_unit/answer/conftest.py) with fixtures for mocking external dependencies, and an integration test (test_current_datetime_replacement.py) that validates the end-to-end datetime replacement functionality by verifying LLM responses contain properly formatted timestamps.

This fix ensures time-aware responses across the DR system by preventing placeholder leakage and providing consistent datetime formatting throughout the chat pipeline.

Confidence score: 2/5

  • This PR introduces potential issues due to inconsistent datetime formatting across the codebase and possible duplication of existing functionality
  • Score reflects concerns about architectural inconsistency and the lack of timezone awareness in the new datetime handling
  • Pay close attention to backend/onyx/prompts/prompt_template.py and backend/onyx/agents/agent_search/dr/utils.py for datetime formatting inconsistencies

6 files reviewed, 7 comments

Edit Code Review Bot Settings | Greptile

Comment on lines 97 to 101
acceptable_hours = {
now.hour,
(now - timedelta(minutes=5)).hour,
(now + timedelta(minutes=5)).hour,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Hour calculation with timedelta could create duplicate hours in the set, and the logic doesn't handle hour wraparound (e.g., 23 + 1 = 24 which is invalid). Consider using modulo arithmetic.

Comment on lines 53 to 55
now = datetime.now()
formatted = f"{now.strftime('%A')} {now.strftime('%B %d, %Y %H:%M')}"
text = text.replace("[[CURRENT_DATETIME]]", formatted)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: datetime.now() is not timezone-aware. Consider using datetime.now(timezone.utc) for consistency with other parts of the codebase that use UTC timestamps

# Replace datetime marker with a readable current datetime for LLMs
if "[[CURRENT_DATETIME]]" in text:
now = datetime.now()
formatted = f"{now.strftime('%A')} {now.strftime('%B %d, %Y %H:%M')}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: The formatting here differs from get_current_llm_day_time() in prompt_utils.py which formats as 'The current day and time is {day} {date}'. Consider using the existing function for consistency

Comment on lines 47 to 56
def _postprocess(self, text: str) -> str:
"""Apply global replacements such as [[CURRENT_DATETIME]]."""
if not text:
return text
# Replace datetime marker with a readable current datetime for LLMs
if "[[CURRENT_DATETIME]]" in text:
now = datetime.now()
formatted = f"{now.strftime('%A')} {now.strftime('%B %d, %Y %H:%M')}"
text = text.replace("[[CURRENT_DATETIME]]", formatted)
return text
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: This duplicates datetime replacement logic already present in handle_onyx_date_awareness() in prompt_utils.py. Consider consolidating the implementations

…etime_replacement.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 issues found across 6 files

Prompt for AI agents (all 6 issues)

Understand the root cause of the following 6 issues and fix them.


<file name="backend/onyx/prompts/prompt_template.py">

<violation number="1" location="backend/onyx/prompts/prompt_template.py:52">
Avoid hardcoding the &#39;[[CURRENT_DATETIME]]&#39; token; centralize via a shared constant to prevent drift and keep formatting behavior consistent.</violation>

<violation number="2" location="backend/onyx/prompts/prompt_template.py:54">
Duplicate LLM datetime formatting; reuse get_current_llm_day_time(..., full_sentence=False, include_day_of_week=True) to keep a single source of truth.</violation>
</file>

<file name="backend/tests/external_dependency_unit/answer/conftest.py">

<violation number="1" location="backend/tests/external_dependency_unit/answer/conftest.py:32">
Avoid printing raw exception text; it can leak sensitive information. Print only safe metadata (e.g., exception type).</violation>

<violation number="2" location="backend/tests/external_dependency_unit/answer/conftest.py:44">
len(json.get(&quot;texts&quot;, [])) can raise TypeError when texts is None; guard with or [] to ensure a list.</violation>

<violation number="3" location="backend/tests/external_dependency_unit/answer/conftest.py:73">
Incorrect patch target for query_vespa; mock likely won’t apply. Patch chunk_retrieval.query_vespa instead to prevent real Vespa calls.</violation>
</file>

<file name="backend/tests/external_dependency_unit/answer/test_stream_chat_message_objects.py">

<violation number="1" location="backend/tests/external_dependency_unit/answer/test_stream_chat_message_objects.py:81">
Comment claims allowed_tool_ids is set to [], but the request does not set it, causing misleading documentation. Update the comment or pass allowed_tool_ids=[].</violation>
</file>

React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.

):
resp = MagicMock()
if "encoder/bi-encoder-embed" in url:
num_texts = len(json.get("texts", [])) if json else 1
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

len(json.get("texts", [])) can raise TypeError when texts is None; guard with or [] to ensure a list.

Prompt for AI agents
Address the following comment on backend/tests/external_dependency_unit/answer/conftest.py at line 44:

<comment>len(json.get(&quot;texts&quot;, [])) can raise TypeError when texts is None; guard with or [] to ensure a list.</comment>

<file context>
@@ -0,0 +1,80 @@
+    ):
+        resp = MagicMock()
+        if &quot;encoder/bi-encoder-embed&quot; in url:
+            num_texts = len(json.get(&quot;texts&quot;, [])) if json else 1
+            resp.status_code = 200
+            resp.json.return_value = {&quot;embeddings&quot;: [[0.0] * 768] * num_texts}
</file context>
Suggested change
num_texts = len(json.get("texts", [])) if json else 1
num_texts = len(json.get("texts") or []) if json else 1
Fix with Cubic

@pytest.fixture
def mock_vespa_query():
"""Stub Vespa query to a safe empty response to avoid CI flakiness."""
with patch("onyx.document_index.vespa.index.query_vespa", return_value=[]):
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect patch target for query_vespa; mock likely won’t apply. Patch chunk_retrieval.query_vespa instead to prevent real Vespa calls.

Prompt for AI agents
Address the following comment on backend/tests/external_dependency_unit/answer/conftest.py at line 73:

<comment>Incorrect patch target for query_vespa; mock likely won’t apply. Patch chunk_retrieval.query_vespa instead to prevent real Vespa calls.</comment>

<file context>
@@ -0,0 +1,80 @@
+@pytest.fixture
+def mock_vespa_query():
+    &quot;&quot;&quot;Stub Vespa query to a safe empty response to avoid CI flakiness.&quot;&quot;&quot;
+    with patch(&quot;onyx.document_index.vespa.index.query_vespa&quot;, return_value=[]):
+        yield
+
</file context>
Fix with Cubic

from tests.external_dependency_unit.conftest import create_test_user


def test_stream_chat_current_date_response(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! I've always avoided tests that hit external APIs (seems we're hitting an LLM api) but as a smoke test it's probably pretty reasonable

I guess integration tests also hit external APIs

@Weves Weves merged commit 11ec603 into main Sep 16, 2025
52 of 53 checks passed
@Weves Weves deleted the fix-current-time branch September 16, 2025 21:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants