feat: frontend refactor + DR #5225

Weves · 2025-08-20T17:41:17Z

Description

Fixes https://linear.app/danswer/issue/DAN-2269/deep-research-ui
Fixes https://linear.app/danswer/issue/DAN-2270/deep-research-backend

How Has This Been Tested?

Tested locally

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

This PR should be backported (make sure to check that the backport attempt succeeds)
[Optional] Override Linear Check

Summary by cubic

Adds Deep Research v2 with a clarifier→orchestrator→closer workflow and packetized streaming, plus DB support for research iterations and answer purpose. Updates KB search, tools, and chat piping to the new streaming model.

New Features
- Deep Research agent with nodes (Clarifier, Orchestrator, Closer) and sub-agents: internal search, internet search, knowledge graph, image generation, and custom tool.
- Packet-based streaming model (MessageStart/Delta, SectionEnd, SearchTool*, ImageGeneration*, Reasoning*, Citation*, OverallStop) and unified streaming utils.
- Database: adds research_type, research_plan, research_answer_purpose on chat messages; new research_agent_iteration tables; migration to move prior agent data.
- Tools: new KnowledgeGraphTool; all tools expose a stable id; builder wires ids through constructors.
- KB search refactor: clearer step descriptions, streaming helpers renamed, improved answer generation and state shape.
- Improved citation processing with a graph-based processor; consistent CitationInfo usage.
- Chat API/server: returns packets for messages, supports replay from DB, and routes streaming through new models.
- Prompts: comprehensive DR prompts and a safe PromptTemplate helper.
- Config: adds research_type to GraphSearchConfig; KG Beta persona description and constants.

Migration
- Run Alembic migrations (adds research fields/tables and migrates legacy agent data).
- Tool constructors now require tool_id; ensure custom/image/internet tools pass it.
- grounded_source_name is nullable; KG extraction and configs updated accordingly.

vercel · 2025-08-20T17:41:23Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Preview	Comments	Updated (UTC)
internal-search	Ready	Preview	Comment	Aug 26, 2025 5:45am

evan-onyx

some nits

evan-onyx · 2025-08-24T22:08:55Z

backend/onyx/agents/agent_search/basic/utils.py

+                    and generate_final_answer
+                    and response_part.answer_piece
+                ):
+                    if chat_message_id is None:


This should be checked outside the for loop
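A minimal sketch of the hoisting suggested above, assuming the check guards a loop-invariant argument. The function name, the tuple output, and the error message are hypothetical and are not taken from `utils.py`.

```python
from typing import Optional


def collect_answer_pieces(
    pieces: list[str],
    generate_final_answer: bool,
    chat_message_id: Optional[int],
) -> list[tuple[int, str]]:
    # chat_message_id does not change between iterations, so validate it
    # once up front instead of on every streamed piece.
    if generate_final_answer and chat_message_id is None:
        raise ValueError("chat_message_id is required when generating an answer")

    collected: list[tuple[int, str]] = []
    for piece in pieces:
        # Empty pieces are skipped, mirroring the truthiness test on
        # response_part.answer_piece in the reviewed diff.
        if generate_final_answer and piece:
            collected.append((chat_message_id, piece))
    return collected
```

With the check hoisted, a missing id fails before any piece is processed rather than partway through the stream.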

evan-onyx · 2025-08-24T22:10:52Z

backend/onyx/agents/agent_search/basic/utils.py

-                )
+
+                if (
+                    hasattr(response_part, "answer_piece")


this should check isinstance(response_part, (OnyxAnswerPiece, AgentAnswerPiece, ...))
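A sketch of the `isinstance` variant, assuming the answer-piece classes share an `answer_piece` attribute. The dataclasses below are hypothetical stand-ins that only reuse the class names from the comment; `ToolKickoff` is invented for contrast.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OnyxAnswerPiece:  # stand-in for the real class
    answer_piece: Optional[str]


@dataclass
class AgentAnswerPiece:  # stand-in for the real class
    answer_piece: Optional[str]


@dataclass
class ToolKickoff:  # hypothetical unrelated packet type
    tool_name: str


# Listing the accepted types explicitly lets a type checker narrow
# response_part, which hasattr() cannot do.
ANSWER_PIECE_TYPES = (OnyxAnswerPiece, AgentAnswerPiece)


def extract_answer_piece(response_part: object) -> Optional[str]:
    if isinstance(response_part, ANSWER_PIECE_TYPES):
        return response_part.answer_piece
    return None
```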

evan-onyx · 2025-08-24T22:13:22Z

backend/onyx/agents/agent_search/basic/utils.py

+                        writer,
+                    )
+
+                else:


comment explaining what this case is used for

evan-onyx · 2025-08-24T22:13:55Z

backend/onyx/agents/agent_search/basic/utils.py

+                    write_custom_event(
+                        ind,
+                        MessageDelta(
+                            content=response_part.answer_piece, type="message_delta"


message_delta should be a global constant

Done. Removed all message_delta as they are defaults of the class anyway with only one allowed value
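For illustration, a field with a single allowed value can carry that value as its default, so call sites do not need to pass it. This sketch uses a standard-library dataclass; the real `MessageDelta` may be defined differently (for example as a Pydantic model).

```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class MessageDelta:
    content: str
    # Only one value is permitted, so it doubles as the default.
    type: Literal["message_delta"] = "message_delta"


delta = MessageDelta(content="Hello")
```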

evan-onyx · 2025-08-24T22:16:26Z

backend/onyx/agents/agent_search/dr/dr_prompt_builder.py

+        f"{tool_name}: {tool.cost}" for tool_name, tool in available_tools.items()
+    )
+
+    tool_differentiations: list[str] = []


turn this into a list comp

evan-onyx · 2025-08-25T00:36:33Z

backend/onyx/prompts/dr_prompts.py

+focussing on providing the citations and providing some answer facts. But the \
+main content should be in the cited documents for each sub-question.
+ - Pay close attention to whether the sub-answers mention whether the topic of interest \
+was explicitly mentioned! If not you cannot reliably use that information to construct your answer, \


not you -> delete "not"

evan-onyx · 2025-08-25T00:37:09Z

backend/onyx/prompts/dr_prompts.py

+FINAL_ANSWER_PROMPT_WITHOUT_SUB_ANSWERS = PromptTemplate(
+    f"""
+You are great at answering a user question based \
+a list of documents that were retrieved in response to subh-questions, and possibly also \


subh -> sub

evan-onyx · 2025-08-25T00:51:46Z

backend/onyx/prompts/dr_prompts.py

+they MUST NOT be part of the rewritten search query... take it out in that case! \
+Particularly look for expressions like '...in our Google docs...', '...in our \
+Google calls', etc., in which case the source type is 'google_drive' or 'gong' \
+should not be included in the rewritten query!


add "which" or "and" to the start of the line

changed wording a bit, though differently!

evan-onyx · 2025-08-25T00:52:59Z

backend/onyx/prompts/dr_prompts.py

+You are great at 1) determining whether a question can be answered \
+by you directly using your knowledge alone and the chat history (if any), and 2) actually \
+answering the question/request, \
+if the request DOES NOT require or would strongly benefit from ANY external tool \


evan-onyx · 2025-08-25T00:54:11Z

backend/onyx/prompts/dr_prompts.py

+"""
+
+
+"""


should be commented out

cubic-dev-ai

40 issues found across 297 files

Note: This PR contains a large number of files. cubic only reviews up to 150 files per PR, so some files may not have been reviewed.


cubic-dev-ai · 2025-08-25T22:57:38Z

backend/onyx/agents/agent_search/dr/utils.py

+    return (
+        "...\n"
+        if len(chat_history) > len(past_messages)
+        else ""


Missing concatenation in ternary return causes syntax error or drops chat history content.

Prompt for AI agents

Address the following comment on backend/onyx/agents/agent_search/dr/utils.py at line 192: <comment>Missing concatenation in ternary return causes syntax error or drops chat history content.</comment> <file context> @@ -0,0 +1,333 @@ +import re + +from langchain.schema.messages import BaseMessage +from langchain.schema.messages import HumanMessage +from sqlalchemy.orm import Session + +from onyx.agents.agent_search.dr.enums import ResearchAnswerPurpose +from onyx.agents.agent_search.dr.enums import ResearchType +from onyx.agents.agent_search.dr.models import AggregatedDRContext </file context>
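One way to avoid the precedence problem described above is to compute the truncation marker separately and then concatenate it explicitly. This is a sketch under assumptions: the function name, the `max_messages` parameter, and the plain-string history are hypothetical, and only the `"...\n"` prefix logic comes from the excerpt.

```python
def format_recent_history(chat_history: list[str], max_messages: int) -> str:
    # Keep only the most recent messages (none if max_messages is 0).
    past_messages = chat_history[-max_messages:] if max_messages > 0 else []

    # Evaluate the conditional on its own so it cannot swallow the join below.
    prefix = "...\n" if len(chat_history) > len(past_messages) else ""
    return prefix + "\n".join(past_messages)
```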

cubic-dev-ai · 2025-08-25T22:57:38Z

backend/onyx/agents/agent_search/dr/sub_agents/basic_search/dr_basic_search_2_act.py

+            claims,
+        ) = extract_document_citations(answer_string, claims)
+        cited_documents = {
+            citation_number: retrieved_docs[citation_number - 1]


Out-of-bounds citation indexing may raise IndexError in DEEP path

Prompt for AI agents

Address the following comment on backend/onyx/agents/agent_search/dr/sub_agents/basic_search/dr_basic_search_2_act.py at line 223: <comment>Out-of-bounds citation indexing may raise IndexError in DEEP path</comment> <file context> @@ -0,0 +1,258 @@ +import re +from datetime import datetime +from typing import cast + +from langchain_core.runnables import RunnableConfig +from langgraph.types import StreamWriter + +from onyx.agents.agent_search.dr.enums import ResearchType +from onyx.agents.agent_search.dr.models import BaseSearchProcessingResponse </file context>
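A possible guard, sketched with plain strings standing in for documents. The function name is hypothetical. The lower bound matters as much as the upper one: citation number 0 would index `retrieved_docs[-1]` and silently return the last document instead of raising.

```python
def map_cited_documents(
    citation_numbers: list[int], retrieved_docs: list[str]
) -> dict[int, str]:
    # Citations are 1-based; drop any number that does not point at a
    # retrieved document.
    return {
        number: retrieved_docs[number - 1]
        for number in citation_numbers
        if 1 <= number <= len(retrieved_docs)
    }
```

Whether to drop invalid citations, log them, or raise is a design choice the thread does not settle; this sketch drops them.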

cubic-dev-ai · 2025-08-25T22:57:38Z

backend/onyx/agents/agent_search/dr/sub_agents/basic_search/dr_basic_search_2_act.py

+    if not state.available_tools:
+        raise ValueError("available_tools is not set")
+
+    search_tool_info = state.available_tools[state.tools_used[-1]]


Potential IndexError/KeyError accessing last used tool without guards

Prompt for AI agents

Address the following comment on backend/onyx/agents/agent_search/dr/sub_agents/basic_search/dr_basic_search_2_act.py at line 69: <comment>Potential IndexError/KeyError accessing last used tool without guards</comment> <file context> @@ -0,0 +1,258 @@ +import re +from datetime import datetime +from typing import cast + +from langchain_core.runnables import RunnableConfig +from langgraph.types import StreamWriter + +from onyx.agents.agent_search.dr.enums import ResearchType +from onyx.agents.agent_search.dr.models import BaseSearchProcessingResponse </file context>
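The missing guards could look roughly like this. The function name and the string-valued tool info are hypothetical; only `available_tools`, `tools_used`, and the first error message come from the excerpt.

```python
def resolve_last_tool(available_tools: dict[str, str], tools_used: list[str]) -> str:
    if not available_tools:
        raise ValueError("available_tools is not set")
    # Guard the [-1] access: an empty list would raise IndexError.
    if not tools_used:
        raise ValueError("tools_used is empty; no tool has been selected yet")

    last_tool = tools_used[-1]
    # Guard the lookup: an unknown name would raise KeyError.
    tool_info = available_tools.get(last_tool)
    if tool_info is None:
        raise ValueError(f"tool {last_tool!r} is not in available_tools")
    return tool_info
```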

cubic-dev-ai · 2025-08-25T22:57:38Z

backend/onyx/db/slack_channel_config.py

 from onyx.db.persona import upsert_persona
 from onyx.db.prompts import get_default_prompt
-from onyx.tools.built_in_tools import get_search_tool
+from onyx.tools.built_in_tools import get_builtin_tool


Incorrect SQLAlchemy filter uses is (Python identity) instead of SQL expression, making the default-exists check always false.

Prompt for AI agents

Address the following comment on backend/onyx/db/slack_channel_config.py at line 19: <comment>Incorrect SQLAlchemy filter uses `is` (Python identity) instead of SQL expression, making the default-exists check always false.</comment> <file context> @@ -16,7 +16,8 @@ from onyx.db.persona import mark_persona_as_deleted from onyx.db.persona import upsert_persona from onyx.db.prompts import get_default_prompt -from onyx.tools.built_in_tools import get_search_tool +from onyx.tools.built_in_tools import get_builtin_tool +from onyx.tools.tool_implementations.search.search_tool import SearchTool from onyx.utils.errors import EERequiredError </file context>
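To see why `is` cannot work in a query filter: Python evaluates `is` as an identity test immediately and gives the column object no chance to build an expression, whereas `==` and ordinary methods can be overridden. The `Column` class below is a toy stand-in written for this illustration and is not SQLAlchemy's API; SQLAlchemy columns offer an `is_()` method for the same purpose.

```python
class Column:
    """Toy expression builder, for illustration only."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, other: object) -> str:  # type: ignore[override]
        # Overridable: returns a SQL-like fragment instead of a bool.
        return f"{self.name} = {other!r}"

    def is_(self, other: object) -> str:
        # A method is needed because `is` itself cannot be overridden.
        return f"{self.name} IS {str(other).upper()}"


is_default = Column("is_default")

# Identity test: a Column object is never the singleton True.
identity_result = is_default is True

# Expression-building alternative.
expression = is_default.is_(True)
```

Here `identity_result` is a plain `False`, which is why a filter written with `is` never matches any rows.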

cubic-dev-ai · 2025-08-25T22:57:38Z

backend/onyx/tools/tool_implementations/knowledge_graph/knowledge_graph_tool.py

+        llm: LLM,
+        force_run: bool = False,
+    ) -> dict[str, Any] | None:
+        raise ValueError(


KnowledgeGraphTool is registered as built-in but all execution methods raise ValueError; if surfaced, selecting or evaluating it will crash both tool-calling and non-tool-calling flows.

Prompt for AI agents

Address the following comment on backend/onyx/tools/tool_implementations/knowledge_graph/knowledge_graph_tool.py at line 69: <comment>KnowledgeGraphTool is registered as built-in but all execution methods raise ValueError; if surfaced, selecting or evaluating it will crash both tool-calling and non-tool-calling flows.</comment> <file context> @@ -0,0 +1,106 @@ +from collections.abc import Generator +from typing import Any + +from onyx.chat.prompt_builder.answer_prompt_builder import AnswerPromptBuilder +from onyx.llm.interfaces import LLM +from onyx.llm.models import PreviousMessage +from onyx.tools.message import ToolCallSummary +from onyx.tools.models import ToolResponse +from onyx.tools.tool import Tool </file context>

Leave for now

cubic-dev-ai · 2025-08-25T22:57:42Z

backend/onyx/tools/tool.py

+    # TODO: extra review regarding coding style
+    @property
+    def llm_name(self) -> str:
+        return self.display_name


llm_name returns display_name, which is not guaranteed unique and can collide when used as a dict key; prefer a stable unique identifier (e.g., name) to avoid overwriting tools.

Prompt for AI agents

Address the following comment on backend/onyx/tools/tool.py at line 48: <comment>llm_name returns display_name, which is not guaranteed unique and can collide when used as a dict key; prefer a stable unique identifier (e.g., name) to avoid overwriting tools.</comment> <file context> @@ -35,6 +40,13 @@ def description(self) -> str: def display_name(self) -> str: raise NotImplementedError + # Added to make tools work better with LLMs in prompts. Should be unique + # TODO: looks at ways how to best ensure uniqueness. + # TODO: extra review regarding coding style + @property + def llm_name(self) -> str: + return self.display_name </file context>

Suggested change

-        return self.display_name
+        return self.name
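A small sketch of the collision the comment warns about, using a hypothetical `ToolStub` with invented tool names. Two tools that share a display name overwrite each other when the display name is the dict key.

```python
from dataclasses import dataclass


@dataclass
class ToolStub:  # hypothetical stand-in for a Tool
    name: str
    display_name: str


tools = [
    ToolStub(name="internal_search", display_name="Search"),
    ToolStub(name="internet_search", display_name="Search"),
]

# Keyed by display name: the second tool replaces the first.
by_display_name = {tool.display_name: tool for tool in tools}

# Keyed by the unique name: both tools survive.
by_name = {tool.name: tool for tool in tools}
```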

cubic-dev-ai · 2025-08-25T22:57:42Z

backend/onyx/agents/agent_search/dr/graph_builder.py

+    graph.add_node(DRPath.ORCHESTRATOR, orchestrator)
+
+    basic_search_graph = dr_basic_search_graph_builder().compile()
+    graph.add_node(DRPath.INTERNAL_SEARCH, basic_search_graph)


Compiled subgraphs with SubAgentMainState are added as nodes to a parent graph with MainState, causing state schema mismatch at runtime

Prompt for AI agents

Address the following comment on backend/onyx/agents/agent_search/dr/graph_builder.py at line 50: <comment>Compiled subgraphs with SubAgentMainState are added as nodes to a parent graph with MainState, causing state schema mismatch at runtime</comment> <file context> @@ -0,0 +1,88 @@ +from langgraph.graph import END +from langgraph.graph import START +from langgraph.graph import StateGraph + +from onyx.agents.agent_search.dr.conditional_edges import completeness_router +from onyx.agents.agent_search.dr.conditional_edges import decision_router +from onyx.agents.agent_search.dr.enums import DRPath +from onyx.agents.agent_search.dr.nodes.dr_a0_clarification import clarifier +from onyx.agents.agent_search.dr.nodes.dr_a1_orchestrator import orchestrator </file context>

cubic-dev-ai · 2025-08-25T22:57:42Z

backend/alembic/versions/5ae8240accb3_add_research_agent_database_tables_and_.py

+            sa.ForeignKey("chat_message.id", ondelete="CASCADE"),
+            nullable=False,
+        ),
+        sa.Column("iteration_nr", sa.Integer(), nullable=False),


Sub-steps are not linked to a specific iteration via a foreign key, allowing inconsistent data. Add an iteration_id FK (or a composite FK using a unique constraint) to enforce referential integrity.

Prompt for AI agents

Address the following comment on backend/alembic/versions/5ae8240accb3_add_research_agent_database_tables_and_.py at line 42: <comment>Sub-steps are not linked to a specific iteration via a foreign key, allowing inconsistent data. Add an iteration_id FK (or a composite FK using a unique constraint) to enforce referential integrity.</comment> <file context> @@ -0,0 +1,102 @@ +"""add research agent database tables and chat message research fields + +Revision ID: 5ae8240accb3 +Revises: b558f51620b4 +Create Date: 2025-08-06 14:29:24.691388 + +""" + +from alembic import op </file context>
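The composite-FK option can be sketched with the standard-library `sqlite3` module as a stand-in for the real database. Table and column names other than `chat_message_id` and `iteration_nr` are hypothetical, and the actual migration is written with Alembic rather than raw SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE research_iteration (
        id INTEGER PRIMARY KEY,
        chat_message_id INTEGER NOT NULL,
        iteration_nr INTEGER NOT NULL,
        UNIQUE (chat_message_id, iteration_nr)
    );
    CREATE TABLE research_sub_step (
        id INTEGER PRIMARY KEY,
        chat_message_id INTEGER NOT NULL,
        iteration_nr INTEGER NOT NULL,
        FOREIGN KEY (chat_message_id, iteration_nr)
            REFERENCES research_iteration (chat_message_id, iteration_nr)
            ON DELETE CASCADE
    );
    """
)

conn.execute(
    "INSERT INTO research_iteration (chat_message_id, iteration_nr) VALUES (1, 1)"
)
conn.execute(
    "INSERT INTO research_sub_step (chat_message_id, iteration_nr) VALUES (1, 1)"
)


def try_insert_sub_step(chat_message_id: int, iteration_nr: int) -> bool:
    """Return True if the sub-step row was accepted by the database."""
    try:
        conn.execute(
            "INSERT INTO research_sub_step (chat_message_id, iteration_nr) "
            "VALUES (?, ?)",
            (chat_message_id, iteration_nr),
        )
    except sqlite3.IntegrityError:
        return False
    return True
```

With the composite key in place, a sub-step that names a non-existent iteration is rejected by the database instead of being stored as an orphan.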

cubic-dev-ai · 2025-08-25T22:57:42Z

web/src/app/chat/components/input/ChatInputBar.tsx

@@ -395,19 +298,14 @@ export function ChatInputBar({

  const handleKeyDown = (e: React.KeyboardEvent<HTMLTextAreaElement>) => {
    if (
-      ((showSuggestions && assistantTagOptions.length > 0) || showPrompts) &&
+      (showSuggestions || showPrompts) &&


Enter/Tab handling intercepts keystrokes when showSuggestions is true but no assistant suggestions are rendered, blocking sends after typing a trailing @mention.

(Based on your team's feedback about keeping UX regressions out of refactors and ensuring removed features don't leave dead-end UI states.)

Prompt for AI agents

Address the following comment on web/src/app/chat/components/input/ChatInputBar.tsx at line 301: <comment>Enter/Tab handling intercepts keystrokes when showSuggestions is true but no assistant suggestions are rendered, blocking sends after typing a trailing @mention. (Based on your team's feedback about keeping UX regressions out of refactors and ensuring removed features don't leave dead-end UI states.)</comment> <file context> @@ -395,19 +298,14 @@ export function ChatInputBar({ const handleKeyDown = (e: React.KeyboardEvent<HTMLTextAreaElement>) => { if ( - ((showSuggestions && assistantTagOptions.length > 0) || showPrompts) && + (showSuggestions || showPrompts) && (e.key === "Tab" || e.key == "Enter") ) { </file context>

Suggested change

-      (showSuggestions || showPrompts) &&
+      showPrompts &&

cubic-dev-ai · 2025-08-25T22:57:42Z

backend/onyx/prompts/dr_prompts.py

+GENERAL_DR_ANSWER_PROMPT = PromptTemplate(
+    f"""\
+Below you see a user question and potentially an earlier chat history that can be referred to \
+for context. Also, today is {datetime.now().strftime("%Y-%m-%d")}.


Current date is interpolated at import time; should be provided at render time to avoid stale values.

Prompt for AI agents

Address the following comment on backend/onyx/prompts/dr_prompts.py at line 1202: <comment>Current date is interpolated at import time; should be provided at render time to avoid stale values.</comment> <file context> @@ -0,0 +1,1378 @@ +from datetime import datetime + +from onyx.agents.agent_search.dr.constants import MAX_DR_PARALLEL_SEARCH +from onyx.agents.agent_search.dr.enums import DRPath +from onyx.agents.agent_search.dr.enums import ResearchType +from onyx.prompts.prompt_template import PromptTemplate + + +# Standards </file context>
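A sketch of render-time interpolation, assuming the date is passed in when the prompt is built. `PROMPT_BODY`, `render_prompt`, and the `current_date` placeholder are hypothetical and do not reflect the `PromptTemplate` helper's actual interface.

```python
from datetime import datetime
from typing import Optional

# The date is left as a placeholder, so nothing is frozen at import time.
PROMPT_BODY = "Below you see a user question. Also, today is {current_date}."


def render_prompt(now: Optional[datetime] = None) -> str:
    # Resolve the date per call; a long-running server process otherwise
    # keeps reporting the day it was started.
    current = now if now is not None else datetime.now()
    return PROMPT_BODY.format(current_date=current.strftime("%Y-%m-%d"))
```

Accepting `now` as a parameter also makes the rendered prompt deterministic in tests.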

* squash: combine all DR commits into one
  Co-authored-by: Joachim Rahmfeld <joachim@onyx.app>
  Co-authored-by: Rei Meguro <rmeguro@umich.edu>
* Fixes
* show KG in Assistant only if available
* KG only usable for KG Beta (for now)
* base file upload
* raise error if uploaded context is too long
* improvements
* More improvements
* Fix citations
* better decision making
* improved decision-making in Orchestrator
* generic_internal tools
* Small tweak
* tool use improvements
* add on
* More image gen stuff
* fixes
* Small color improvements
* Markdown utils
* fixed end conditions (incl early exit for image generation)
* remove agent search + image fixes
* Okta tool support for reload
* Some cleanup
* Stream back search tool results as they come
* tool forcing
* fixed no-Tool-Assistant
* Support anthropic tool calling
* Support anthropic models better
* More stuff
* prompt fixes and search step numbers
* Fix hook ordering issue
* internal search fix
* Improve citation look
* Small UI improvements
* Improvements
* Improve dot
* Small chat fixes
* Small UI tweaks
* Small improvements
* Remove un-used code
* Fix
* Remove test_answer.py for now
* Fix
* improvements
* Add foreign keys
* early forcing
* Fix tests
* Fix tests

---------

Co-authored-by: Joachim Rahmfeld <joachim@onyx.app>
Co-authored-by: Rei Meguro <rmeguro@umich.edu>
Co-authored-by: joachim-danswer <joachim@danswer.ai>

vercel bot deployed to Preview August 20, 2025 17:44 View deployment

vercel bot deployed to Preview August 20, 2025 21:08 View deployment

vercel bot deployed to Preview August 20, 2025 21:33 View deployment

vercel bot deployed to Preview August 20, 2025 23:15 View deployment

vercel bot deployed to Preview August 20, 2025 23:52 View deployment

vercel bot deployed to Preview August 21, 2025 03:21 View deployment

vercel bot deployed to Preview August 21, 2025 03:36 View deployment

vercel bot deployed to Preview August 21, 2025 19:01 View deployment

vercel bot deployed to Preview August 21, 2025 21:35 View deployment

vercel bot deployed to Preview August 22, 2025 05:22 View deployment

vercel bot deployed to Preview August 22, 2025 18:40 View deployment

vercel bot deployed to Preview August 23, 2025 18:28 View deployment

vercel bot deployed to Preview August 23, 2025 20:47 View deployment

vercel bot deployed to Preview August 23, 2025 21:40 View deployment

vercel bot deployed to Preview August 24, 2025 01:53 View deployment

vercel bot deployed to Preview August 24, 2025 20:16 View deployment

vercel bot deployed to Preview August 24, 2025 21:07 View deployment

vercel bot deployed to Preview August 25, 2025 00:20 View deployment

vercel bot deployed to Preview August 25, 2025 00:43 View deployment

evan-onyx reviewed Aug 25, 2025

View reviewed changes

vercel bot deployed to Preview August 25, 2025 02:37 View deployment

vercel bot deployed to Preview August 25, 2025 03:03 View deployment

vercel bot deployed to Preview August 25, 2025 06:47 View deployment

Weves changed the title from "Dr merge v2" to "feat: frontend refactor + DR" on Aug 25, 2025

vercel bot deployed to Preview August 25, 2025 18:12 View deployment

vercel bot deployed to Preview August 25, 2025 18:20 View deployment

vercel bot deployed to Preview August 25, 2025 19:05 View deployment

vercel bot deployed to Preview August 25, 2025 19:08 View deployment

vercel bot deployed to Preview August 25, 2025 19:23 View deployment

Weves requested a review from a team as a code owner August 25, 2025 22:39

cubic-dev-ai bot reviewed Aug 25, 2025

View reviewed changes

Improve dot

3298e64

vercel bot deployed to Preview August 25, 2025 23:37 View deployment

Small chat fixes

044d3c4

vercel bot deployed to Preview August 26, 2025 00:26 View deployment

Small UI tweaks

5b0644f

vercel bot deployed to Preview August 26, 2025 02:09 View deployment

Weves added 2 commits August 25, 2025 19:36

Small improvements

a466369

Remove un-used code

6b1f0fe

vercel bot deployed to Preview August 26, 2025 02:45 View deployment

Weves added 2 commits August 25, 2025 19:46

Fix

f335c21

Remove test_answer.py for now

8ee9ccd

vercel bot deployed to Preview August 26, 2025 02:51 View deployment

Fix

84000ee

vercel bot deployed to Preview August 26, 2025 02:58 View deployment

joachim-danswer and others added 3 commits August 25, 2025 20:46

improvements

b3181d0

Add foreign keys

d3f1c68

Merge branch 'dr-cw' into dr-merge_v2

3ec000a

vercel bot deployed to Preview August 26, 2025 04:44 View deployment

joachim-danswer and others added 2 commits August 25, 2025 21:55

early forcing

9ce78cf

Fix tests

ca3b0c1

vercel bot deployed to Preview August 26, 2025 04:58 View deployment

Merge branch 'dr-cw-1' into dr-merge_v2

46f43ab

vercel bot deployed to Preview August 26, 2025 05:03 View deployment

Fix tests

e6b6309

vercel bot deployed to Preview August 26, 2025 05:45 View deployment

Weves merged commit 9d997e2 into main Aug 26, 2025
10 of 15 checks passed

Weves deleted the dr-merge_v2 branch August 26, 2025 07:26
