Skip to content

Conversation

Weves
Copy link
Contributor

@Weves Weves commented Sep 6, 2025

Description

Fixes https://linear.app/danswer/issue/DAN-2419/merge-prompt-and-persona-table

How Has This Been Tested?

Locally + new tests

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

Summary by cubic

Merges Prompt into Persona and removes the prompt table. Prompt configuration is now embedded in personas; APIs, models, seeding, and tests updated to drop prompt_id and use persona fields.

  • Refactors

    • Database: add system_prompt, task_prompt, include_citations, datetime_aware, is_default_prompt to persona; migrate data; drop prompt and persona__prompt; remove chat_message.prompt_id and backfill persona where possible.
    • Models/API: remove Prompt model and relationships; ChatMessage no longer references a prompt; CreateChatMessageRequest and PersonaUpsertRequest drop prompt_id(s); prompt utils and builders now read from Persona.
    • Seeding: prompts.yml now creates personas with embedded prompt fields; personas.yml accepts prompt_config for these fields.
    • Server: persona create/update endpoints embed prompt fields directly; Slack channel config and assistants API updated.
    • Compatibility: helpers now live in persona.py (get_default_prompt, get_prompts_by_ids, get_prompt_by_name, upsert_prompt) and return/use Persona.
  • Migration

    • Run the Alembic migration to merge prompts into personas and drop legacy tables/columns.
    • Update clients:
      • Stop sending prompt_id in chat and persona APIs; use persona_id and provide system_prompt, task_prompt, include_citations, datetime_aware via persona.
      • Replace any prompt ID references with persona IDs.
    • Ensure a default persona exists; migration marks prior default prompts as is_default_prompt.
    • Review YAML seeds: prompts are now personas; personas.yml can set prompt_config.

Copy link

vercel bot commented Sep 6, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
internal-search Ready Ready Preview Comment Sep 10, 2025 5:21pm

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR implements a major database schema refactoring that merges the separate Prompt and Persona tables into a single unified Persona table. The change eliminates the complexity of managing separate prompt entities by embedding prompt configuration directly within personas.

Key Changes:

  • Database Schema: Adds new columns (system_prompt, task_prompt, datetime_aware, is_default_prompt) to the persona table and removes the prompt table and persona__prompt junction table
  • Migration: Comprehensive Alembic migration (abbfec3a5ac5_merge_prompt_into_persona.py) handles data transfer from prompt/persona__prompt tables to embedded persona fields, updates chat message references, and drops legacy tables
  • API Simplification: Removes prompt_id parameters throughout the codebase - from CreateChatMessageRequest, PersonaUpsertRequest, chat APIs, and assistant endpoints
  • Model Updates: PromptConfig.from_model() now accepts Persona objects instead of Prompt objects; prompt-related functions moved from prompts.py to persona.py with compatibility shims
  • Seeding Changes: Replaces YAML-based prompt loading with programmatic persona definitions in prebuilt_personas.py, eliminating separate prompt entity management
  • Code Cleanup: Updates all prompt access patterns from persona.prompts[0].system_prompt to direct field access persona.system_prompt or "" across agent search, chat utilities, and Slack handlers

The refactoring consolidates what was previously a many-to-many relationship (which in practice was mostly one-to-one) into embedded fields, simplifying queries, reducing joins, and making the API more intuitive. All prompt functionality is preserved but now accessed through the persona context rather than separate entities.

Confidence score: 2/5

  • This PR has significant architectural changes that could cause data corruption or system failures if the migration logic is flawed
  • Critical issues exist in the migration script including incorrect column existence checks and potentially wrong alternate_assistant_id usage for chat message migration
  • Database migration complexity introduces substantial risk during deployment, with potential for data loss if rollback is needed

52 files reviewed, 16 comments

Edit Code Review Bot Settings | Greptile

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

16 issues found across 54 files

React with 👍 or 👎 to teach cubic. You can also tag @cubic-dev-ai to give feedback, ask questions, or re-run the review.

datetime_aware=model.datetime_aware,
system_prompt=override_system_prompt or model.system_prompt or "",
task_prompt=override_task_prompt or model.task_prompt or "",
datetime_aware=model.datetime_aware or True,
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using or True forces datetime_aware to True even when the persona explicitly sets False; honor False by using the model value (or only default if None).

Prompt for AI agents
Address the following comment on backend/onyx/chat/models.py at line 283:

<comment>Using `or True` forces datetime_aware to True even when the persona explicitly sets False; honor False by using the model value (or only default if None).</comment>

<file context>
@@ -270,17 +270,17 @@ class PromptConfig(BaseModel):
-            datetime_aware=model.datetime_aware,
+            system_prompt=override_system_prompt or model.system_prompt or &quot;&quot;,
+            task_prompt=override_task_prompt or model.task_prompt or &quot;&quot;,
+            datetime_aware=model.datetime_aware or True,
         )
 
</file context>

def from_model(cls, persona: Persona) -> "PromptSnapshot":
"""Create PromptSnapshot from persona's embedded prompt fields"""
if persona.deleted:
raise ValueError("Persona has been deleted")
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raising here breaks allow_deleted behavior: FullPersonaSnapshot.from_model may proceed when allow_deleted=True but this call still raises, causing an unexpected error. Remove the hard raise or thread allow_deleted through.

Prompt for AI agents
Address the following comment on backend/onyx/server/features/persona/models.py at line 34:

<comment>Raising here breaks allow_deleted behavior: FullPersonaSnapshot.from_model may proceed when allow_deleted=True but this call still raises, causing an unexpected error. Remove the hard raise or thread allow_deleted through.</comment>

<file context>
@@ -29,18 +28,19 @@ class PromptSnapshot(BaseModel):
+    def from_model(cls, persona: Persona) -&gt; &quot;PromptSnapshot&quot;:
+        &quot;&quot;&quot;Create PromptSnapshot from persona&#39;s embedded prompt fields&quot;&quot;&quot;
+        if persona.deleted:
+            raise ValueError(&quot;Persona has been deleted&quot;)
 
         return PromptSnapshot(
</file context>

return None


class PydanticListType(TypeDecorator):
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing cache_ok on custom TypeDecorator may disable SQLAlchemy type caching and emit warnings; set cache_ok = True on the class.

Prompt for AI agents
Address the following comment on backend/onyx/db/pydantic_type.py at line 35:

<comment>Missing cache_ok on custom TypeDecorator may disable SQLAlchemy type caching and emit warnings; set cache_ok = True on the class.</comment>

<file context>
@@ -30,3 +30,27 @@ def process_result_value(
         return None
+
+
+class PydanticListType(TypeDecorator):
+    impl = JSONB
+
</file context>
Fix with Cubic

if "instructions" in update_data and persona.prompts:
persona.prompts[0].system_prompt = update_data["instructions"]
if "instructions" in update_data:
persona.system_prompt = update_data["instructions"]
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modify_assistant assigns list[dict] to persona.tools relationship, causing SQLAlchemy error on commit.

Prompt for AI agents
Address the following comment on backend/onyx/server/openai_assistants_api/asssistants_api.py at line 201:

<comment>modify_assistant assigns list[dict] to persona.tools relationship, causing SQLAlchemy error on commit.</comment>

<file context>
@@ -208,8 +197,8 @@ def modify_assistant(
-    if &quot;instructions&quot; in update_data and persona.prompts:
-        persona.prompts[0].system_prompt = update_data[&quot;instructions&quot;]
+    if &quot;instructions&quot; in update_data:
+        persona.system_prompt = update_data[&quot;instructions&quot;]
 
     db_session.commit()
</file context>
Fix with Cubic

Copy link
Contributor

@evan-onyx evan-onyx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some migration issues and minor code cleanups


# revision identifiers, used by Alembic.
revision = "abbfec3a5ac5"
down_revision = "8818cf73fa1a"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

confirm before merge

]
mismatches: list[tuple[str, object, object]] = []

if fetched_persona.name != persona.name:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls make a list of (tag, expected, actual) tuples and iterate over it

@Weves Weves merged commit 458ed93 into main Sep 10, 2025
16 of 17 checks passed
@Weves Weves deleted the merge-prompt-and-persona branch September 10, 2025 17:21
AnkitTukatek pushed a commit to TukaTek/onyx that referenced this pull request Sep 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants