Skip to content

Conversation

raunakab
Copy link
Contributor

@raunakab raunakab commented Jul 30, 2025

Description

When compiling stats on entities, ungrounded entities were throwing errors because the logic expected that grounded_source_name to be non-null. However, it is possible for grounded_source_name to be null (hence the "ungrounded source"), which caused an error in pydantic serialization.

This PR fixes that issue by assigning a default name to all ungrounded entities when compiling the stats for the frontend.

Addresses: https://linear.app/danswer/issue/DAN-2255/fix-kg-admin-deserialization-serialization-issue.

How Has This Been Tested?

Untested.


Summary by cubic

Fixed errors when compiling entity stats by assigning a default name to ungrounded types before sending data to the frontend, ensuring proper serialization.

  • Bug Fixes
    • Ungrounded entities now use "Ungrounded" as a default name if no grounded source name is present.

Copy link

vercel bot commented Jul 30, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
internal-search ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jul 30, 2025 7:06pm

@raunakab raunakab marked this pull request as ready for review July 30, 2025 18:42
@raunakab raunakab requested a review from a team as a code owner July 30, 2025 18:42
@raunakab raunakab requested a review from evan-onyx July 30, 2025 18:42
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR fixes a Pydantic serialization error in the Knowledge Graph (KG) admin functionality. The issue occurred when compiling entity statistics for the frontend - ungrounded entities in the database can have null grounded_source_name values, but the SourceStatistics Pydantic model expects a non-null string for its source_name field, causing serialization failures.

The fix involves three coordinated changes:

  1. Making the constant public: The _UNGROUNDED_SOURCE_NAME constant in entity_type.py is renamed to UNGROUNDED_SOURCE_NAME to make it publicly accessible across modules.

  2. Adding fallback logic: In entities.py, the get_entity_stats_by_grounded_source_name() function now uses the null coalescing operator (or) to provide the default "Ungrounded" name when grounded_source_name is null.

  3. Improving variable naming: The API endpoint code in kg/api.py uses more descriptive variable names (source_name instead of key) for better code clarity.

This approach maintains consistency with existing patterns in the codebase - the same UNGROUNDED_SOURCE_NAME constant is already used in the get_configured_entity_types() function. The fix ensures that entity statistics can be properly compiled and serialized for the frontend while preserving the semantic meaning that some entities are "ungrounded" (not associated with a specific source).

Confidence score: 5/5

  • This is a safe, well-targeted fix that addresses a specific serialization bug without breaking existing functionality.
  • The solution follows established patterns in the codebase and maintains consistency across the KG system.
  • No files need additional attention - the changes are straightforward and the logic is sound.

3 files reviewed, 1 comment

Edit Code Review Bot Settings | Greptile

Comment on lines 333 to 334
# `row.grounded_source_name`` is NULLABLE in the database schema.
# Thus, for all "ungrounded" entity-types, we use a default name.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syntax: Extra backtick in the comment - should be 'row.grounded_source_name' instead of 'row.grounded_source_name``'

Suggested change
# `row.grounded_source_name`` is NULLABLE in the database schema.
# Thus, for all "ungrounded" entity-types, we use a default name.
# `row.grounded_source_name` is NULLABLE in the database schema.
# Thus, for all "ungrounded" entity-types, we use a default name.

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cubic analysis

1 issue found across 3 files • Review in cubic

React with 👍 or 👎 to teach cubic. You can also tag @cubic-dev-ai to give feedback, ask questions, or re-run the review.

Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Copy link
Contributor

@evan-onyx evan-onyx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@raunakab raunakab added this pull request to the merge queue Jul 30, 2025
Merged via the queue into main with commit f87d3e9 Jul 30, 2025
17 of 18 checks passed
@raunakab raunakab deleted the kg-admin-fix branch July 30, 2025 21:55
wenxi-onyx pushed a commit that referenced this pull request Aug 11, 2025
…ontend (#5133)

* Update names in map-comprehension

* Make default name for ungrounded types public

* Return the default name for ungrounded entity-types

* Update backend/onyx/db/entities.py

Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>

---------

Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
AnkitTukatek pushed a commit to TukaTek/onyx that referenced this pull request Sep 23, 2025
…ontend (onyx-dot-app#5133)

* Update names in map-comprehension

* Make default name for ungrounded types public

* Return the default name for ungrounded entity-types

* Update backend/onyx/db/entities.py

Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>

---------

Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants