Skip to content

Conversation

trial-danswer
Copy link
Collaborator

@trial-danswer trial-danswer commented Sep 11, 2025

Description

Partial completion of: https://linear.app/danswer/issue/DAN-2494/sync-permissions-database-and-backend-updates

Adding two two tables related to auto sync permissions:

  1. doc_permission_sync_attempt
  2. external_group_permission_sync_attempt

These will be used to track history of sync attempts and display them to the user.

PR Stack:

  1. feat: tables/migration for permission syncing attempts -> you are here
  2. feat: basic db methods to create and update permission sync attempts
  3. feat: modify existing permission sync tasks workflow to create/update attempt records
  4. feat: read latest permission sync from the frontend

How Has This Been Tested?

Ran a successful upgrade/downgrade via:

alembic upgrade 03d710ccf29c
alembic downgrade b7ec9b5b505f

And we see the tables created on upgrade:

postgres=# \d doc_permission_sync_attempt
                                                Table "public.doc_permission_sync_attempt"
            Column            |           Type           | Collation | Nullable |                         Default
------------------------------+--------------------------+-----------+----------+---------------------------------------------------------
 id                           | integer                  |           | not null | nextval('doc_permission_sync_attempt_id_seq'::regclass)
 connector_credential_pair_id | integer                  |           | not null |
 status                       | character varying(21)    |           | not null |
 total_docs_synced            | integer                  |           |          |
 docs_with_permission_errors  | integer                  |           |          |
 time_created                 | timestamp with time zone |           | not null | now()
 time_started                 | timestamp with time zone |           |          |
 time_finished                | timestamp with time zone |           |          |
Indexes:
    "doc_permission_sync_attempt_pkey" PRIMARY KEY, btree (id)
    "ix_doc_permission_sync_attempt_time_created" btree (time_created)
    "ix_permission_sync_attempt_latest_for_cc_pair" btree (connector_credential_pair_id, time_created)
    "ix_permission_sync_attempt_status_time" btree (status, time_finished DESC)
Foreign-key constraints:
    "doc_permission_sync_attempt_connector_credential_pair_id_fkey" FOREIGN KEY (connector_credential_pair_id) REFERENCES connector_credential_pair(id)

postgres=# \d external_group_permission_sync_attempt
                                                 Table "public.external_group_permission_sync_attempt"
             Column             |           Type           | Collation | Nullable |                              Default
--------------------------------+--------------------------+-----------+----------+--------------------------------------------------------------------
 id                             | integer                  |           | not null | nextval('external_group_permission_sync_attempt_id_seq'::regclass)
 connector_credential_pair_id   | integer                  |           |          |
 status                         | character varying(21)    |           | not null |
 total_users_processed          | integer                  |           |          |
 total_groups_processed         | integer                  |           |          |
 total_group_memberships_synced | integer                  |           |          |
 time_created                   | timestamp with time zone |           | not null | now()
 time_started                   | timestamp with time zone |           |          |
 time_finished                  | timestamp with time zone |           |          |
Indexes:
    "external_group_permission_sync_attempt_pkey" PRIMARY KEY, btree (id)
    "ix_external_group_permission_sync_attempt_time_created" btree (time_created)
    "ix_group_sync_attempt_cc_pair_time" btree (connector_credential_pair_id, time_created)
    "ix_group_sync_attempt_status_time" btree (status, time_finished DESC)
Foreign-key constraints:
    "external_group_permission_syn_connector_credential_pair_id_fkey" FOREIGN KEY (connector_credential_pair_id) REFERENCES connector_credential_pair(id)

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

Copy link

vercel bot commented Sep 11, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
internal-search Ready Ready Preview Comment Sep 12, 2025 11:10pm

@trial-danswer trial-danswer changed the title tables/migration for permission syncing attempts feat: tables/migration for permission syncing attempts Sep 11, 2025
@trial-danswer trial-danswer force-pushed the nikg/permission-sync/db-models-migrations branch from 15d2ac3 to 2a61ed7 Compare September 11, 2025 21:20
@trial-danswer trial-danswer force-pushed the nikg/permission-sync/db-models-migrations branch from 2a61ed7 to 3f1712a Compare September 11, 2025 22:00
@trial-danswer trial-danswer force-pushed the nikg/permission-sync/db-models-migrations branch from 3f1712a to c2188a5 Compare September 11, 2025 22:04
@trial-danswer trial-danswer force-pushed the nikg/permission-sync/db-models-migrations branch from c2188a5 to 4db3185 Compare September 11, 2025 22:27
@trial-danswer trial-danswer force-pushed the nikg/permission-sync/db-models-migrations branch from 4db3185 to 838f1eb Compare September 12, 2025 00:14
@trial-danswer trial-danswer marked this pull request as ready for review September 12, 2025 00:16
@trial-danswer trial-danswer requested a review from a team as a code owner September 12, 2025 00:16
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR introduces database infrastructure for tracking permission synchronization attempts by adding two new tables and supporting code changes. The changes implement:

New Database Tables:

  1. doc_permission_sync_attempt - Tracks document-level permission syncing with fields for total docs synced and permission errors
  2. external_group_permission_sync_attempt - Tracks group membership syncing with fields for users processed, groups processed, and memberships synced

Supporting Code Changes:

  • Adds PermissionSyncStatus enum in backend/onyx/db/enums.py with states: NOT_STARTED, IN_PROGRESS, SUCCESS, CANCELED, FAILED, COMPLETED_WITH_ERRORS
  • Defines corresponding SQLAlchemy models in backend/onyx/db/models.py with proper foreign key relationships to connector_credential_pair
  • Creates Alembic migration script 03d710ccf29c_add_permission_sync_attempt_tables.py for schema changes

Both tables follow the established pattern of the existing IndexAttempt model, including comprehensive indexing for efficient queries by status, time, and connector credential pair. The external_group_permission_sync_attempt table allows nullable connector_credential_pair_id to support global group syncs, while doc_permission_sync_attempt requires a specific connector pair since document permissions are always tied to a source.

This infrastructure enables the system to track sync attempt history and provide visibility to users about permission synchronization operations, supporting the broader permission auto-sync framework referenced in Linear ticket DAN-2494.

Confidence score: 4/5

  • This PR is generally safe to merge but contains a few issues that should be addressed
  • Score reflects well-structured database design with proper patterns but inconsistencies in migration constraints and hardcoded enum values
  • Pay close attention to the migration file for constraint inconsistencies and enum synchronization

3 files reviewed, 2 comments

Edit Code Review Bot Settings | Greptile

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 3 files

Copy link
Contributor

@justin-tahara justin-tahara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small comments

@justin-tahara
Copy link
Contributor

@greptileai

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR introduces a permission sync tracking system by adding two new database tables and supporting infrastructure. The changes include:

  1. New Database Models: Two tables are added to track different types of permission sync attempts:

    • DocPermissionSyncAttempt: Tracks document-level permission syncing attempts, always tied to a specific connector-credential pair (NOT NULL constraint)
    • ExternalGroupPermissionSyncAttempt: Tracks external group membership syncing attempts, which can be global operations (nullable connector-credential pair ID)
  2. Status Management: A new PermissionSyncStatus enum is added to backend/onyx/db/enums.py with six states (NOT_STARTED, IN_PROGRESS, SUCCESS, CANCELED, FAILED, COMPLETED_WITH_ERRORS) and helper methods is_terminal() and is_successful() that mirror the existing IndexingStatus pattern.

  3. Database Schema: The migration creates comprehensive indexing strategies for efficient querying by status, time, and connector-credential pair combinations. Both tables include progress tracking fields (e.g., total_docs_synced, docs_with_permission_errors for document sync; total_users_processed, total_groups_processed for group sync).

  4. SQLAlchemy Models: Corresponding model classes are added to backend/onyx/db/models.py with proper foreign key relationships and helper methods.

This foundation supports the broader permission syncing workflow mentioned in the PR stack, where users will be able to view sync history and errors through future API endpoints. The design follows established patterns in the codebase similar to IndexAttempt tracking.

Confidence score: 4/5

  • This PR is safe to merge with minimal risk as it only adds new database infrastructure without modifying existing functionality
  • Score reflects well-structured database design following established patterns, but minor concerns about enum duplication and design consistency
  • Pay close attention to the migration file for potential enum duplication issues and verify the nullable constraint design is intentional

3 files reviewed, no comments

Edit Code Review Bot Settings | Greptile

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants