Skip to content

Conversation

Weves
Copy link
Contributor

@Weves Weves commented Aug 29, 2025

Description

^

How Has This Been Tested?

Tested locally

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

Summary by cubic

Shifted the index attempts API to use cc_pair_id instead of connector_id, added a direct access check, and simplified the frontend. This fixes mixed counts/pagination when a connector has multiple CC pairs and removes redundant requests.

  • Bug Fixes

    • Count and paginate attempts by cc_pair_id to avoid mixing data across CC pairs.
    • Enforce user access to the CC pair before returning attempts.
  • Refactors

    • Added verify_user_has_access_to_cc_pair for lightweight permission checks.
    • Simplified DB queries (no join/eager loads) and updated server handlers to use cc_pair_id.
    • Renamed IndexingAttemptsTable to IndexAttemptsTable and removed the extra “most recent attempts” fetch.

@Weves Weves requested a review from a team as a code owner August 29, 2025 21:53
Copy link

vercel bot commented Aug 29, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
internal-search Ready Ready Preview Comment Aug 29, 2025 10:07pm

user: User,
get_editable: bool = True,
) -> bool:
stmt = select(ConnectorCredentialPair.id)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fetch less info than before (likely trivial improvement)

select(IndexAttempt)
.join(ConnectorCredentialPair)
.where(ConnectorCredentialPair.connector_id == connector_id)
stmt = select(IndexAttempt).where(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid unnecessary join

select(IndexAttempt)
.join(ConnectorCredentialPair)
.where(ConnectorCredentialPair.connector_id == connector_id)
stmt = select(IndexAttempt).where(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid unnecessary join

stmt = stmt.offset(page * page_size).limit(page_size)
stmt = stmt.options(
contains_eager(IndexAttempt.connector_credential_pair),
joinedload(IndexAttempt.error_rows),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't need this stuff


// need to always have the most recent index attempts around
// so just kick off a separate fetch
const {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not used

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR improves the index attempts API by refactoring the pagination system to work directly with connector-credential pair IDs (cc_pair_id) rather than going through connector IDs. The changes span both backend and frontend components to create a more direct and efficient data flow.

On the backend, the core improvement involves updating database functions in backend/onyx/db/index_attempt.py to use cc_pair_id directly instead of requiring joins with the ConnectorCredentialPair table. The count_index_attempts_for_connector function was renamed to count_index_attempts_for_cc_pair and get_paginated_index_attempts_for_cc_pair_id was updated to filter directly on the connector_credential_pair_id field. This eliminates unnecessary table joins and improves query performance.

The API endpoints in backend/onyx/server/documents/cc_pair.py were simplified by introducing a new verify_user_has_access_to_cc_pair function that provides lightweight permission checking without fetching full objects. This replaces the previous approach of using get_connector_credential_pair_from_id_for_user which was overkill for simple access verification.

On the frontend, the changes remove duplicate data fetching in web/src/app/admin/connector/[ccPairId]/page.tsx where two separate usePaginatedFetch hooks were making redundant API calls for the same index attempts data. The component structure was simplified to use a single pagination hook, and the IndexingAttemptsTable component was renamed to IndexAttemptsTable for naming consistency across the codebase.

These changes align with the repository's principles of avoiding duplicate code, preferring simplicity where there's no clear gain from complexity, and maintaining consistency. The refactoring creates a more direct data flow that reduces database queries and API calls while maintaining all existing functionality.

Confidence score: 4/5

  • This PR is safe to merge with minimal risk as it primarily optimizes existing functionality without changing core behavior
  • Score reflects straightforward refactoring that improves performance and code organization while maintaining existing functionality
  • Pay close attention to the database query changes in backend/onyx/db/index_attempt.py to ensure the simplified queries return the same results

Context used:

Rule - When implementing pagination parameters in backend APIs that will be consumed by the frontend's usePaginatedFetch hook, the parameter must be named page_num instead of page. (link)

5 files reviewed, 1 comment

Edit Code Review Bot Settings | Greptile

Comment on lines +284 to +288
stmt = select(ConnectorCredentialPair.id)
stmt = _add_user_filters(stmt, user, get_editable)
stmt = stmt.where(ConnectorCredentialPair.id == cc_pair_id)
result = db_session.execute(stmt)
return result.scalars().first() is not None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Consider using .scalar() instead of .scalars().first() since we're selecting a single column and only need one result

@Weves Weves changed the title Improve index attempts API fix: improve index attempts API Aug 29, 2025
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 5 files

React with 👍 or 👎 to teach cubic. You can also tag @cubic-dev-ai to give feedback, ask questions, or re-run the review.

get_editable: bool = True,
) -> bool:
stmt = select(ConnectorCredentialPair.id)
stmt = _add_user_filters(stmt, user, get_editable)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit access granted to non-curator group members via unguarded get_editable call.

Prompt for AI agents
Address the following comment on backend/onyx/db/connector_credential_pair.py at line 285:

<comment>Edit access granted to non-curator group members via unguarded get_editable call.</comment>

<file context>
@@ -275,6 +275,19 @@ def get_connector_credential_pair_from_id_for_user(
+    get_editable: bool = True,
+) -&gt; bool:
+    stmt = select(ConnectorCredentialPair.id)
+    stmt = _add_user_filters(stmt, user, get_editable)
+    stmt = stmt.where(ConnectorCredentialPair.id == cc_pair_id)
+    result = db_session.execute(stmt)
</file context>

def verify_user_has_access_to_cc_pair(
cc_pair_id: int,
db_session: Session,
user: User,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type annotation excludes None; inconsistent with helpers that accept Optional[User] and may prevent use in anonymous/disabled-auth flows.

Prompt for AI agents
Address the following comment on backend/onyx/db/connector_credential_pair.py at line 281:

<comment>Type annotation excludes None; inconsistent with helpers that accept Optional[User] and may prevent use in anonymous/disabled-auth flows.</comment>

<file context>
@@ -275,6 +275,19 @@ def get_connector_credential_pair_from_id_for_user(
+def verify_user_has_access_to_cc_pair(
+    cc_pair_id: int,
+    db_session: Session,
+    user: User,
+    get_editable: bool = True,
+) -&gt; bool:
</file context>
Suggested change
user: User,
user: User | None,

@Weves Weves enabled auto-merge August 29, 2025 22:19
@Weves Weves disabled auto-merge August 29, 2025 23:11
@Weves Weves merged commit 9ece3b0 into main Aug 29, 2025
16 of 19 checks passed
@Weves Weves deleted the improve-index-page-pagination branch August 29, 2025 23:15
AnkitTukatek pushed a commit to TukaTek/onyx that referenced this pull request Sep 23, 2025
* Improve index attempts API

* Fix import
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants