@wenxi-onyx wenxi-onyx commented Jul 3, 2025

Description

Original: process Google Docs --> check file size and reject oversized files --> process non-Google Docs

Fix: check file size and reject oversized files --> process Google Docs --> process non-Google Docs

How Has This Been Tested?

[Describe the tests you ran to verify your changes]

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

@wenxi-onyx wenxi-onyx requested a review from a team as a code owner July 3, 2025 18:44

vercel bot commented Jul 3, 2025

The latest updates on your projects.

Name | Status | Updated (UTC)
internal-search | ✅ Ready | Jul 3, 2025 9:12pm

Contributor

@greptile-apps greptile-apps bot left a comment

PR Summary

Optimized Google Drive file processing by reordering validation checks for better efficiency.

  • Moved file size validation to run before any content processing attempts in backend/onyx/connectors/google_drive/doc_conversion.py to avoid unnecessary processing of files that would be rejected
  • Restructured processing flow to: check size limits → process Google Docs → handle non-Google Doc files
  • Improves resource efficiency by failing fast on size-invalid files before attempting expensive document parsing
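The reordered flow described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in `doc_conversion.py`: `DriveFile`, `convert_file`, `MAX_FILE_SIZE_BYTES`, and the 10 MB threshold are hypothetical names and values chosen for the example, and the returned strings stand in for the real export and download steps. It also assumes the size may be missing for native Google files, in which case the size check is skipped.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical limit; the actual threshold is not stated in this PR.
MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024

# MIME type prefix shared by native Google Workspace files (Docs, Sheets, Slides).
GOOGLE_APPS_MIME_PREFIX = "application/vnd.google-apps."


@dataclass
class DriveFile:
    """Hypothetical stand-in for the Drive file metadata the connector receives."""
    name: str
    mime_type: str
    size_bytes: Optional[int]  # assumed to be None when no size is reported


def convert_file(file: DriveFile) -> Optional[str]:
    """Return extracted text, or None if the file is rejected for its size."""
    # Step 1: fail fast on oversized files before any expensive parsing.
    if file.size_bytes is not None and file.size_bytes > MAX_FILE_SIZE_BYTES:
        return None

    # Step 2: native Google files go through an export path (placeholder here).
    if file.mime_type.startswith(GOOGLE_APPS_MIME_PREFIX):
        return f"exported:{file.name}"

    # Step 3: everything else is downloaded and parsed (placeholder here).
    return f"downloaded:{file.name}"
```

With this ordering, an oversized file returns at step 1 and never reaches either processing branch, which is the fail-fast behavior the summary describes.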

1 file reviewed, no comments

@wenxi-onyx wenxi-onyx added this pull request to the merge queue Jul 10, 2025
Merged via the queue into main with commit 9bd5a1d Jul 10, 2025
14 of 15 checks passed
@wenxi-onyx wenxi-onyx deleted the nit_drive_connector_clarity branch July 10, 2025 02:09
AnkitTukatek pushed a commit to TukaTek/onyx that referenced this pull request Sep 23, 2025
* check file size first and clarify processing logic

* basic gdrive extraction clarity

* typo

---------

Co-authored-by: Wenxi Onyx <wenxi-onyx@Wenxis-MacBook-Pro.local>