Skip to content

Conversation

evan-onyx
Copy link
Contributor

@evan-onyx evan-onyx commented Apr 15, 2025

Description

Addresses https://linear.app/danswer/issue/DAN-1817/logging-for-cases-where-user-can-see-but-not-download-from-drive

Extra logs for drive plus a small refactor. We've seen some cases where a user can list a file (i.e. they see it exists) but get a 403 when trying to download it. This logging will bring extra clarity about which users this is happening for and help us debug for a longer-term fix.

How Has This Been Tested?

n/a, will rely on connector tests for this since the change is relatively minor

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

@evan-onyx evan-onyx requested a review from a team as a code owner April 15, 2025 17:29
Copy link

vercel bot commented Apr 15, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
internal-search ✅ Ready (Inspect) Visit Preview 💬 Add feedback Apr 15, 2025 6:16pm

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Summary

This PR enhances logging and error handling for Google Drive permission-related issues, particularly focusing on the 403 error scenario where users can see but not download files.

  • Added debug logging in backend/onyx/connectors/google_drive/doc_conversion.py to capture 403 errors with retriever_email context
  • Refactored service creation in doc_conversion.py to use lazy evaluation for better resource management
  • Enhanced error messages in backend/onyx/connectors/google_drive/connector.py to include retrieval stage, user email, and parent drive/folder ID
  • Removed unused imports and functions to improve code cleanliness
  • Standardized web view link handling with new WEB_VIEW_LINK_KEY constant

2 file(s) reviewed, 2 comment(s)
Edit PR Review Bot Settings | Greptile

Main entry point for converting a Google Drive file => Document object.
"""
doc_id = ""
doc_id = file.get(WEB_VIEW_LINK_KEY, "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: doc_id is overwritten on line 376, making this initial assignment redundant

Suggested change
doc_id = file.get(WEB_VIEW_LINK_KEY, "")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is greptile right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately not (the assignment on 376 is inside the try-except, so this assignment is used in cases where we error out before then). Good try Greptile 🥲

Comment on lines 396 to 402
if isinstance(e, HttpError) and e.status_code == 403:
logger.debug(
f"Uncommon permissions error while downloading file. User "
f"{retriever_email} was able to see file {file_name} "
"but cannot download it."
)
logger.debug(error_str)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: using debug level for permission errors might cause important troubleshooting info to be missed - consider using warning level instead

file_name = file.get("name")
error_str = f"Error converting file '{file_name}' to Document: {e}"
if isinstance(e, HttpError) and e.status_code == 403:
logger.debug(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be warning?

Main entry point for converting a Google Drive file => Document object.
"""
doc_id = ""
doc_id = file.get(WEB_VIEW_LINK_KEY, "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is greptile right?

@evan-onyx evan-onyx enabled auto-merge April 15, 2025 18:12
@evan-onyx evan-onyx added this pull request to the merge queue Apr 15, 2025
Merged via the queue into main with commit a8cba7a Apr 15, 2025
10 of 12 checks passed
@evan-onyx evan-onyx deleted the drive-permission-bug-logging branch April 15, 2025 19:37
aronszanto pushed a commit to aronszanto/onyx that referenced this pull request Apr 26, 2025
* extra logging for uncommon permissions cases

* address CW comments
AnkitTukatek pushed a commit to TukaTek/onyx that referenced this pull request Sep 23, 2025
* extra logging for uncommon permissions cases

* address CW comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants