Skip to content

Conversation

Subash-Mohan
Copy link
Contributor

@Subash-Mohan Subash-Mohan commented Aug 9, 2025

Description

This pull request updates the logic in the get_user_files_as_user function to improve how user file access is validated. Instead of raising an error when a file does not belong to the user, it now filters out files not owned by the user and only returns those the user has access to.

Linear issue -> https://linear.app/danswer/issue/DAN-2201/trying-to-chat-with-recent-documents-fails

How Has This Been Tested?

Tested from UI

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

Summary by cubic

Updated file access logic to ensure users only see files they own, filtering out files from other users.

@Subash-Mohan Subash-Mohan requested a review from a team as a code owner August 9, 2025 07:17
Copy link

vercel bot commented Aug 9, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
internal-search ✅ Ready (Inspect) Visit Preview 💬 Add feedback Aug 10, 2025 8:39am

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR modifies the get_user_files_as_user function in backend/onyx/file_store/utils.py to change how user file access validation is handled. Previously, the function would raise a ValueError when a user attempted to access files they didn't own. The new implementation silently filters out unauthorized files and returns only those owned by the requesting user.

The change replaces error-throwing behavior with a filtering approach: instead of validating access permissions upfront and failing fast, the function now iterates through all retrieved files and only includes those where user_file.user_id == user_id in the returned list. This addresses a specific issue where the chat system was failing when users tried to access recent documents that included files from other users.

This function is part of the file store utilities layer and is likely called by chat-related endpoints when users request access to files. The modification represents a shift from strict access control enforcement to a more permissive approach that prioritizes system stability over immediate error feedback.

Confidence score: 2/5

  • This PR introduces a potentially problematic change that silently masks authorization failures instead of properly handling them
  • Score reflects concerns about security implications and the difficulty of debugging access control issues when errors are silently suppressed
  • Pay close attention to backend/onyx/file_store/utils.py and consider if this change aligns with security requirements

1 file reviewed, 2 comments

Edit Code Review Bot Settings | Greptile

Comment on lines 247 to 254
current_user_files = []
for user_file in user_files:
# Note: if user_id is None, then all files should be None as well
# (since auth must be disabled in this case)
if user_file.user_id != user_id:
raise ValueError(
f"User {user_id} does not have access to file {user_file.id}"
)
return user_files
if user_file.user_id == user_id:
current_user_files.append(user_file)

return current_user_files
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Silent filtering of unauthorized files could mask security issues. The original error-raising behavior provided explicit feedback when access control violations occurred. Consider logging when files are filtered out for security auditing.

Comment on lines 249 to 250
# Note: if user_id is None, then all files should be None as well
# (since auth must be disabled in this case)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: The comment about user_id being None when auth is disabled suggests this function handles both authenticated and unauthenticated scenarios, but the filtering logic doesn't account for this properly - it would filter out all files when user_id is None.

Suggested change
# Note: if user_id is None, then all files should be None as well
# (since auth must be disabled in this case)
current_user_files = []
for user_file in user_files:
# Note: if user_id is None, then all files should be None as well
# (since auth must be disabled in this case)
if user_file.user_id == user_id:
current_user_files.append(user_file)
return current_user_files

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cubic analysis

No issues found across 1 file. Review in cubic

@Weves Weves added this pull request to the merge queue Aug 10, 2025
Merged via the queue into main with commit 8cec344 Aug 10, 2025
15 checks passed
@Weves Weves deleted the fix-recent-document-chat-fails branch August 10, 2025 20:08
wenxi-onyx pushed a commit that referenced this pull request Aug 11, 2025
* fix: restrict user file access to current user only

* fix: enhance user file access control for recent folder
AnkitTukatek pushed a commit to TukaTek/onyx that referenced this pull request Sep 23, 2025
* fix: restrict user file access to current user only

* fix: enhance user file access control for recent folder
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants