-
Notifications
You must be signed in to change notification settings - Fork 2.1k
feat: make sharepoint documents and sharepoint pages optional #5183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: make sharepoint documents and sharepoint pages optional #5183
Conversation
Someone is attempting to deploy a commit to the Danswer Team on Vercel. A member of the Team first needs to authorize it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Summary
This PR enhances the SharePoint connector to provide granular control over content indexing by adding a new include_site_documents
parameter alongside the existing include_site_pages
parameter. Previously, the SharePoint connector would always index documents from SharePoint document libraries, with only site pages being optional. Now users can independently control whether to index SharePoint documents and/or SharePoint site pages (.aspx files).
The implementation adds the include_site_documents: bool = True
parameter to the SharepointConnector
constructor, maintaining backward compatibility by defaulting to True
. The document processing logic in _fetch_from_sharepoint()
is now wrapped in a conditional check (if self.include_site_documents:
) that controls whether drive items are fetched and processed into documents. This follows the same pattern already established for site pages with the existing include_site_pages
parameter.
The frontend changes add corresponding UI elements in the connector configuration form, providing two checkboxes for "Index Documents" and "Index ASPX Sites" in an advanced configuration section. Both options default to enabled, preserving existing behavior while giving users the flexibility to selectively index content types based on their specific needs.
The test suite is updated to explicitly specify both parameters in SharePoint connector instantiation, ensuring tests continue to work correctly with the expanded API. This change integrates well with the existing connector architecture and maintains the established patterns for optional content selection in Onyx connectors.
Confidence score: 5/5
- This PR is safe to merge with minimal risk as it adds optional functionality with sensible defaults
- Score reflects simple, well-contained changes that follow existing patterns and maintain backward compatibility
- No files require special attention as the changes are straightforward and properly tested
3 files reviewed, no comments
Hey @wenxi-onyx - Looks good from my side. Originally it were just one commit with 3 files changed. |
cbff29a
to
29f4d89
Compare
Hey @nsklei, would you mind fixing one last issue I found in this connector (not caused by you)? Could you please add
At the start of _fetch_site_pages() Thank you so much!! |
52c5757
to
6d80fa2
Compare
…ot-app#5183) * feat: make sharepoint documents and sharepoint pages optional * fix: address review feedback for PR onyx-dot-app#5183 * fix: exclude personal sites from sharepoint connector --------- Co-authored-by: Nils Kleinrahm <nils.kleinrahm@pledoc.de>
The Feature allows users to choose whether they want to index only SharePoint pages, only SharePoint documents, or both.
The PR adds the backend logic to the sharepoint connector.py and the necessary UI elements to the connectors.tsx.
How Has This Been Tested?
include_site_pages: bool = True, include_site_documents: bool = True
include_site_pages: bool = True, include_site_documents: bool = False
include_site_pages: bool = False, include_site_documents: bool = True## Description
I tested the UI elements and its functionality manually by building the images locally and starting the application.
I created three Sharepoint connectors with permutation of all three options (see below) and tested them in an assistant.
Backporting (check the box to trigger backport action)
Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.
Summary by cubic
Added options to let users choose whether to index SharePoint documents, SharePoint pages, or both when setting up a connector.