refactor: update seeded docs #5364
Greptile Summary
This PR comprehensively updates Onyx's seeded documentation system from use-case focused content to feature-focused technical documentation. The changes span multiple areas:
Documentation Structure Overhaul: The seeded content has been completely replaced, moving from 7 business-oriented use case documents to 19 comprehensive technical documents covering core features like Actions, Agents, Chat, Code Interpreter, Connectors, and various interfaces. This shift reflects Onyx's evolution from a simple search tool to a full AI platform.
Connector Configuration Update: The base URL for the web connector has been changed from https://docs.onyx.app/more/use_cases to https://docs.onyx.app/, enabling recursive crawling of the entire documentation site. This means the connector will now index 200+ documents instead of just the use cases section.
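To illustrate why the base URL change widens the indexing scope, here is a minimal sketch of a prefix-based scope check of the kind a recursive web crawler might apply. It is not taken from the Onyx web connector: the function name `in_crawl_scope` and the example page URL are hypothetical, and the real connector may decide scope differently.

```python
from urllib.parse import urlparse


def in_crawl_scope(url: str, base_url: str) -> bool:
    """Return True if url is on the same host as base_url and at or below its path.

    Hypothetical helper for illustration only; not the Onyx connector's logic.
    """
    target, base = urlparse(url), urlparse(base_url)
    if target.netloc != base.netloc:
        return False
    base_path = base.path.rstrip("/")
    # In scope when the path equals the base path or sits beneath it.
    return target.path == base_path or target.path.startswith(base_path + "/")


OLD_BASE = "https://docs.onyx.app/more/use_cases"
NEW_BASE = "https://docs.onyx.app/"
PAGE = "https://docs.onyx.app/connectors/overview"  # hypothetical page URL

old_scope = in_crawl_scope(PAGE, OLD_BASE)  # page lies outside /more/use_cases
new_scope = in_crawl_scope(PAGE, NEW_BASE)  # page lies under the site root
```

Under this assumption, any page on the docs host falls in scope once the base is the site root, which is consistent with the jump to 200+ indexed documents noted above.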
Infrastructure Modernization: The document seeding preparation script has been refactored from hardcoded content to a data-driven approach using initial_docs_cohere.json as input. This improves maintainability by separating content from code and supports different embedding models.
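A data-driven seeding step along these lines could be sketched as follows. This is an assumption-laden illustration, not the code in document_seeding_prep.py: the `SeedDoc` type, the `parse_seed_docs` function, and the `url`/`title`/`chunks` keys are all hypothetical, since the PR summary does not describe the schema of initial_docs_cohere.json. The sketch also shows one way to keep basic input validation when content moves out of code.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SeedDoc:
    # Hypothetical fields; the real seed file's schema is not shown in the PR.
    url: str
    title: str
    chunks: List[str]


REQUIRED_KEYS = ("url", "title", "chunks")  # assumed schema, for illustration


def parse_seed_docs(raw_json: str) -> List[SeedDoc]:
    """Parse seed documents from a JSON string, failing loudly on bad entries."""
    entries = json.loads(raw_json)
    if not isinstance(entries, list):
        raise ValueError("seed file must contain a JSON list")
    docs = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {index} is not a JSON object")
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"entry {index} is missing keys: {missing}")
        docs.append(
            SeedDoc(url=entry["url"], title=entry["title"], chunks=list(entry["chunks"]))
        )
    return docs


sample = json.dumps(
    [{"url": "https://docs.onyx.app/", "title": "Welcome", "chunks": ["intro text"]}]
)
seed_docs = parse_seed_docs(sample)
```

Because the parser takes a string, the same function can be fed the contents of any seed file, for example one produced for a different embedding model.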
Cleanup of Legacy References: Outdated telemetry documentation URL comments have been removed from Docker Compose files and Helm values, and regression test ground truth has been updated to point to the new welcome page structure.
The changes integrate well with the existing codebase architecture, maintaining the same seeding interfaces while modernizing the content delivery approach. The shift to comprehensive documentation seeding aligns with providing users immediate access to complete platform documentation rather than just introductory use cases.
Confidence score: 4/5
- This PR is generally safe to merge with some considerations around the expanded indexing scope
- Score reflects solid structural improvements but potential risks from significantly expanded connector scope and reduced input validation
- Pay close attention to backend/onyx/seeding/load_docs.py and backend/scripts/document_seeding_prep.py for scope and error handling implications
8 files reviewed, no comments
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
1 issue found across 9 files
Description
How Has This Been Tested?
[Describe the tests you ran to verify your changes]
Backporting (check the box to trigger backport action)
Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.