@wenxi-onyx wenxi-onyx commented Sep 8, 2025

Description

  • Update the seeded docs to the new docs landing pages
  • The connector is set to recurse through the entire docs site, so if it is resumed it will index 200+ docs

How Has This Been Tested?

[Describe the tests you ran to verify your changes]

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes; otherwise, resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

@wenxi-onyx wenxi-onyx requested a review from a team as a code owner September 8, 2025 21:32

vercel bot commented Sep 8, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project          Deployment  Preview  Comments  Updated (UTC)
internal-search  Ready       Preview  Comment   Sep 8, 2025 9:46pm

@greptile-apps greptile-apps bot left a comment

Greptile Summary

This PR updates Onyx's seeded documentation system from use-case-focused content to feature-focused technical documentation. The changes span multiple areas:

Documentation Structure Overhaul: The seeded content has been completely replaced, moving from 7 business-oriented use case documents to 19 comprehensive technical documents covering core features like Actions, Agents, Chat, Code Interpreter, Connectors, and various interfaces. This shift reflects Onyx's evolution from a simple search tool to a full AI platform.

Connector Configuration Update: The base URL for the web connector has been changed from https://docs.onyx.app/more/use_cases to https://docs.onyx.app/, enabling recursive crawling of the entire documentation site. This means the connector will now index 200+ documents instead of just the use cases section.
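To make the scope change concrete, here is a minimal sketch of how a base URL can bound a recursive crawl. It is not the Onyx web connector's actual logic: the function name `in_crawl_scope` and the example page paths are hypothetical, and only the two base URLs come from this PR.

```python
from urllib.parse import urlparse


def in_crawl_scope(url: str, base_url: str) -> bool:
    """Return True if url is on the base URL's host and under its path.

    Hypothetical helper for illustration; a real crawler may apply
    additional rules (robots.txt, depth limits, file-type filters).
    """
    target, base = urlparse(url), urlparse(base_url)
    if target.netloc != base.netloc:
        return False
    base_path = base.path.rstrip("/")
    return target.path == base_path or target.path.startswith(base_path + "/")


OLD_BASE = "https://docs.onyx.app/more/use_cases"
NEW_BASE = "https://docs.onyx.app/"

# Hypothetical page paths, used only to show the difference in scope.
pages = [
    "https://docs.onyx.app/more/use_cases/example",
    "https://docs.onyx.app/welcome",
    "https://example.com/welcome",
]

old_scope = [p for p in pages if in_crawl_scope(p, OLD_BASE)]
new_scope = [p for p in pages if in_crawl_scope(p, NEW_BASE)]
```

Under these assumptions, the old base admits only pages below `/more/use_cases`, while the new base admits every page on the docs host, which is why the indexed document count grows.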

Infrastructure Modernization: The document seeding preparation script has been refactored from hardcoded content to a data-driven approach using initial_docs_cohere.json as input. This improves maintainability by separating content from code and supports different embedding models.
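As an illustration of the data-driven approach, the sketch below loads seed documents from a JSON file and validates each record before use. The schema is an assumption: the field names `title`, `url`, and `content`, the `SeedDoc` type, and both function names are hypothetical, and the real layout of initial_docs_cohere.json may differ.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SeedDoc:
    # Hypothetical fields; the real file's schema is not shown in this PR.
    title: str
    url: str
    content: str


REQUIRED_FIELDS = ("title", "url", "content")


def parse_seed_docs(raw_json: str) -> list[SeedDoc]:
    """Parse a JSON array of document records, rejecting malformed input."""
    records = json.loads(raw_json)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of document records")
    docs = []
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            raise ValueError(f"record {index} is not a JSON object")
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            raise ValueError(f"record {index} is missing fields: {missing}")
        docs.append(SeedDoc(*(record[field] for field in REQUIRED_FIELDS)))
    return docs


def load_seed_docs(path: str) -> list[SeedDoc]:
    """Read seed documents from a JSON file such as initial_docs_cohere.json."""
    with open(path, encoding="utf-8") as handle:
        return parse_seed_docs(handle.read())
```

Keeping the parsing step separate from file access makes the validation easy to test, and failing fast on a malformed record is one way to address the input-validation concern raised in this review.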

Cleanup of Legacy References: Outdated telemetry documentation URL comments have been removed from Docker Compose files and Helm values, and regression test ground truth has been updated to point to the new welcome page structure.

The changes integrate well with the existing codebase architecture, maintaining the same seeding interfaces while modernizing the content delivery approach. The shift to comprehensive documentation seeding aligns with providing users immediate access to complete platform documentation rather than just introductory use cases.

Confidence score: 4/5

  • This PR is generally safe to merge with some considerations around the expanded indexing scope
  • Score reflects solid structural improvements but potential risks from significantly expanded connector scope and reduced input validation
  • Pay close attention to backend/onyx/seeding/load_docs.py and backend/scripts/document_seeding_prep.py for scope and error handling implications

8 files reviewed, no comments

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 9 files

@wenxi-onyx wenxi-onyx merged commit d248d2f into main Sep 9, 2025
14 of 16 checks passed
@wenxi-onyx wenxi-onyx deleted the whuang/update-seeded-docs branch September 9, 2025 01:06
AnkitTukatek pushed a commit to TukaTek/onyx that referenced this pull request Sep 23, 2025