Skip to content

Conversation

suvodhoy
Copy link
Contributor

@suvodhoy suvodhoy commented Jun 9, 2025

Description

The Discourse connector was incorrectly breaking after processing the first batch of topics (16) even though the API returns 30 topics per page. This caused the connector to miss processing the remaining topics from each page.

How Has This Been Tested?

Noticed a discrepancy in the document count indexed when trying to sync my own discourse setup. I tested by making the changes and building the background service locally and running the sync again.

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

@suvodhoy suvodhoy requested a review from a team as a code owner June 9, 2025 17:56
Copy link

vercel bot commented Jun 9, 2025

@suvodhoy is attempting to deploy a commit to the Danswer Team on Vercel.

A member of the Team first needs to authorize it.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Summary

Fixed a critical performance bug in the Discourse connector where document indexing was prematurely stopping after the first batch, causing incomplete topic synchronization.

  • Bug fix in backend/onyx/connectors/discourse/connector.py removes incorrect break statement that limited topic processing to first batch size (16) instead of full API page size (30)
  • Improves separation of concerns between topic collection and document batching logic
  • Ensures complete processing of all available topics from each API page
  • Aligns with principle to fail loudly by fixing silent data loss issue

1 file reviewed, no comments
Edit PR Review Bot Settings | Greptile

@suvodhoy suvodhoy force-pushed the bugfix/discourse-topics-batch branch from 0419a42 to c4df090 Compare June 9, 2025 18:48
@Weves Weves merged commit 6dff3d4 into onyx-dot-app:main Jun 9, 2025
1 check failed
@Weves
Copy link
Contributor

Weves commented Jun 9, 2025

Thanks for this fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants