Skip to content

Conversation

jessicasingh7
Copy link
Contributor

@jessicasingh7 jessicasingh7 commented Sep 16, 2025

Description

[Provide a brief description of the changes in this PR]

How Has This Been Tested?

[Describe the tests you ran to verify your changes]

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

@jessicasingh7 jessicasingh7 requested a review from a team as a code owner September 16, 2025 20:02
Copy link

vercel bot commented Sep 16, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
internal-search Ready Ready Preview Comment Sep 16, 2025 11:43pm

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR fixes a critical bug in the Slack connector's checkpoint handling that was causing incorrect message retrieval boundaries during incremental indexing. The fix involves two key corrections in the _get_all_doc_ids method:

  1. Line 840: Changed latest = channel_message_ts to oldest = channel_message_ts when loading from checkpoint
  2. Line 858: Changed message_batch[-1]["ts"] to message_batch[0]["ts"] when determining the new checkpoint timestamp

The root cause was a misunderstanding of Slack's API behavior and checkpoint semantics. Slack's API returns messages from newest to oldest, so when resuming from a checkpoint, the stored timestamp represents the oldest message previously processed. The connector should continue from this point by setting oldest = channel_message_ts, not latest. Similarly, when updating the checkpoint for the next run, it should use the timestamp of the first (newest) message in the batch, not the last (oldest).

This bug would have caused either message gaps (missing messages between runs) or duplicates during incremental indexing, which is critical for maintaining search index consistency. The fix aligns the implementation with the documented API behavior and ensures proper checkpoint-based message retrieval.

Confidence score: 5/5

  • This PR is extremely safe to merge with virtually no risk of causing issues
  • Score reflects a simple but critical bug fix with clear logic and minimal scope of change
  • No files require special attention as the changes are straightforward and well-targeted

1 file reviewed, no comments

Edit Code Review Bot Settings | Greptile

@jessicasingh7 jessicasingh7 changed the title bug(slack): swapped checkpoint index fix(slack): swapped checkpoint index Sep 16, 2025
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 1 file

@Weves Weves merged commit 24831fa into main Sep 18, 2025
53 of 55 checks passed
@Weves Weves deleted the jessica/slack-search-index-bug branch September 18, 2025 18:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants