
Buffered writes #122


Status: Open, 31 commits to merge into base: master


tharropoulos
Collaborator

TLDR

Add buffer-based operations for Typesense to improve reliability and performance.

Change Summary

  • This change adds a buffering layer to improve the reliability of Typesense operations
  • The buffer stores operations in Firestore, then processes them in batches
  • Failed requests can be retried, and error handling is more consistent
  • Buffering is configurable via extension parameters
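
The flow summarized above can be sketched roughly as follows. This is a minimal in-memory illustration, not the extension's code: the array stands in for the Firestore buffer collection, the functions are synchronous for brevity, and `enqueue`, `flush`, and `sendBatch` are hypothetical names.

```javascript
// Hypothetical in-memory stand-in for the Firestore buffer collection.
const buffer = [];

// Record an operation instead of sending it to Typesense immediately.
function enqueue(type, documentId, data) {
  buffer.push({ type, documentId, data, status: "pending" });
}

// Process up to `batchSize` pending operations with one call to `sendBatch`.
// `sendBatch` is an assumed callback that would talk to Typesense.
function flush(batchSize, sendBatch) {
  const batch = buffer
    .filter((op) => op.status === "pending")
    .slice(0, batchSize);
  if (batch.length === 0) return 0;
  try {
    sendBatch(batch);
    batch.forEach((op) => {
      op.status = "completed";
    });
  } catch (err) {
    // On failure the operations stay "pending" so a later flush retries them.
  }
  return batch.length;
}
```

Calling `flush` on a schedule drains the buffer a batch at a time, so a temporary Typesense outage delays indexing instead of dropping writes.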

Added Features:

  1. Buffer functionality for Typesense operations:

    • Added option to use buffering for Typesense operations with a new config parameter: TYPESENSE_USE_BUFFER
    • Buffer stores operations in Firestore before processing them in batches
    • Implemented automatic retry mechanism for failed operations
  2. New Configuration Parameters in extension.yaml:

    • TYPESENSE_USE_BUFFER: Toggle buffering on/off
    • TYPESENSE_BUFFER_COLLECTION_IN_FIRESTORE: Collection name for buffer storage
    • TYPESENSE_BUFFER_BATCH_SIZE: Number of documents to process in a batch
    • TYPESENSE_BUFFER_MAX_RETRIES: Maximum retry attempts for failed operations
    • TYPESENSE_BUFFER_FLUSH_INTERVAL: Cron-style interval for buffer processing
  3. New files:

    • processBuffer.js: Scheduled function to process buffered operations
    • Added test files: indexOnWriteWithBuffer.spec.js and indexOnWriteSubcollectionWithBuffer.spec.js
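
As an illustration of how these parameters might be consumed, here is a hedged sketch of a config reader. The parameter names come from the list above; `readBufferConfig`, the returned property names, and every default value are assumptions made for this example (the cron string is one assumed way to express the 3-minute default flush interval mentioned in the commit list).

```javascript
// Read buffer settings from an env-like object (for example process.env).
// All defaults here are assumptions for this sketch, not the extension's values.
function readBufferConfig(env) {
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value, 10);
    return Number.isNaN(parsed) ? fallback : parsed;
  };
  return {
    useBuffer: env.TYPESENSE_USE_BUFFER === "true",
    bufferCollection:
      env.TYPESENSE_BUFFER_COLLECTION_IN_FIRESTORE || "typesense_buffer",
    batchSize: toInt(env.TYPESENSE_BUFFER_BATCH_SIZE, 100),
    maxRetries: toInt(env.TYPESENSE_BUFFER_MAX_RETRIES, 3),
    flushInterval: env.TYPESENSE_BUFFER_FLUSH_INTERVAL || "*/3 * * * *",
  };
}
```

With an empty environment the sketch leaves buffering off, so existing installs would keep their real-time behavior unless the toggle is set explicitly.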

Code Changes:

  1. In indexOnWrite.js:

    • Modified to check for buffer setting and route operations accordingly
    • Split functionality into realTimeWrites and bufferedWrites
    • Added handling for path parameters in buffered operations
  2. In processBuffer.js:

    • Implemented scheduled function to process buffer in batches
    • Added retry logic for failed operations
    • Added status tracking (pending → processing → completed/retrying → failed)
  3. In config.js:

    • Added new configuration parameters with defaults
  4. Test Environment Updates:

    • Added support for custom Typesense fields in test environment
    • Created comprehensive test suites for buffered operations
    • Added test environment configurations for buffer testing
  5. Fix: Double URL Encoding:

    • Removed redundant encodeURIComponent calls when accessing Typesense collections after updating typesense-js
    • Updated all test files to match this change
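
To show why the redundant calls had to go, the snippet below demonstrates what double encoding does to a name in general. It uses only the standard `encodeURIComponent`; the collection name is a made-up example, and that the upgraded client encodes names itself is taken from the PR description rather than verified here.

```javascript
// A made-up collection name containing a character that must be escaped.
const collectionName = "users/posts";

// Encoding once produces a valid path segment.
const encodedOnce = encodeURIComponent(collectionName);

// Encoding again escapes the "%" itself, so the name no longer round-trips.
const encodedTwice = encodeURIComponent(encodedOnce);

console.log(encodedOnce); // users%2Fposts
console.log(encodedTwice); // users%252Fposts
console.log(decodeURIComponent(encodedTwice) === collectionName); // false
```

If the caller and the client both encode, the server decodes once and looks up `users%2Fposts` instead of `users/posts`.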

Dependencies:

  1. Updated Typesense client:
    • Bumped from v1.8.2 to v2.1.0-3 for improved functionality

PR Checklist

- Add `typesenseBufferCollectionInFirestore` to specify Firestore collection for buffering
- Add `typesenseBufferBatchSize` to control batch size for buffer processing
- Add `typesenseBufferFlushInterval` with default of 3 minutes for buffer flushing
- add scheduled function to process pending operations in buffer
- handle batch upserts and deletes to typesense collection
- implement retry mechanism for failed operations
- track operation status with firestore updates
- move the function logic into a separate function for invoking on test files
- Typesense will not return the ids if none are deleted
- Add `typesenseFields` parameter to `TestEnvironment` constructor
- Store custom fields in class instance variable
- Use custom fields when creating typesense collection
- Default to wildcard field config when no custom fields provided
- Create test suite to verify typesense buffer processing functionality
- Test successful upsert operations from buffer to typesense
- Test successful delete operations from buffer to typesense
- Implement retry logic tests for failed operations
- Verify proper status transitions (pending → completed/retrying → failed)
- split monolithic function into smaller, focused helper functions
- rename main function from `based` to descriptive `processTypesenseBuffer`
- add jsdoc comments to document function purposes and parameters
- extract common patterns into reusable functions like `markDocumentsAsCompleted`
- improve error handling with consistent status update pattern
- rename filter function from `filterByDelete` to `createDeleteFilter`
- after upgrading to 2.1.0, encoding is happening on typesense's side
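
The status transitions and retry mechanism listed in the commits above could look roughly like this. It is a sketch under assumptions: `nextState`, the `retries` field, and the rule that an operation fails once its retry count reaches the maximum are illustrative choices, not confirmed details of `processBuffer.js`.

```javascript
// Decide the next state of a buffered operation after one processing attempt.
// Field names (`status`, `retries`) are assumptions for this sketch.
function nextState(op, succeeded, maxRetries) {
  if (succeeded) {
    return { ...op, status: "completed" };
  }
  const retries = (op.retries || 0) + 1;
  // Assumed rule: give up once the retry count reaches the configured maximum.
  const status = retries >= maxRetries ? "failed" : "retrying";
  return { ...op, status, retries };
}
```

Keeping the transition in a pure function makes it straightforward to test each path (completed, retrying, failed) without a Firestore emulator.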
@tharropoulos
Collaborator Author

The tests will fail for now because the buffered deletion feature relies on this PR. Once a new tag is released on Docker Hub, I'll update the Typesense version running on CI.

@tharropoulos tharropoulos marked this pull request as draft April 9, 2025 11:34
@tharropoulos tharropoulos requested a review from jasonbosco April 9, 2025 11:35
@tharropoulos tharropoulos marked this pull request as ready for review April 14, 2025 11:39
@tharropoulos
Collaborator Author

@jasonbosco Ready for review! 😄
