feat: add FAISS vector database support for local-only deployments #41
Draft
amondnet wants to merge 6 commits into main from pleaseai/issue-13-20251012-1535
Implements FAISS as a zero-configuration, file-based vector database option.
This provides a local alternative to Milvus and Qdrant, perfect for development
and small-to-medium codebases without requiring external infrastructure.
Features:
- File-based persistence (~/.context/faiss-indexes/)
- Hybrid search with dense (FAISS) + sparse (BM25) vectors
- RRF (Reciprocal Rank Fusion) reranking
- Auto-selection when no external DB configured
- Full VectorDatabase interface implementation
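As a rough illustration of the RRF step listed above, the sketch below fuses a dense ranking and a sparse ranking by summing 1 / (k + rank) per document. The function name `rrfFuse`, the sample IDs, and the constant k = 60 are assumptions for illustration; the PR does not state which constant or data shapes it uses.

```typescript
// Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank),
// where rank is 1-based. k = 60 is a commonly used default (assumed here).
function rrfFuse(rankings: string[][], k = 60): Array<{ id: string, score: number }> {
  const scores = new Map<string, number>()
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank))
    })
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score)
}

// Hypothetical result lists: IDs ordered best-first by each retriever.
const denseRanking = ['a', 'b', 'c']
const sparseRanking = ['b', 'd', 'a']
const fusedRanking = rrfFuse([denseRanking, sparseRanking])
console.log(fusedRanking.map(r => r.id)) // [ 'b', 'a', 'd', 'c' ]
```

Because only ranks are used, the dense L2 distances and BM25 scores never need to be normalized onto a common scale.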
Implementation:
- Created FaissVectorDatabase class with IndexFlatL2
- Added FAISS_LOCAL to VectorDatabaseType enum
- Integrated with VectorDatabaseFactory
- Added auto-selection logic in MCP server
- Updated documentation with FAISS quick start
Storage structure:
~/.context/faiss-indexes/{collection}/
├── dense.index # FAISS index file
├── sparse.json # BM25 model
├── metadata.json # Collection metadata
└── documents.json # Document metadata
Limitations:
- Memory-bound (entire index loads into RAM)
- Single-process file access
- Suitable for ~100K files / 1M vectors
For larger codebases, Milvus or Qdrant are recommended.
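To make the memory-bound limitation concrete, here is a back-of-the-envelope estimate for a flat float32 index, which stores every vector uncompressed. The embedding dimension of 1536 is an assumption (the PR does not state one), and the estimate ignores the BM25 model and document metadata.

```typescript
// Rough RAM estimate for a flat (uncompressed) float32 index:
// bytes ≈ vectorCount × dimension × 4.
function flatIndexBytes(vectorCount: number, dimension: number): number {
  const BYTES_PER_FLOAT32 = 4
  return vectorCount * dimension * BYTES_PER_FLOAT32
}

// 1M vectors at an assumed dimension of 1536.
const estimatedBytes = flatIndexBytes(1_000_000, 1536)
const estimatedGiB = estimatedBytes / 1024 ** 3
console.log(estimatedBytes, estimatedGiB.toFixed(2)) // 6144000000 '5.72'
```

Under these assumptions the dense index alone needs roughly 5.7 GiB of RAM at the 1M-vector mark, which is consistent with recommending an external database beyond that size.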
Closes #13
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Minsu Lee <amondnet@users.noreply.github.com>
Resolved conflicts in:
- packages/core/src/vectordb/factory.ts: Merged FAISS support with style changes from main
- packages/core/src/vectordb/index.ts: Added FAISS exports with proper ordering
- packages/mcp/src/index.ts: Integrated FAISS auto-selection logic with main's formatting
Conflict resolution details:
- Kept FAISS_LOCAL enum value and FaissVectorDatabase integration
- Applied @antfu/eslint-config formatting rules from main (type imports, spacing)
- Fixed FAISS add() method to call one vector at a time (faiss-node API requirement)
- Updated vectorDbType union to include 'faiss' and 'faiss-local' options
- Applied linting fixes to maintain code quality standards
All tests pass and linting errors resolved.
Critical fixes from PR review:
1. Fix delete() method
   - Throw NotImplementedError with clear guidance instead of silently failing
   - FAISS IndexFlatL2 does not support vector deletion
2. Add comprehensive error handling to file operations
   - initialize(): handle EACCES, ENOSPC, ENOENT with specific messages
   - loadCollection(): wrap each file read with try-catch
   - saveCollection(): wrap all write operations with error handling
3. Fix BM25 deserialization
   - Use SimpleBM25.fromJSON() API instead of unsafe reflection accessing private properties
4. Update query() JSDoc
   - Document filter parameter limitation, with a runtime warning when filters are provided
5. Add test coverage
   - Create faiss-vectordb.test.ts with 11 tests covering initialization, CRUD operations, persistence, errors, and hybrid search functionality
All critical issues from PR review resolved.
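The error handling described for initialize() could look roughly like the sketch below, which maps Node.js file-system error codes to specific messages. The helper name `describeFsError`, the `ErrnoLike` type, and the message wording are hypothetical and not taken from the PR; only the error codes themselves come from the commit message.

```typescript
// Hypothetical helper: turn a Node.js file-system error code into a clear message.
interface ErrnoLike { code?: string, message: string }

function describeFsError(err: ErrnoLike, path: string): string {
  switch (err.code) {
    case 'EACCES':
      return `Permission denied for ${path}. Check directory permissions.`
    case 'ENOSPC':
      return `No space left on device while writing ${path}.`
    case 'ENOENT':
      return `Path not found: ${path}.`
    default:
      // Unknown codes keep the original message so nothing is hidden.
      return `File operation failed for ${path}: ${err.message}`
  }
}

console.log(describeFsError({ code: 'ENOSPC', message: 'write failed' }, 'dense.index'))
```

A caller would typically wrap each read or write in try-catch and rethrow an Error built from this message.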
Bug fixes for FAISS vector database implementation:
1. Fix storageDir initialization timing issue
   - Changed from instance field to getter accessing config
   - Resolves undefined storageDir during BaseVectorDatabase constructor
   - Config now properly set with defaults before super() call
2. Fix BM25 serialization for empty collections
   - Remove check preventing untrained model serialization
   - Allow empty hybrid collections to be saved/loaded
   - Fixes "Cannot serialize untrained BM25 model" error
3. Fix FAISS search with empty or small indexes
   - Check ntotal before searching and return empty array if 0
   - Limit topK to min(requested, ntotal) to prevent FAISS error
   - Apply same fix to both search() and hybridSearch()
4. Add dimension validation on insert
   - Validate vector dimensions match collection metadata
   - Throw clear error message on mismatch
   - Prevents silent failures or FAISS crashes
5. Fix test for permission error handling
   - Use initializationPromise instead of calling initialize() again
   - Properly catch constructor-time initialization errors
All 14 FAISS tests now pass successfully.
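Fixes 3 and 4 above come down to two small guards, sketched here in isolation. The function names `effectiveTopK` and `assertDimension` and the error wording are hypothetical; the PR applies the equivalent checks inside search(), hybridSearch(), and insert.

```typescript
// Clamp the requested result count to what the index actually holds.
// Returns 0 for an empty index so the caller can skip the search entirely.
function effectiveTopK(requested: number, ntotal: number): number {
  if (ntotal === 0)
    return 0
  return Math.min(requested, ntotal)
}

// Reject vectors whose length differs from the collection's dimension.
function assertDimension(vector: number[], expected: number): void {
  if (vector.length !== expected) {
    throw new Error(`Vector dimension mismatch: expected ${expected}, got ${vector.length}`)
  }
}

console.log(effectiveTopK(10, 3)) // 3
assertDimension([0.1, 0.2, 0.3], 3) // passes silently
```

A search path would return an empty array when `effectiveTopK` yields 0 and otherwise pass the clamped value to the index.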
Resolves #43
Make FAISS vector database optional to handle environments where native bindings are not available (e.g., GitHub Actions CI).
Changes:
- Implement lazy loading for FAISS in factory.ts with try-catch
- Add checkFaissAvailability() function with caching
- Add VectorDatabaseFactory.isFaissAvailable() static method
- Conditionally export FaissVectorDatabase in vectordb/index.ts
- Update factory.test.ts to skip FAISS test when unavailable
- Throw clear error messages when FAISS requested but unavailable
Benefits:
- CI tests pass without C++ build tools
- No breaking changes to public API
- FAISS fully functional when bindings available
- Tests validate both scenarios
Test Results:
- factory.test.ts: 21/21 passed
- faiss-vectordb.test.ts: 14/14 passed (when bindings available)
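The lazy-loading-with-caching idea can be sketched as follows. This is not the PR's checkFaissAvailability() implementation: `createAvailabilityCheck` and the injected `load` callback are hypothetical, and stand-in loaders replace the real native module so the snippet runs without bindings.

```typescript
// Hypothetical sketch: probe an optional module once and cache the outcome.
type Loader = () => unknown

function createAvailabilityCheck(load: Loader): () => boolean {
  let cached: boolean | undefined
  return () => {
    if (cached !== undefined)
      return cached
    let available: boolean
    try {
      load() // throws when the module or its native bindings are missing
      available = true
    }
    catch {
      available = false
    }
    cached = available
    return available
  }
}

// Stand-in loaders for demonstration only.
let probeCalls = 0
const isMissingModuleAvailable = createAvailabilityCheck(() => {
  probeCalls++
  throw new Error('Cannot find module')
})
const isPresentModuleAvailable = createAvailabilityCheck(() => ({}))

console.log(isMissingModuleAvailable(), isPresentModuleAvailable()) // false true
```

In a real factory the loader would attempt to load the FAISS bindings, and a request for FAISS while the check returns false would throw a descriptive error instead of failing at import time.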