@amondnet
Contributor

Implements FAISS as a zero-configuration, file-based vector database option. This provides a local alternative to Milvus and Qdrant that suits development and small-to-medium codebases and requires no external infrastructure.

Features:

  • File-based persistence (~/.context/faiss-indexes/)
  • Hybrid search with dense (FAISS) + sparse (BM25) vectors
  • RRF (Reciprocal Rank Fusion) reranking
  • Auto-selection when no external DB configured
  • Full VectorDatabase interface implementation
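
The fusion step listed above can be sketched roughly as follows. This is a minimal illustration of Reciprocal Rank Fusion, assuming each retriever returns document IDs in best-first order. The function name `reciprocalRankFusion` and the smoothing constant `k = 60` (a commonly used default) are assumptions for illustration, not values taken from this PR.

```typescript
// Minimal sketch of Reciprocal Rank Fusion (RRF).
// Each input list holds document IDs ordered best-first by one retriever.
// The name and the default k = 60 are illustrative assumptions.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): Array<{ id: string, score: number }> {
  const scores = new Map<string, number>()
  for (const ranking of rankings) {
    ranking.forEach((id, position) => {
      // Ranks are 1-based, so the top hit contributes 1 / (k + 1).
      const contribution = 1 / (k + position + 1)
      scores.set(id, (scores.get(id) ?? 0) + contribution)
    })
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score)
}

const denseHits = ['a', 'b', 'c'] // e.g. nearest neighbours from the dense index
const sparseHits = ['b', 'd', 'a'] // e.g. top BM25 matches
const fused = reciprocalRankFusion([denseHits, sparseHits])
// Fused order: b, a, d, c
```

Documents that rank well in both lists (`b` and `a` here) rise above documents that appear in only one, without the raw dense and sparse scores having to be comparable.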

Implementation:

  • Created FaissVectorDatabase class with IndexFlatL2
  • Added FAISS_LOCAL to VectorDatabaseType enum
  • Integrated with VectorDatabaseFactory
  • Added auto-selection logic in MCP server
  • Updated documentation with FAISS quick start
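
A rough sketch of what the auto-selection could look like. The config field names (`milvusAddress`, `qdrantUrl`) and the precedence between the two external backends are assumptions for illustration; the actual MCP server logic may differ.

```typescript
// Hypothetical config shape; the real MCP server config fields may differ.
interface VectorDbSettings {
  milvusAddress?: string
  qdrantUrl?: string
}

type SelectedBackend = 'milvus' | 'qdrant' | 'faiss-local'

// Prefer an explicitly configured external database, else fall back to local FAISS.
// The milvus-before-qdrant ordering is an assumption of this sketch.
function selectBackend(settings: VectorDbSettings): SelectedBackend {
  if (settings.milvusAddress)
    return 'milvus'
  if (settings.qdrantUrl)
    return 'qdrant'
  // Nothing external configured: use the file-based index.
  return 'faiss-local'
}

const noExternalDb = selectBackend({})
const withQdrant = selectBackend({ qdrantUrl: 'http://localhost:6333' })
```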

Storage structure:
~/.context/faiss-indexes/{collection}/
├── dense.index # FAISS index file
├── sparse.json # BM25 model
├── metadata.json # Collection metadata
└── documents.json # Document metadata

Limitations:

  • Memory-bound (entire index loads into RAM)
  • Single-process file access
  • Suitable for ~100K files / 1M vectors
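
To make the memory bound concrete: a flat index keeps every vector uncompressed as float32, so the dense index needs roughly `vectors × dimension × 4` bytes of RAM. The 1536-dimension figure below is an assumed embedding size for illustration, not one stated in this PR.

```typescript
// Rough RAM estimate for a flat (uncompressed) float32 index: n * d * 4 bytes.
// Excludes the BM25 model and document metadata, which add to the total.
function flatIndexBytes(vectorCount: number, dimension: number): number {
  const BYTES_PER_FLOAT32 = 4
  return vectorCount * dimension * BYTES_PER_FLOAT32
}

// 1M vectors at an assumed 1536 dimensions.
const estimatedBytes = flatIndexBytes(1000000, 1536)
const estimatedGiB = estimatedBytes / (1024 * 1024 * 1024)
// 6,144,000,000 bytes, about 5.7 GiB
```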

For larger codebases, Milvus or Qdrant are recommended.

Closes #13

🤖 Generated with Claude Code

Test Results

# Commands used to test
pnpm build
pnpm lint
pnpm typecheck
pnpm test:all

Results:

  • All tests passing
  • Build successful
  • Type checking passing
  • Lint passing (0 errors)

Test Coverage

  • Core functionality tested
  • Edge cases covered
  • Error scenarios handled
  • Integration tests added

Checklist

  • Code follows project conventions (see docs/develop/STANDARDS.md)
  • Commit messages follow conventional commits (see docs/develop/commit-convention.md)
  • Tests added/updated and all passing
  • Build passes (pnpm build)
  • Type checking passes (pnpm typecheck)
  • Lint passes with 0 errors (pnpm lint)
  • Documentation updated (if needed)
  • Related issue linked (if applicable)

github-actions bot and others added 4 commits October 12, 2025 15:41
Co-Authored-By: Minsu Lee <amondnet@users.noreply.github.com>

Resolved conflicts in:
- packages/core/src/vectordb/factory.ts: Merged FAISS support with style changes from main
- packages/core/src/vectordb/index.ts: Added FAISS exports with proper ordering
- packages/mcp/src/index.ts: Integrated FAISS auto-selection logic with main's formatting

Conflict resolution details:
- Kept FAISS_LOCAL enum value and FaissVectorDatabase integration
- Applied @antfu/eslint-config formatting rules from main (type imports, spacing)
- Fixed FAISS add() calls to pass one vector at a time (faiss-node API requirement)
- Updated vectorDbType union to include 'faiss' and 'faiss-local' options
- Applied linting fixes to maintain code quality standards
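
The one-vector-at-a-time call pattern might look like the sketch below. `SingleVectorIndex` is a hypothetical stand-in interface, not the faiss-node signature, and `RecordingIndex` is an in-memory fake used only to show how the calls are issued.

```typescript
// Hypothetical stand-in for a dense index that accepts one vector per add() call.
interface SingleVectorIndex {
  add(vector: number[]): void
}

// Add a batch by issuing one add() call per vector; returns the number added.
function addAll(target: SingleVectorIndex, vectors: number[][]): number {
  for (const vector of vectors) {
    target.add(vector)
  }
  return vectors.length
}

// In-memory fake that records each call, for demonstration only.
class RecordingIndex implements SingleVectorIndex {
  readonly calls: number[][] = []
  add(vector: number[]): void {
    this.calls.push(vector)
  }
}

const recorder = new RecordingIndex()
const addedCount = addAll(recorder, [[1, 0], [0, 1], [1, 1]])
```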

All tests pass and all linting errors are resolved.

Critical fixes from PR review:

1. Fix delete() method - throw NotImplementedError with clear guidance
   instead of silently failing. FAISS IndexFlatL2 does not support
   vector deletion.

2. Add comprehensive error handling to file operations:
   - initialize(): handle EACCES, ENOSPC, ENOENT with specific messages
   - loadCollection(): wrap each file read with try-catch
   - saveCollection(): wrap all write operations with error handling

3. Fix BM25 deserialization - use SimpleBM25.fromJSON() API instead
   of unsafe reflection accessing private properties

4. Update query() JSDoc - document the filter parameter limitation and
   emit a runtime warning when filters are provided

5. Add test coverage - create faiss-vectordb.test.ts with 11 tests
   covering initialization, CRUD operations, persistence, errors, and
   hybrid search functionality

All critical issues from PR review resolved.
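
The error-code handling in fix 2 could be sketched along these lines. Node.js file-system errors carry a string `code` such as `EACCES`; the function name `describeStorageError` and the message wording are assumptions for illustration, not the PR's actual text.

```typescript
// Node.js file-system errors expose a string `code`; this narrow shape is all the sketch needs.
interface ErrnoLike {
  code?: string
  message: string
}

// Map common file-system error codes to actionable messages (wording is illustrative).
function describeStorageError(err: ErrnoLike, path: string): string {
  switch (err.code) {
    case 'EACCES':
      return `Permission denied for ${path}. Check directory ownership and permissions.`
    case 'ENOSPC':
      return `No space left on device while writing ${path}.`
    case 'ENOENT':
      return `Path not found: ${path}. The parent directory may be missing.`
    default:
      return `Unexpected storage error for ${path}: ${err.message}`
  }
}

const permissionMessage = describeStorageError(
  { code: 'EACCES', message: 'permission denied' },
  '/tmp/faiss-demo',
)
```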

@amondnet self-assigned this on Oct 31, 2025

Bug fixes for FAISS vector database implementation:

1. Fix storageDir initialization timing issue
   - Changed from instance field to getter accessing config
   - Resolves undefined storageDir during BaseVectorDatabase constructor
   - Config now properly set with defaults before super() call

2. Fix BM25 serialization for empty collections
   - Remove check preventing untrained model serialization
   - Allow empty hybrid collections to be saved/loaded
   - Fixes "Cannot serialize untrained BM25 model" error

3. Fix FAISS search with empty or small indexes
   - Check ntotal before searching and return empty array if 0
   - Limit topK to min(requested, ntotal) to prevent FAISS error
   - Apply same fix to both search() and hybridSearch()

4. Add dimension validation on insert
   - Validate vector dimensions match collection metadata
   - Throw clear error message on mismatch
   - Prevents silent failures or FAISS crashes

5. Fix test for permission error handling
   - Use initializationPromise instead of calling initialize() again
   - Properly catch constructor-time initialization errors
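
The timing issue in fix 1 comes from JavaScript class semantics: subclass field initializers run only after `super()` returns, so a field read from inside the base constructor is still undefined. The sketch below reproduces that with stand-in classes; `BaseDb`, `FieldBasedDb`, and `GetterBasedDb` are hypothetical names, not the classes in this PR.

```typescript
interface LocalConfig {
  storageDir: string
}

// Stand-in base class whose constructor runs initialization logic immediately.
abstract class BaseDb {
  readonly dirSeenDuringInit: string | undefined

  constructor(protected readonly config: LocalConfig) {
    // Runs before any subclass field initializer has executed.
    this.dirSeenDuringInit = this.resolveStorageDir()
  }

  protected abstract resolveStorageDir(): string | undefined
}

// Before: the instance field is still unset while the base constructor runs.
class FieldBasedDb extends BaseDb {
  private storageDir: string = this.config.storageDir
  protected resolveStorageDir(): string | undefined {
    return this.storageDir
  }
}

// After: a getter reads from config, which the base constructor has already stored.
class GetterBasedDb extends BaseDb {
  private get storageDir(): string {
    return this.config.storageDir
  }

  protected resolveStorageDir(): string | undefined {
    return this.storageDir
  }
}

const fieldResult = new FieldBasedDb({ storageDir: '/tmp/faiss-demo' }).dirSeenDuringInit
const getterResult = new GetterBasedDb({ storageDir: '/tmp/faiss-demo' }).dirSeenDuringInit
// fieldResult is undefined; getterResult is '/tmp/faiss-demo'
```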

All 14 FAISS tests now pass successfully.
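
Fixes 3 and 4 amount to two small guards, sketched here under assumptions. `CountedIndex` is a hypothetical minimal view of a dense index (not the faiss-node signature), and `makeFakeIndex` exists only to demonstrate the behaviour.

```typescript
// Hypothetical minimal view of a dense index; not the faiss-node signature.
interface CountedIndex {
  ntotal(): number
  search(query: number[], k: number): number[]
}

// Guard a search against empty indexes and against k larger than the index size.
function safeSearch(idx: CountedIndex, query: number[], topK: number): number[] {
  const total = idx.ntotal()
  if (total === 0)
    return [] // empty index: skip the search entirely
  const k = Math.min(topK, total) // never request more neighbours than exist
  return idx.search(query, k)
}

// Reject vectors whose length differs from the collection's dimension.
function assertDimension(vector: number[], expected: number): void {
  if (vector.length !== expected) {
    throw new Error(`Vector dimension mismatch: expected ${expected}, got ${vector.length}`)
  }
}

// Fake index that records the k it was asked for, for demonstration only.
function makeFakeIndex(total: number): CountedIndex & { lastK: number | null } {
  const fake = {
    lastK: null as number | null,
    ntotal: () => total,
    search(_query: number[], k: number): number[] {
      fake.lastK = k
      return Array.from({ length: k }, (_, i) => i)
    },
  }
  return fake
}

const smallIndex = makeFakeIndex(2)
const smallHits = safeSearch(smallIndex, [0.1, 0.2], 10) // k is clamped to 2
const emptyHits = safeSearch(makeFakeIndex(0), [0.1, 0.2], 10) // returns []
```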

Resolves #43

Make FAISS vector database optional to handle environments where
native bindings are not available (e.g., GitHub Actions CI).

Changes:
- Implement lazy loading for FAISS in factory.ts with try-catch
- Add checkFaissAvailability() function with caching
- Add VectorDatabaseFactory.isFaissAvailable() static method
- Conditionally export FaissVectorDatabase in vectordb/index.ts
- Update factory.test.ts to skip FAISS test when unavailable
- Throw clear error messages when FAISS requested but unavailable
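
The cached availability check might be structured as below. The name `checkFaissAvailability` comes from the change list above, but its signature is not shown in this PR; the injected `loader` parameter (standing in for whatever loads the native binding) and the helper `probeLoader` are assumptions of this sketch.

```typescript
// Cached result of the first availability probe; undefined means "not checked yet".
let cachedAvailability: boolean | undefined

// Try the loader once and report whether it succeeded.
function probeLoader(loader: () => unknown): boolean {
  try {
    loader()
    return true
  }
  catch {
    return false
  }
}

// `loader` is a hypothetical injection point for loading the native binding.
function checkFaissAvailability(loader: () => unknown): boolean {
  if (cachedAvailability === undefined)
    cachedAvailability = probeLoader(loader)
  return cachedAvailability
}

// Simulate an environment without native bindings.
let loadAttempts = 0
const failingLoader = () => {
  loadAttempts += 1
  throw new Error('native binding missing')
}

const firstCheck = checkFaissAvailability(failingLoader)
const secondCheck = checkFaissAvailability(failingLoader) // served from the cache
```

Caching the result means a missing binding costs a single failed load attempt, after which callers can branch on the boolean and raise a clear error only when FAISS is explicitly requested.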

Benefits:
- CI tests pass without C++ build tools
- No breaking changes to public API
- FAISS fully functional when bindings available
- Tests validate both scenarios

Test Results:
- factory.test.ts: 21/21 passed
- faiss-vectordb.test.ts: 14/14 passed (when bindings available)
@sonarqubecloud

Quality Gate failed

Failed conditions
1 Security Hotspot
6.6% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

Development

Successfully merging this pull request may close these issues.

Add FAISS vector database support for local-only deployments

2 participants