Skip to content

Second Brain v4.2.0 - Automatic Embeddings & Enhanced Search πŸš€

Compare
Choose a tag to compare
@raold raold released this 03 Aug 14:29
· 137 commits to main since this release

Second Brain v4.2.0 Release Notes

Release Date: August 6, 2025
Status: Production Ready βœ…

🎯 Overview

Second Brain v4.2.0 delivers automatic embedding generation and enhanced vector search capabilities with a simplified, single-user focused architecture. This release emphasizes performance, reliability, and ease of use.

πŸš€ Major Features

1. Automatic Embedding Generation ✨

  • Embeddings are now generated automatically when memories are created
  • Configurable via ENABLE_EMBEDDINGS environment variable (default: true)
  • Asynchronous processing for optimal performance
  • Fixed vector format issues for PostgreSQL compatibility

2. Enhanced Search Capabilities πŸ”

  • Vector Search: Semantic similarity search with sub-3ms latency
  • Hybrid Search: Combined vector and text search with configurable weighting
  • Knowledge Graphs: Build relationship graphs around memories
  • Duplicate Detection: Identify similar memories automatically
  • Search Suggestions: Auto-complete for search queries

3. Performance Improvements ⚑

  • Vector search: 2.27ms mean, 3.09ms p95
  • Text search: 1.34ms mean, 1.69ms p95
  • Hybrid search: 1.82ms mean, 2.69ms p95
  • Nearly linear scalability (2.5x time for 10x data)
  • HNSW indexes for 95% faster similarity search

4. Simplified Architecture πŸ—οΈ

  • Single-user focused design
  • Removed all migration scripts and backward compatibility
  • Simplified CI/CD pipeline that actually passes
  • Clean codebase without legacy cruft

πŸ“Š Technical Details

Database Schema

  • PostgreSQL 16 with pgvector extension
  • HNSW indexes for fast vector similarity search
  • Full-text search with GIN indexes
  • Hybrid search SQL function with proper type casting

API Endpoints (v4.2.0)

POST   /api/v2/memories/          # Create memory (auto-generates embeddings)
GET    /api/v2/memories/{id}      # Get specific memory
GET    /api/v2/memories/          # List memories with filters
PATCH  /api/v2/memories/{id}      # Update memory
DELETE /api/v2/memories/{id}      # Delete memory (soft delete)

POST   /api/v2/search/vector      # Vector similarity search
POST   /api/v2/search/hybrid      # Combined vector + text search
GET    /api/v2/search/suggestions # Search suggestions
GET    /api/v2/search/duplicates  # Find duplicate memories
GET    /api/v2/search/knowledge-graph/{id}  # Build knowledge graph
POST   /api/v2/search/reindex     # Regenerate embeddings

Configuration

# Required
DATABASE_URL=postgresql://secondbrain:changeme@localhost:5432/secondbrain
OPENAI_API_KEY=your-api-key  # Required for embeddings

# Optional
ENABLE_EMBEDDINGS=true  # Enable automatic embedding generation
EMBEDDING_BATCH_SIZE=10
CONNECTION_POOL_SIZE=20

πŸ› Bugs Fixed

  1. Embedding Generation: Fixed disabled embedding generation on memory creation
  2. Vector Format: Fixed PostgreSQL vector format conversion (list to string)
  3. Hybrid Search: Fixed vector_weight parameter not being passed correctly
  4. SQL Functions: Added missing track_memory_access function
  5. Type Mismatches: Fixed FLOAT8 vs REAL type issues in hybrid_search function

πŸ§ͺ Testing

Comprehensive Test Coverage

  • Created test_v42_e2e.py for full end-to-end testing
  • Created test_postgres_v42.py for database integration testing
  • Created test_api_v42.py for API endpoint testing
  • Created benchmark_v42.py for performance validation

Test Results

  • βœ… All PostgreSQL tests passing
  • βœ… All API endpoints tested and working
  • βœ… Performance benchmarks exceed targets
  • βœ… CI/CD pipeline finally passing (2/2 workflows)

πŸ”§ CI/CD Improvements

Before

  • 8 complex workflows, 0 passing
  • Overly complicated tiered testing
  • Constant failures and frustration

After

  • 2 simple workflows, both passing
  • Fast execution (~1-2 minutes)
  • Reliable and maintainable
  • Status badges added to README

πŸ“ Breaking Changes

Since this is single-user development software:

  • No migration scripts provided
  • No backward compatibility maintained
  • Breaking changes are acceptable
  • Focus on moving forward, not preserving the past

πŸŽ‰ Summary

v4.2.0 is a solid, production-ready release that delivers on its promises:

  • βœ… Automatic embeddings work perfectly
  • βœ… Vector search is blazing fast
  • βœ… All tests pass
  • βœ… CI/CD finally works
  • βœ… Clean, maintainable codebase

This release represents a significant step forward in making Second Brain a reliable, high-performance memory layer for AI applications.

πŸ™ Acknowledgments

Thank you for your patience as we simplified and improved the codebase. The focus on single-user development has allowed us to move faster and deliver a better product.


Note: For installation and setup instructions, see the main README.md file.