diff --git a/gallery/index.yaml b/gallery/index.yaml
index 9789b820e234..b3c64613c96b 100644
--- a/gallery/index.yaml
+++ b/gallery/index.yaml
@@ -23109,3 +23109,64 @@
     - filename: Qwen3-Nemotron-32B-RLBFF.i1-Q4_K_M.gguf
       sha256: 000e8c65299fc232d1a832f1cae831ceaa16425eccfb7d01702d73e8bd3eafee
       uri: huggingface://mradermacher/Qwen3-Nemotron-32B-RLBFF-i1-GGUF/Qwen3-Nemotron-32B-RLBFF.i1-Q4_K_M.gguf
+- !!merge <<: *sentencentransformers
+  name: "fin-bge-v1"
+  urls:
+    - https://huggingface.co/mradermacher/fin-bge-v1-GGUF
+  description: |
+    **Model Name:** Finance Embeddings BGE v1
+    **Repository:** [Shubham-Mehrotra-PML/fin-bge-v1](https://huggingface.co/Shubham-Mehrotra-PML/fin-bge-v1)
+    **Base Model:** BAAI/bge-base-en-v1.5
+    **License:** Same as base model (Apache 2.0)
+
+    ---
+
+    ### 🏦 **Description**
+    Fine-tuned BGE model specialized for financial domain embeddings. Trained on a comprehensive dataset of financial terms, ratios, and concepts to deliver high-precision semantic similarity for finance-related NLP tasks.
+
+    ### ✨ **Key Features**
+    - **Domain-Specific Performance**: Significantly improved accuracy on financial terms (e.g., *PE Ratio ↔ P/E*, *Stock ↔ Equity*).
+    - **Multi-Objective Training**: Trained with regression, triplet, context, and definition losses for robust embeddings.
+    - **Normalized Embeddings**: Uses L2 normalization for optimal cosine similarity.
+    - **High Precision**: Reduces false positives for unrelated/non-finance terms.
+
+    ### 📊 **Performance Highlights**
+    - **+45.9% improvement** on *Stock ↔ Equity* similarity
+    - **+51.2% improvement** on *PE Ratio ↔ Valuation*
+    - Maintains strong performance on general text while enhancing domain specificity
+
+    ### 🧠 **Architecture**
+    - BERT-based encoder (BGE)
+    - Hidden size: 768 | Layers: 12 | Attention heads: 12
+    - Vocabulary size: 30,522 | Max sequence length: 512
+    - Parameters: ~109.5M
+
+    ### 📌 **Use Cases**
+    - Financial document retrieval
+    - Semantic search in stock reports, earnings calls, and financial news
+    - Matching financial terms and concepts
+    - Building finance-specific QA or recommendation systems
+
+    ### ⚠️ **Limitations**
+    - Optimized for finance; may underperform on general domain tasks
+    - Primarily trained on English financial terminology
+    - Context length capped at 256 tokens for best results
+
+    ---
+
+    **Citation:**
+    ```bibtex
+    @misc{finance-embeddings-bge-v1,
+      title={Finance Embeddings BGE v1: Specialized Financial Domain Embeddings},
+      author={Finance Embeddings Team},
+      year={2025},
+      url={https://huggingface.co/models/fin-bge-v1}
+    }
+    ```
+  overrides:
+    parameters:
+      model: fin-bge-v1.Q8_0.gguf
+  files:
+    - filename: fin-bge-v1.Q8_0.gguf
+      sha256: 30d5292edbefaaad6df42c2b2b22c1e76355a34ea7fd3899df2f287eaf31e9d3
+      uri: huggingface://mradermacher/fin-bge-v1-GGUF/fin-bge-v1.Q8_0.gguf