rag-evaluation

Here are 39 public repositories matching this topic...

Giskard-AI / giskard-oss

🐢 Open-Source Evaluation & Testing library for LLM Agents

ai-security mlops fairness-ai responsible-ai ml-validation red-team-tools trustworthy-ai ml-testing llm ai-red-team ai-testing llmops llm-security llm-eval llm-evaluation rag-evaluation agent-evaluation

Updated Nov 18, 2025
Python

Marker-Inc-Korea / AutoRAG

Star

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

python open-source qa benchmarking ops pipeline analysis optimization evaluation embeddings automl document-parser rag llm retrieval-augmented-generation llm-ops llm-evaluation rag-evaluation

Updated Nov 16, 2025
Python

Agenta-AI / agenta

Star

The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.

prompt-engineering prompt-management llm-tools llm-framework llm-playground llm-platform llm-evaluation rag-evaluation llm-monitoring llm-as-a-judge llm-observability llmops-platform

Updated Nov 19, 2025
Python

frutik / Awesome-RAG

Star

rag rag-implementation rag-evaluation

Updated Sep 7, 2025

vectara / open-rag-eval

Star

RAG evaluation without the need for "golden answers"

metrics evaluation-metrics rag vectara retrieval-augmented-generation rag-evaluation

Updated Nov 18, 2025
Python

LLAMATOR-Core / llamator

Star

Framework for testing vulnerabilities of large language models (LLM).

Updated Sep 24, 2025
Python

mburaksayici / RAG-Boilerplate

Star

RAG boilerplate with semantic/propositional chunking, hybrid search (BM25 + dense), LLM reranking, query enhancement agents, CrewAI orchestration, Qdrant vector search, Redis/Mongo sessioning, Celery ingestion pipeline, Gradio UI, and an evaluation suite (Hit-Rate, MRR, hybrid configs).

ai-agents reranking rag vector-database hybrid-search qdrant llm retrieval-augmented-generation rag-evaluation semantic-chunking crewai rag-pipeline propositional-models query-enhancement

Updated Nov 18, 2025
Python

mts-ai / rurage

Star

information-retrieval question-answering rag llm-evaluation rag-evaluation

Updated Apr 14, 2025
Python

vero-labs-ai / vero-eval

Star

Open source framework for evaluating AI Agents

python testing evaluation datasets dataset-generation evaluation-metrics evaluation-framework testing-framework testing-library synthetic-dataset-generation user-persona evals llm-evaluation rag-evaluation llm-evaluation-framework langgraph rag-testing

Updated Nov 14, 2025
Python

oztrkoguz / RAG-Framework-Evaluation

Star

This project aims to compare different Retrieval-Augmented Generation (RAG) frameworks in terms of speed and performance.

swarms autogen rag langchain llamaindex rag-evaluation crewai langchain-rag autogen-rag crewai-rag llamaindex-rag swarms-rag

Updated Jul 28, 2024
Python

ioannis-papadimitriou / rag-playground

Star

A framework for systematic evaluation of retrieval strategies and prompt engineering in RAG systems, featuring an interactive chat interface for document analysis.

chatbot qa-generation llm-inference retrieval-augmented-generation rag-evaluation