# AI Response Validator

Automated accuracy checking, hallucination prevention, and confidence scoring for AI responses.

AI Validator helps you ensure the quality and reliability of AI-generated responses:

- ✅ Automated Accuracy Checking - Verify AI responses against source documents
- ✅ Hallucination Prevention - Detect when AI invents information not in sources
- ✅ Confidence Scoring - Get reliability scores for every response
- ✅ Query Classification - Skip validation for greetings, typos, and small talk
- ✅ Multi-LLM Support - Works with OpenAI and Claude

Perfect for RAG systems, knowledge bases, and any application where AI response quality matters.
Install the package:

```bash
npm install @vezlo/ai-validator
```

Or install globally for CLI access:

```bash
npm install -g @vezlo/ai-validator
```

To work from source:

```bash
# Clone the repository
git clone https://github.com/vezlo/ai-validator.git
cd ai-validator

# Install dependencies
npm install

# Build the project
npm run build

# Run the test CLI
npm test
```

Test the validator interactively without writing code:

```bash
# Using npx (no installation required)
npx vezlo-validator-test

# Or if installed globally
vezlo-validator-test
```

The CLI will guide you through:
- Selecting LLM provider (OpenAI or Claude)
- Entering API keys
- Choosing models (any OpenAI or Claude model)
- Configuring validation settings
- Testing with your own queries and responses
- Easy text input for sources (no JSON required)
```typescript
import { AIValidator } from '@vezlo/ai-validator';

// Initialize with your API key and provider
const validator = new AIValidator({
  openaiApiKey: 'sk-your-openai-key', // Your OpenAI API key
  llmProvider: 'openai'               // 'openai' or 'claude'
});

// Validate a response
const validation = await validator.validate({
  query: "What is machine learning?",
  response: "Machine learning is a subset of AI that focuses on algorithms.",
  sources: [
    {
      content: "Machine learning is a subset of artificial intelligence that focuses on algorithms and statistical models.",
      title: "ML Guide",
      url: "https://example.com/ml-guide"
    }
  ]
});

// Check results
console.log(`Confidence: ${(validation.confidence * 100).toFixed(1)}%`);
console.log(`Valid: ${validation.valid}`);
console.log(`Accuracy: ${validation.accuracy.verified ? 'Verified' : 'Not verified'}`);
console.log(`Hallucination Risk: ${(validation.hallucination.risk * 100).toFixed(1)}%`);
console.log(`Warnings: ${validation.warnings.join(', ')}`);
```
```typescript
import { AIValidator } from '@vezlo/ai-validator';

const validator = new AIValidator({
  // API Keys (at least one required)
  openaiApiKey: 'sk-your-openai-key',
  claudeApiKey: 'sk-ant-your-claude-key',

  // LLM Provider (required)
  llmProvider: 'openai', // 'openai' or 'claude'

  // Model Selection (optional - you can specify any model from the provider)
  openaiModel: 'gpt-4o',                     // Any OpenAI model: gpt-4o, gpt-4o-mini, gpt-4, etc.
  claudeModel: 'claude-sonnet-4-5-20250929', // Any Claude model

  // Validation Settings (optional)
  confidenceThreshold: 0.7,           // 0.0 - 1.0 (default: 0.7)
  enableQueryClassification: true,    // Skip validation for greetings/typos
  enableAccuracyCheck: true,          // LLM-based accuracy checking
  enableHallucinationDetection: true  // LLM-based hallucination detection
});
```
```typescript
// Example with a RAG system
const ragResponse = await yourRAGSystem.query(userQuestion);
const sources = await yourRAGSystem.getSources(userQuestion);

const validation = await validator.validate({
  query: userQuestion,
  response: ragResponse.content,
  sources: sources.map(s => ({
    content: s.text,
    title: s.title,
    url: s.url
  }))
});

if (validation.valid) {
  // Show response to user
  return ragResponse.content;
} else {
  // Handle low-confidence response
  console.warn('Low confidence response:', validation.warnings);
  return "I'm not confident about this answer. Please consult additional sources.";
}
```

Each validation call resolves to a `ValidationResult`:

```typescript
interface ValidationResult {
  confidence: number;         // 0.0 - 1.0
  valid: boolean;             // true if confidence >= threshold
  accuracy: {
    verified: boolean;
    verification_rate: number;
    reason?: string;
  };
  context: {
    source_relevance: number;
    source_usage_rate: number;
    valid: boolean;
  };
  hallucination: {
    detected: boolean;
    risk: number;
    hallucinated_parts?: string[];
  };
  warnings: string[];
  query_type?: string;        // 'greeting', 'question', etc.
  skip_validation?: boolean;  // true for greetings/typos
}
```
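Downstream code usually branches on these fields. Here is a minimal, self-contained sketch of such a gate; `routeResponse`, its 0.85 caveat threshold, and the decision labels are illustrative choices, not part of the library's API:

```typescript
// Minimal shape of the fields used below (mirrors ValidationResult).
interface GateInput {
  valid: boolean;
  confidence: number;
  skip_validation?: boolean;
  hallucination: { detected: boolean; risk: number };
  warnings: string[];
}

// Decide what to do with an AI response based on its validation result.
function routeResponse(v: GateInput): 'deliver' | 'deliver-with-caveat' | 'block' {
  if (v.skip_validation) return 'deliver';              // greetings, typos, small talk
  if (!v.valid || v.hallucination.detected) return 'block';
  // Valid, but flag borderline confidence or outstanding warnings.
  if (v.confidence < 0.85 || v.warnings.length > 0) return 'deliver-with-caveat';
  return 'deliver';
}

console.log(routeResponse({
  valid: true,
  confidence: 0.92,
  hallucination: { detected: false, risk: 0.05 },
  warnings: []
})); // 'deliver'
```

A UI could map `'deliver-with-caveat'` to showing the answer alongside its warnings rather than suppressing it entirely.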
All configuration is done in code when initializing the validator:

```typescript
interface AIValidatorConfig {
  // API Keys (at least one required)
  openaiApiKey?: string;  // Your OpenAI API key
  claudeApiKey?: string;  // Your Claude API key

  // Provider (required)
  llmProvider: 'openai' | 'claude';

  // Models (optional - specify any valid model from the chosen provider)
  openaiModel?: string;   // Default: 'gpt-4o'
  claudeModel?: string;   // Default: 'claude-sonnet-4-5-20250929'

  // Validation Settings (optional)
  confidenceThreshold?: number;          // Default: 0.7
  enableQueryClassification?: boolean;   // Default: true
  enableAccuracyCheck?: boolean;         // Default: true
  enableHallucinationDetection?: boolean // Default: true
}
```

OpenAI Models:

You can use any OpenAI chat model by specifying it in `openaiModel`. Common choices include:

- `gpt-4o` (default, recommended)
- `gpt-4o-mini` (faster, cheaper)
- `gpt-4` (previous flagship)
- `gpt-4-turbo`
- Or any other OpenAI chat completion model

Claude Models:

You can use any Claude model by specifying it in `claudeModel`. Common choices include:

- `claude-sonnet-4-5-20250929` (default, Claude 4.5 Sonnet)
- `claude-opus-4-1-20250805` (Claude 4.1 Opus)
- `claude-3-7-sonnet-20250219` (Claude 3.7 Sonnet)
- Or any other Claude model identifier

The validator will work with any model supported by the respective provider's API.
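For instance, a Claude-backed validator with a stricter threshold might be initialized like this. The environment-variable name is a placeholder; the option names come from `AIValidatorConfig` above:

```typescript
import { AIValidator } from '@vezlo/ai-validator';

const validator = new AIValidator({
  claudeApiKey: process.env.CLAUDE_API_KEY,  // placeholder env var for your key
  llmProvider: 'claude',
  claudeModel: 'claude-3-7-sonnet-20250219', // any Claude model identifier works
  confidenceThreshold: 0.8                   // stricter than the 0.7 default
});
```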
```bash
# Interactive testing CLI
npx vezlo-validator-test

# Development commands
npm run build   # Build the project
npm run clean   # Clean build files
npm test        # Run the test CLI
```

Use cases include:

- Validate responses against retrieved documents to ensure accuracy.
- Prevent incorrect information from reaching customers.
- Ensure AI answers are grounded in your documentation.
- Validate AI-generated content against source materials.
- Ensure AI tutoring responses are accurate and helpful.
- Validation Time: 2-5 seconds per response (depending on LLM provider)
- Cost: Additional LLM API calls for validation
- Accuracy: High accuracy for responses with good sources
- Reliability: Graceful handling of edge cases
- Query Classification - Identifies greetings, typos, and small talk (skips validation)
- Accuracy Checking - Uses LLM to verify facts against source documents
- Hallucination Detection - Identifies information not present in sources
- Context Validation - Ensures response relevance to the query
- Confidence Scoring - Combines all metrics into a single score
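The exact weighting the library applies in that last step is internal. Purely to illustrate the idea, here is one way such metrics could be folded into a single score; the 50/25/25 weights are an assumption, not the library's formula:

```typescript
// Illustrative only - the library computes its own confidence internally.
// Assumed weights: accuracy 50%, context 25%, inverse hallucination risk 25%.
function combineScores(
  verificationRate: number,  // accuracy.verification_rate, 0.0 - 1.0
  sourceRelevance: number,   // context.source_relevance, 0.0 - 1.0
  hallucinationRisk: number, // hallucination.risk, 0.0 - 1.0
  threshold = 0.7
): { confidence: number; valid: boolean } {
  const confidence =
    0.5 * verificationRate +
    0.25 * sourceRelevance +
    0.25 * (1 - hallucinationRisk);
  return { confidence, valid: confidence >= threshold };
}

console.log(combineScores(0.95, 0.9, 0.05)); // high confidence, valid: true
```

Whatever the real weights, the shape is the same: stronger source verification and relevance push the score up, hallucination risk pulls it down, and `confidenceThreshold` decides `valid`.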
Example results:

```typescript
// High-confidence, verified response
{
  confidence: 0.92,
  valid: true,
  accuracy: { verified: true, verification_rate: 0.95 },
  hallucination: { detected: false, risk: 0.05 },
  warnings: []
}

// Low-confidence response with likely hallucination
{
  confidence: 0.35,
  valid: false,
  accuracy: { verified: false, verification_rate: 0.2 },
  hallucination: { detected: true, risk: 0.8 },
  warnings: ["No sources provided - high hallucination risk"]
}

// Greeting - validation skipped
{
  confidence: 1.0,
  valid: true,
  query_type: "greeting",
  skip_validation: true,
  warnings: []
}
```

Contributions are welcome! Please feel free to submit a Pull Request.
This project is dual-licensed:
- Non-Commercial Use: Free under AGPL-3.0 license
- Commercial Use: Requires a commercial license - contact us for details
See the LICENSE file for complete AGPL-3.0 license terms.
- Issues: GitHub Issues
- Documentation: GitHub Wiki
- Discussions: GitHub Discussions
- @vezlo/assistant-server - AI Assistant Server with RAG capabilities
- @vezlo/src-to-kb - Convert source code to knowledge base
Status: ✅ Production Ready | Version: 1.0.2 | License: AGPL-3.0 | Node.js: 20+

Made with ❤️ by Vezlo