Deepr is an agentic research platform that uses AI models to conduct deep research, build domain experts, and synthesize knowledge.
```mermaid
graph TB
    subgraph Interfaces
        CLI["CLI (Click)"]
        Web["Web Dashboard (React + Flask)"]
        MCP["MCP Server (AI Agent Tools)"]
    end
    subgraph Core
        Router["Auto Mode Router<br/><i>complexity analysis, cost optimization</i>"]
        Research["Research Engine<br/><i>multi-phase, context chaining</i>"]
        Experts["Expert System<br/><i>beliefs, memory, autonomous learning</i>"]
        Context["Context Discovery<br/><i>semantic search, temporal tracking</i>"]
    end
    subgraph Providers
        OpenAI["OpenAI<br/>o3 / o4-mini deep research, GPT-5.2, GPT-4.1"]
        Gemini["Gemini<br/>Deep Research Agent, 3 Pro, 2.5 Flash"]
        Grok["Grok<br/>4 Fast"]
        Anthropic["Anthropic<br/>Claude Opus 4.6 / Sonnet / Haiku 4.5"]
        AzureFoundry["Azure AI Foundry<br/>o3 deep research, GPT-5, GPT-4.1 + Bing"]
    end
    subgraph Infrastructure
        Queue["Job Queue (SQLite)"]
        Storage["Storage (Local / S3 / Blob / GCS)"]
        Observe["Observability<br/><i>costs, traces, quality metrics</i>"]
        Budget["Budget Controls<br/><i>per-job, daily, monthly limits</i>"]
    end
    CLI --> Router
    Web --> Router
    MCP --> Router
    Router --> Research
    Router --> Experts
    Research --> Context
    Research --> OpenAI
    Research --> Gemini
    Research --> Grok
    Research --> Anthropic
    Research --> AzureFoundry
    Experts --> OpenAI
    Experts --> Gemini
    Experts --> Grok
    Experts --> Anthropic
    Experts --> AzureFoundry
    Research --> Queue
    Research --> Storage
    Experts --> Storage
    Context --> Storage
    Budget -.->|"guards"| Research
    Budget -.->|"guards"| Experts
    Observe -.->|"tracks"| Research
    Observe -.->|"tracks"| Providers
```
- **Local-first with SQLite, not Postgres.** Research results, expert profiles, job queues, and cost tracking all use SQLite. There is no database server to run and no connection strings to manage: users `pip install` and go. Cloud deployment swaps in DynamoDB/CosmosDB/Firestore via storage abstractions, but the local experience stays zero-config.
- **Experts are not just RAG.** Most "chat with your docs" tools do retrieval, then generation, and stop there. Deepr experts have a metacognition layer: they track what they know (claims with confidence), recognize what they don't know (gaps with priority), and (in agentic mode) autonomously research to fill those gaps. The knowledge persists permanently, so the expert improves over time rather than resetting each session.
- **Auto-mode routing analyzes query complexity before choosing a model.** Simple factual questions go to grok-4-1-fast-non-reasoning at $0.01. Complex multi-faceted research goes to o3-deep-research at $0.50. This isn't just keyword matching: a lightweight classifier estimates complexity, then the router factors in which API keys are configured, the current budget, and provider health scores. Batch-processing 20 queries this way costs $1-2 instead of $20-40.
- **Multi-layer budget controls, because research costs real money.** Per-operation limits, daily caps, monthly ceilings, pre-submission estimates, and a circuit breaker that pauses after repeated failures. The system saves progress on pause so you can resume later. An uncapped loop calling o3-deep-research could burn $100+ before you notice.
- **Provider abstraction with circuit breakers, not just try/catch.** Each provider has health scoring with exponential decay, latency percentile tracking (p50/p95/p99), and automatic disabling after sustained failures. The router uses exploration/exploitation (10% exploration by default) to discover when a degraded provider recovers.
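The health-decay and exploration/exploitation idea can be sketched as follows. This is a minimal illustration; the class, constant, and function names are assumptions, not Deepr's actual API:

```python
import random

DECAY = 0.9     # weight given to history on each new observation
EPSILON = 0.10  # 10% exploration by default

class ProviderHealth:
    """Illustrative health score with exponential decay."""

    def __init__(self, name: str):
        self.name = name
        self.score = 1.0  # 1.0 = fully healthy

    def record(self, success: bool) -> None:
        # Old signal fades geometrically; recent results dominate.
        self.score = DECAY * self.score + (1 - DECAY) * (1.0 if success else 0.0)

def pick_provider(providers: list, rng=random):
    # Explore occasionally so a degraded provider can be rediscovered
    # once it recovers; otherwise exploit the healthiest one.
    if rng.random() < EPSILON:
        return rng.choice(providers)
    return max(providers, key=lambda p: p.score)
```

After sustained failures a provider's score decays toward zero and it effectively stops being selected, while the 10% exploration slice keeps probing it so recovery is noticed.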
- Location: `deepr/research_agent/`
- Purpose: Conducts multi-step research using AI models
- Modes (via `ResearchMode` enum in `core/settings.py`):
  - `READ_ONLY`: Browse existing knowledge only
  - `STANDARD`: Standard research with web search (~$0.25)
  - `EXTENDED`: Deep research with multi-step analysis (~$2.00)
  - `UNRESTRICTED`: Full autonomous research
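As a sketch, the mode enum described above might look like this (the actual definition in `core/settings.py` may differ in member values):

```python
from enum import Enum

# Sketch of the ResearchMode enum; string values are assumptions.
class ResearchMode(Enum):
    READ_ONLY = "read_only"        # browse existing knowledge only
    STANDARD = "standard"          # web search, ~$0.25
    EXTENDED = "extended"          # multi-step deep research, ~$2.00
    UNRESTRICTED = "unrestricted"  # full autonomous research
```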
- Location: `deepr/experts/`
- Purpose: Creates domain experts that learn and answer questions
- Components:
  - `profile.py`: Expert metadata, usage tracking, provider config
  - `curriculum.py`: Generates learning plans
  - `learner.py`: Autonomous learning execution
  - `chat.py`: Interactive Q&A with experts
  - `router.py`: Routes queries to appropriate models
  - `beliefs.py`: Belief formation with confidence levels
  - `metacognition.py`: Gap awareness and self-assessment
  - `memory.py`: Conversation and knowledge memory
  - `synthesis.py`: Knowledge synthesis from documents
  - `temporal_knowledge.py`: Time-aware knowledge management
  - `cost_safety.py`: Budget controls and spending limits
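To illustrate the claims-with-confidence idea behind `beliefs.py` and `metacognition.py`, here is a minimal hypothetical belief record. The field names and threshold are illustrative, not the actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Belief:
    """Hypothetical belief record: a claim plus a confidence level."""
    claim: str
    confidence: float  # 0.0 (unknown) .. 1.0 (certain)
    sources: list = field(default_factory=list)

    def is_gap(self, threshold: float = 0.5) -> bool:
        # Low-confidence claims surface as gaps that agentic mode
        # can prioritize for autonomous research.
        return self.confidence < threshold
```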
- Location: `deepr/providers/`
- Purpose: Unified interface to AI providers
- Providers:
- OpenAI (o3/o4-mini deep research, GPT-5.2, GPT-5, GPT-4.1, GPT-4.1-mini)
- Azure OpenAI (same models, Azure-hosted)
- Azure AI Foundry (o3 deep research + Bing, GPT-5, GPT-4.1)
- xAI (Grok 4 Fast)
- Google (Gemini 3.1 Pro, 3 Flash, 2.5 Flash, Deep Research Agent)
- Anthropic (Claude Opus 4.6, Sonnet 4.5, Haiku 4.5)
- Location: `deepr/providers/registry.py`
- Purpose: Single source of truth for model capabilities
- Contains:
- Model costs
- Latency estimates
- Context windows
- Specializations (reasoning, speed, cost, etc.)
CRITICAL: When new models are released (GPT-5.3, Grok 5, etc.), update ONLY the registry. Never hardcode model names elsewhere.
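As an illustration of what a registry entry might contain and how callers can query it, here is a sketch. The field names are assumptions, not the actual `registry.py` schema:

```python
# Hypothetical shape of a registry entry; field names are illustrative.
MODEL_CAPABILITIES = {
    "grok-4-fast": {
        "provider": "xai",
        "cost_per_query_usd": 0.01,
        "typical_latency_s": 1,
        "context_window": 128_000,
        "specializations": ["speed", "cost"],
    },
}

def cheapest(models: dict, specialization: str) -> str:
    """Pick the lowest-cost model advertising a given specialization."""
    candidates = {
        name: caps for name, caps in models.items()
        if specialization in caps["specializations"]
    }
    return min(candidates, key=lambda n: candidates[n]["cost_per_query_usd"])
```

Because all callers go through lookups like this, a new model release is a one-line addition to the dictionary rather than a codebase-wide search for hardcoded names.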
- Location: `deepr/queue/`
- Purpose: Manages research job execution
- Supports:
- Local queue (SQLite)
- Azure Queue Storage (production)
- Location: `deepr/storage/`
- Purpose: Stores research results and expert knowledge
- Supports:
- Local filesystem
- Azure Blob Storage (production)
```
User Query
    |
Research Planner (generates plan)
    |
Queue System (schedules jobs)
    |
Research Agent (executes with AI model)
    |
Storage System (saves results)
    |
User receives report
```
```
Create Expert
    |
Curriculum Generator (plans learning topics)
    |
Research Agent (learns each topic)
    |
Vector Store (stores knowledge)
    |
Expert ready to answer questions
```
CRITICAL: All models are defined in deepr/providers/registry.py. This is the SINGLE SOURCE OF TRUTH. When GPT-5.3 or Grok 5 are released, update ONLY the registry. Never hardcode model names.
- GPT-5.2 (OpenAI): $0.25, 2s, best for planning/curriculum
- o3-deep-research (OpenAI): $0.50, 2-5min, comprehensive multi-step research
- o4-mini-deep-research (OpenAI): $2.00, 60s, deep research with extended reasoning
- Grok 4 Fast (xAI): $0.01, 1s, best for quick lookups
- Gemini 3.1 Pro (Google): $0.20, ~40s, 1M context for large docs (2x pricing >200K tokens)
- Gemini Deep Research (Google): $1.00, 5-20min, async research with Google Search
- Gemini 2.5 Flash (Google): $0.002, 1.5s, cheapest option for general queries
- Claude Opus 4.6 (Anthropic): $0.80, 15s, best for complex reasoning with Adaptive Thinking
- Claude Sonnet 4.5 (Anthropic): $0.48, 3s, balanced quality/cost for coding
- Claude Haiku 4.5 (Anthropic): $0.05, 1.5s, fast and cheap for simple queries
Models are selected based on:
- Task complexity: Simple vs complex reasoning
- Budget: Cost constraints
- Speed: Latency requirements
- Context size: Amount of information to process
See deepr/providers/registry.py for full model capabilities.
Configuration is managed through:
- `deepr/config.py`: Main configuration
- `.env`: Environment variables (API keys, etc.)
- `deepr/config/`: Provider-specific configs
- Single Source of Truth: Model registry for all model info
- Provider Abstraction: Unified interface across providers
- Async by Default: All I/O operations are async
- Cost Tracking: Every operation tracks costs
- Stateless: Research jobs can be resumed/retried
```
deepr/
├── api/             # REST API (Flask)
├── cli/             # Command-line interface (Click)
│   └── commands/
│       └── semantic/  # research, artifacts, experts modules
├── config/          # Configuration management
├── core/            # Core business logic
├── experts/         # Expert system (beliefs, memory, learning)
├── formatting/      # Output formatting utilities
├── mcp/             # Model Context Protocol server
├── observability/   # Cost tracking, provider routing, quality metrics
├── providers/       # AI provider integrations
├── queue/           # Job queue system
├── research_agent/  # Research execution
├── routing/         # Auto mode query routing
├── services/        # Business logic services
├── storage/         # Data persistence
├── templates/       # Prompt templates
├── tools/           # Utility tools (web search, etc.)
├── utils/           # General utilities (scraping, etc.)
├── web/             # Web interface
├── webhooks/        # Webhook handlers
└── worker/          # Background job processing
```
To add new capabilities:
- New AI Provider: Implement `BaseProvider` in `deepr/providers/`
- New Model: Add to `MODEL_CAPABILITIES` in `registry.py`
- New Research Mode: Extend the `ResearchMode` enum
- New Storage Backend: Implement the `BaseStorage` interface
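A hypothetical `BaseProvider` subclass, sketching the extension pattern (the real abstract class in `deepr/providers/` may declare different methods and signatures):

```python
from abc import ABC, abstractmethod

class BaseProvider(ABC):
    """Hypothetical provider interface; method names are assumptions."""
    name: str = "base"

    @abstractmethod
    async def complete(self, prompt: str, model: str) -> str:
        """Run one completion against the provider's API."""

class EchoProvider(BaseProvider):
    # Toy implementation showing where a real HTTP client would go.
    name = "echo"

    async def complete(self, prompt: str, model: str) -> str:
        return f"[{model}] {prompt}"
```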
- Caching: Prompt caching can reduce costs by up to 90%
- Parallel Execution: Multiple research jobs run concurrently
- Model Selection: Router picks cheapest model that meets requirements
- Context Management: Automatic context window management
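The parallel-execution point can be sketched with `asyncio.gather`; the job names and the fake delay below are stand-ins for real research jobs:

```python
import asyncio

async def run_job(name: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for a provider API call
    return f"{name}: done"

async def run_batch(names: list) -> list:
    # gather() runs all jobs concurrently instead of sequentially,
    # so total wall time tracks the slowest job, not the sum.
    return await asyncio.gather(*(run_job(n) for n in names))
```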
Deepr handles sensitive data (API keys, research content, expert knowledge) and makes external API calls. This section documents security considerations and mitigations.
- API Keys - Provider credentials (OpenAI, xAI, Google, Anthropic)
- Research Content - User queries and research results
- Expert Knowledge - Synthesized beliefs and documents
- Cost/Budget - Prevent unauthorized spending
| Threat | Risk | Mitigation |
|---|---|---|
| API key exposure | High | Environment variables only, never in code/logs |
| Path traversal | Medium | Input validation, sandboxed file operations |
| Prompt injection | Medium | System prompts are not user-modifiable |
| Cost runaway | Medium | Session budgets, daily limits, circuit breakers |
| Data exfiltration | Low | Local storage by default, no external telemetry |
- Keys loaded from environment variables only
- Never logged, even at DEBUG level
- Not included in error messages
- Validated on startup (fail fast)
```python
# Good
api_key = os.getenv("OPENAI_API_KEY")

# Bad - never do this
api_key = "sk-..."  # Hardcoded
logger.debug(f"Using key: {api_key}")  # Logged
```

All file operations validate paths:
```python
# deepr/storage/local.py
def _validate_path(self, path: Path) -> bool:
    """Ensure path is within allowed directory."""
    resolved = path.resolve()
    return resolved.is_relative_to(self.base_dir)
```

User-provided paths are:
- Resolved to absolute paths
- Checked against allowed directories
- Rejected if they escape the sandbox
User inputs are validated before use:
- Expert names: Alphanumeric + hyphens only
- File paths: Must be within workspace
- Queries: Length limits, no control characters
- Budget values: Positive numbers within limits
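The expert-name rule above can be expressed as a small validator. This is a sketch of the stated policy (alphanumeric plus hyphens), not the actual implementation:

```python
import re

# Stated policy: expert names are alphanumeric plus hyphens only.
EXPERT_NAME_RE = re.compile(r"[A-Za-z0-9-]+")

def valid_expert_name(name: str) -> bool:
    # fullmatch rejects anything with path separators, dots,
    # control characters, or other injection-prone input.
    return bool(EXPERT_NAME_RE.fullmatch(name))
```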
Multiple layers prevent runaway costs. Implementation in deepr/experts/cost_safety.py.
Hard Limits (Cannot Be Overridden):
- Per Operation: $10 maximum
- Per Day: $50 maximum
- Per Month: $500 maximum
Configurable Limits (Defaults):
- Per Operation: $5
- Per Day: $25
- Per Month: $200
Features:
- Session-level cost tracking with alerts at 50%, 80%, 95%
- Circuit breaker for repeated failures (auto-pause after 3 consecutive failures)
- Audit logging of all cost-incurring operations
- Graceful pause/resume for daily/monthly limits
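The layered checks can be sketched as follows, using the default and hard limits listed above; the function name and return values are illustrative, not the `cost_safety.py` API:

```python
# Configurable defaults and non-overridable hard caps from the docs above.
LIMITS = {"operation": 5.0, "day": 25.0, "month": 200.0}
HARD_LIMITS = {"operation": 10.0, "day": 50.0, "month": 500.0}

def check_budget(cost: float, spent_today: float, spent_month: float) -> str:
    # Hard caps are checked first and can never be overridden.
    if cost > HARD_LIMITS["operation"]:
        return "reject"
    if (spent_today + cost > HARD_LIMITS["day"]
            or spent_month + cost > HARD_LIMITS["month"]):
        return "reject"
    # Exceeding a configurable per-operation default asks for confirmation.
    if cost > LIMITS["operation"]:
        return "confirm"
    # Hitting a daily/monthly default pauses gracefully (progress saved).
    if spent_today + cost > LIMITS["day"] or spent_month + cost > LIMITS["month"]:
        return "pause"
    return "allow"
```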
CLI Budget Validation:
- Warns for budgets > $10
- Requires confirmation for budgets > $25
- Shows daily/monthly spending status with the `/status` command in expert chat
Pause/Resume for Long-Running Processes:
When learning or curriculum execution hits limits:
- Progress saved to `data/experts/<name>/knowledge/learning_progress.json`
- Clear message about when to resume
- Resume with `deepr expert resume "<name>"`
```
# If daily limit hit during learning:
PAUSED - Daily/Monthly Limit Reached
Progress: 8 topics completed, 7 remaining
To resume:
  deepr expert resume "Azure Architect"
```

- API endpoints have request rate limits
- Provider calls respect upstream rate limits
- Exponential backoff on rate limit errors
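Exponential backoff on rate-limit errors can be sketched generically; the exception name, delays, and retry count here are illustrative:

```python
import time

class RateLimitError(Exception):
    """Stand-in for a provider's HTTP 429 error."""

def with_backoff(call, retries: int = 3, base_delay: float = 0.01):
    # Retry with doubling delays (1x, 2x, 4x, ...) so repeated
    # rate-limit hits back off instead of hammering the provider.
    for attempt in range(retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == retries:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))
```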
Security-relevant events are logged:
- API key validation (success/failure)
- Cost threshold alerts
- Research job submissions
- Expert creation/deletion
Logs do NOT contain:
- API keys or tokens
- Full research content
- User credentials
- Use environment variables for all secrets
- Set budget limits appropriate for your use case
- Review logs for unusual activity
- Keep dependencies updated for security patches
- Use HTTPS for web interface in production
- No authentication on local web interface (designed for local use)
- No encryption at rest for local storage
- API keys have full provider access (no scoping)
For production deployments, consider:
- Adding authentication layer
- Encrypting sensitive data at rest
- Using provider-specific API key scoping where available
The deepr/observability/ module provides monitoring and cost management:
Cost tracking:
- Per-job cost tracking with provider/model breakdown
- Daily, weekly, monthly cost aggregation
- Budget alerts with configurable thresholds
- Atomic persistence to prevent data corruption
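The write-to-temp-then-rename pattern behind atomic persistence can be sketched as follows; the path and record shape are illustrative:

```python
import json
import os
import tempfile

def save_costs(path: str, record: dict) -> None:
    """Write a cost record so readers never see a half-written file."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(record, f)
        os.replace(tmp, path)  # atomic swap on POSIX and Windows
    except BaseException:
        os.unlink(tmp)  # clean up the temp file on failure
        raise
```

A crash mid-write leaves only a stray temp file; the previous cost file stays intact, which is what prevents corruption of the tracking data.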
Provider routing:
- Autonomous provider selection based on cost, latency, and availability
- Fallback handling when providers fail
- Health scoring with exponential decay
- Metrics tracking (success rate, latency, costs)

Quality metrics:
- Response quality scoring
- Model performance comparison
- Research output evaluation

Tracing:
- Request/response logging
- Debugging support for multi-step workflows

Monitoring:
- Cost tracking per job
- Latency metrics per provider
- Error rates and retry logic
- Usage analytics in web dashboard