A Python-based autonomous agent system that decomposes complex tasks into subtasks, executes them through mock agents, and provides real-time monitoring via a React dashboard.
- ✅ **Plans**: Decomposes complex tasks into executable subtasks via `PlanningAgent` in `agents/planning_agent.py:31`
- ✅ **Remembers**: Persists state across runs using a SQLite backend in `state/persistence.py:16` and in-memory context in `state/state_manager.py:22`
- ✅ **Recovers**: Exponential backoff retry logic in `services/retry_handler.py:27` with configurable global retry limits
```mermaid
flowchart LR
    UI[React Frontend] --> API[FastAPI Server/run_server.py]
    API --> ORCH[TaskOrchestrator/orchestrator.py]
    ORCH --> PLAN[PlanningAgent/agents/planning_agent.py]
    ORCH --> EXEC[ExecutionAgent/agents/execution_agent.py]
    EXEC --> MOCK[MockExecutionAgent/agents/mock_execution_agent.py]
    ORCH --> VERIFY[VerifierAgent/agents/verifier_agent.py]
    ORCH <---> STATE[(StateManager/state/state_manager.py)]
    STATE <---> PERSIST[(SQLite DB/state/persistence.py)]
    ORCH --> QUEUE[(SubtaskQueue/state/subtask_queue.py)]
    ORCH --> RETRY[RetryHandler/services/retry_handler.py]
    API --> SSE[SSE Broadcaster/services/sse_broadcaster.py]
```
**Key Components:**
- **State Manager**: SQLite persistence (`mrasr_state.db`) + in-memory context with session tracking
- **Retry Handler**: Exponential backoff (base 1.0s, max 300s) with ±10% jitter and global retry limits
- **Mock Agents**: Demo-ready execution/verification agents using mock responses or the Ollama LLM
- **Task Orchestrator**: Coordinates the execution pipeline with state updates and recovery
```bash
# Start everything with Docker (recommended)
# On Windows: use Command Prompt or PowerShell, not Git Bash
make dev                  # Linux/macOS
# OR, for all platforms:
python safe_startup.py

# Start with local processes (if Docker is unavailable)
python safe_startup.py --local
```
That's it! The script will:
- ✅ Check port availability (3000, 8000)
- ✅ Start Docker containers or local processes
- ✅ Wait for services to be ready
- ✅ Open a browser to http://localhost:3000

Verify your setup with `python scripts/verify_setup.py`, which checks all dependencies and configuration.
**Option A: Docker (Recommended)**
- Docker Desktop or Docker Engine
- Docker Compose

**Option B: Local Development**
- Python 3.9+
- Node.js 16+
- Optional: Ollama for LLM features (defaults to mock mode)
```bash
# Docker approach (Linux/macOS, or with make installed)
make up          # Start services
make down        # Stop services
make logs        # View logs
make restart     # Restart all

# Docker approach (Windows without make)
docker compose up -d       # Start services
docker compose down        # Stop services
docker compose logs -f     # View logs

# Local approach
python -m venv venv
venv\Scripts\activate        # Windows
source venv/bin/activate     # macOS/Linux
pip install -r requirements.txt
cd frontend && npm install
```
- Backend health: `GET /health` returns `{"status":"healthy","service":"mrasr-api"}`
- Frontend: the Queue Status panel should show a connection indicator (green = SSE active, red = polling)
- Submit Task: Enter title/description in the Task Submission panel
- View Plan: Generated subtasks appear in Subtask Progress panel
- Monitor Progress: Queue Status tiles show pending/in-progress/completed/failed counts
- Manual Execution: Use "Run Next Subtask" button to process tasks step-by-step
- Recovery Demo: "Start Recovery Demo" creates a task that intentionally fails and retries
- Download Results: "Download Result" button generates comprehensive execution artifacts
```bash
# Create a task
curl -X POST http://localhost:8000/api/v1/submit_task \
  -H "Content-Type: application/json" \
  -d '{"title":"Research X and produce summary","description":"Multi-step research task"}'

# Get task status + plan + steps
curl http://localhost:8000/api/v1/task_status/<TASK_ID>

# Get queue summary
curl http://localhost:8000/api/v1/queue/summary

# Stream real-time updates
curl http://localhost:8000/api/v1/stream/queue
```
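The same calls can be scripted from Python with only the standard library. A minimal sketch, with endpoint paths and JSON body taken from the curl examples above (the helper names are illustrative):

```python
import json
import urllib.request

BASE = "http://localhost:8000/api/v1"

def build_submit_request(title: str, description: str) -> urllib.request.Request:
    """Build the POST request for /submit_task, mirroring the curl call above."""
    body = json.dumps({"title": title, "description": description}).encode()
    return urllib.request.Request(
        f"{BASE}/submit_task",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def submit_task(title: str, description: str) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_submit_request(title, description)) as resp:
        return json.load(resp)
```

`submit_task("Research X", "Multi-step research task")` should return the same JSON the curl example produces, assuming the backend is running on port 8000.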
**Artifacts/Output**: Task execution logs are stored in SQLite (`mrasr_state.db`). State persists across server restarts.

**State location**: the SQLite database file `mrasr_state.db` in the project root; no retention policy is configured.
✅ **STATUS**: Core functionality verified; partial adaptation support
- **Plans**: Task decomposition working ✅
- **Remembers**: SQLite state persistence working ✅
- **Recovers**: Exponential backoff retry logic working ✅
- **Adapts**: Basic pause/resume working ⚠️ (dynamic re-planning not implemented)
```bash
# Submit a multi-part task
curl -X POST http://localhost:8000/api/v1/submit_task \
  -H "Content-Type: application/json" \
  -d '{"title":"Multi-step Research","description":"Research topic X, analyze results, create summary report"}'
# Expected: Response shows a subtasks array with individual steps
# Check: GET /api/v1/task_status/<task_id> shows an execution_order array
```

```bash
# Run a task, stop the server with Ctrl+C, restart with python run_server.py
# Expected: Queue status maintains previous task states
# Check: Queue summary shows non-zero counts from the previous session
```

```bash
# Pause a task mid-execution
curl -X POST http://localhost:8000/api/v1/tasks/<TASK_ID>/pause
# Resume the task
curl -X POST http://localhost:8000/api/v1/tasks/<TASK_ID>/resume
# Expected: Task subtasks change status to BLOCKED/PENDING
# Note: Dynamic re-planning during execution is not implemented
```

```bash
# Start the recovery demo
curl -X POST http://localhost:8000/api/v1/demo/recovery
# Use "Run Next Subtask" repeatedly - step 2 will fail once then succeed
# Expected: Retry count increments, exponential backoff delay applied
# Check: Queue status shows failed -> pending status transitions
```
- **404s on the frontend**: Wrong API base URL in `frontend/src/api.js:3`; it should be `http://localhost:8000/api/v1`.
- **Queue Status red/polling**: API server down or CORS issue. Check the backend logs and `curl http://localhost:8000/health`.
- **"Failed to refresh data" toast**: Frontend polling failed. Check the browser console for network errors.
- **CORS errors in the browser**: The backend allows `http://localhost:3000` by default. For custom frontend ports, update the CORS origins in `run_server.py:17`, then restart the backend.
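For orientation, FastAPI's CORS setup in `run_server.py` presumably looks something like the sketch below; the variable names and origin list here are assumptions, so check the file itself before editing.

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Add any custom frontend ports (e.g. 3001) to this list.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000"],
    allow_methods=["*"],
    allow_headers=["*"],
)
```

After changing the origins list, the backend must be restarted for the new CORS policy to take effect.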
**Port 8000 (backend) already in use:**
```bash
# Windows
netstat -ano | findstr :8000
taskkill /PID <PID> /F

# Linux/macOS
lsof -ti:8000 | xargs kill -9

# Or change the port in run_server.py:57
```
**Port 3000 (frontend) already in use:**
```bash
# Set a custom port
set PORT=3001 && npm start   # Windows
PORT=3001 npm start          # Linux/macOS

# Or edit the frontend/package.json scripts
```
**Unicode/emoji errors in the startup script:**
```bash
# Use Command Prompt or PowerShell, not Git Bash
# Or set the encoding:
set PYTHONIOENCODING=utf-8
python safe_startup.py --local
```
**Path issues with the virtual environment:**
```bash
# Use full paths on Windows if activation fails
C:\path\to\venv\Scripts\activate

# Or call Python directly:
C:\path\to\venv\Scripts\python.exe run_server.py
```
**Docker on Windows:**
- Ensure Docker Desktop is running
- Enable the WSL2 backend if available
- Run PowerShell as Administrator if you hit permission issues
**Real-time updates not working:**
- Check whether the browser blocks SSE connections
- Corporate firewalls may block streaming endpoints
- Fallback: the frontend uses polling mode (red indicator)
- Test: `curl http://localhost:8000/api/v1/stream/queue`
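The streaming endpoint speaks the standard `text/event-stream` format, so its raw output can be inspected with a few lines of Python. This is a generic SSE field parser for debugging, not MRASR code, and it assumes nothing about the event names the broadcaster actually emits.

```python
def parse_sse(chunk: str) -> list[dict]:
    """Parse raw text/event-stream text into {event, data} records.

    Events are separated by blank lines; each non-blank line is a
    "field: value" pair per the SSE specification.
    """
    events: list[dict] = []
    current: dict = {}
    for line in chunk.splitlines():
        if not line:                      # blank line ends an event
            if current:
                events.append(current)
                current = {}
        elif line.startswith("event:"):
            current["event"] = line[len("event:"):].strip()
        elif line.startswith("data:"):
            current["data"] = current.get("data", "") + line[len("data:"):].strip()
    if current:                           # trailing event without final blank line
        events.append(current)
    return events
```

Piping a captured chunk of the stream through `parse_sse` makes it easy to see whether the backend is emitting events at all, independent of browser or firewall behavior.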
**Missing dependencies:**
```bash
# Python dependencies
pip install -r requirements.txt

# Frontend dependencies
cd frontend && npm install

# Virtual environment not activated?
venv\Scripts\activate        # Windows
source venv/bin/activate     # Linux/macOS
```
**Ollama connectivity:**
- Test with `python -c "import ollama; print('OK')"`
- The system gracefully falls back to mock mode if Ollama is unavailable
- Set `USE_MOCK_LLM=true` in `.env` to force mock mode
| Variable | Required | Default | Purpose |
|---|---|---|---|
| `MODEL_NAME` | No | `phi3:latest` | Ollama model for LLM tasks |
| `OLLAMA_BASE_URL` | No | `http://localhost:11434` | Ollama server URL |
| `USE_MOCK_LLM` | No | `false` | Force mock responses |
| `MAX_GLOBAL_RETRIES` | No | `10` | Global retry limit per subtask |
| `RETRY_BASE_DELAY` | No | `1.0` | Base retry delay in seconds |
| `RETRY_MAX_DELAY` | No | `300.0` | Maximum retry delay in seconds |
| `LLM_TIMEOUT_SECONDS` | No | `30` | LLM request timeout |
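Settings like these are typically read with `os.getenv` and a fallback matching the documented default. A minimal sketch of that pattern; the `env_float` helper is illustrative, not taken from the codebase:

```python
import os

def env_float(name: str, default: float) -> float:
    """Read a float setting from the environment, falling back to a default."""
    raw = os.getenv(name)
    return float(raw) if raw is not None else default

# Defaults match the table above.
RETRY_BASE_DELAY = env_float("RETRY_BASE_DELAY", 1.0)
RETRY_MAX_DELAY = env_float("RETRY_MAX_DELAY", 300.0)
USE_MOCK_LLM = os.getenv("USE_MOCK_LLM", "false").lower() == "true"
```

Values set in `.env` (loaded at startup) or exported in the shell override the defaults; anything unset falls back to the table's default column.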
| Endpoint | Method | Body | Response | Notes |
|---|---|---|---|---|
| `/health` | GET | - | `{"status":"healthy"}` | Health check |
| `/api/v1/submit_task` | POST | `{"title":"X","description":"Y"}` | Task ID + subtasks | Creates and queues a task |
| `/api/v1/task_status/{id}` | GET | - | Task status + subtasks | Full task details |
| `/api/v1/queue/summary` | GET | - | `{"pending":N,"in_progress":N,...}` | Queue counts |
| `/api/v1/run_next_subtask` | POST | - | Execution result | Manual task processing |
| `/api/v1/tasks/{id}/pause` | POST | - | Success message | Pause a task |
| `/api/v1/tasks/{id}/resume` | POST | - | Success message | Resume a task |
| `/api/v1/tasks/{id}/cancel` | POST | - | Success message | Cancel a task |
| `/api/v1/demo/recovery` | POST | - | Demo task ID | Recovery demo |
| `/api/v1/stream/queue` | GET | - | SSE event stream | Real-time updates |
| `/api/v1/tasks/{id}/generate_artifact` | POST | - | Artifact metadata | Generate an execution report |
| `/api/v1/artifacts/{id}` | GET | - | JSON file download | Download a task artifact |
| `/api/v1/artifacts` | GET | - | Artifacts list | List all available artifacts |
MRASR automatically generates comprehensive execution reports when tasks complete. These JSON artifacts contain:
- Task Overview: Title, description, overall status, completion metrics
- Execution Summary: Success rates, retry statistics, timing analysis
- Subtask Details: Individual execution results, attempt history, timestamps
- Retry Analysis: Failure patterns, backoff progression, recovery metrics
- Performance Metrics: Throughput, reliability scores, recovery times
- System Information: Session data, version info, generation timestamp
```json
{
  "artifact_metadata": {
    "task_id": "550e8400-e29b-41d4-a716-446655440000",
    "generated_at": "2025-08-21T15:30:45.123Z",
    "generator": "MRASR Artifacts Service v1.0"
  },
  "execution_summary": {
    "total_subtasks": 4,
    "success_rate": 100.0,
    "total_retries": 2,
    "total_execution_time": 145.67
  },
  "retry_analysis": {
    "total_retrying_subtasks": 1,
    "retry_distribution": { "2": 1 },
    "successful_recoveries": 1
  }
}
```
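Since artifacts are plain JSON, they are easy to post-process. A small illustrative consumer keyed to the sample structure above; the `summarize_artifact` helper is hypothetical, not part of MRASR:

```python
import json

def summarize_artifact(raw: str) -> str:
    """Turn a downloaded artifact's JSON text into a one-line summary."""
    data = json.loads(raw)
    s = data["execution_summary"]
    return (f"{s['total_subtasks']} subtasks, "
            f"{s['success_rate']:.0f}% success, "
            f"{s['total_retries']} retries")
```

Run against the sample artifact above, this yields `4 subtasks, 100% success, 2 retries`.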
**Download**: click the "Download Result" button when a task completes, or use the API endpoint `/api/v1/artifacts/{task_id}`.
60-Second Demonstration: Watch deterministic failures, exponential backoff, and automatic recovery in action.
1. Start services with `make dev` and open http://localhost:3000
2. Click "Start Recovery Demo" to create a flaky task with a predetermined failure
3. Execute subtasks sequentially: the first succeeds; the second fails twice, then recovers
4. Watch the retry behavior: exponential backoff (1s → 2s) with a live countdown
5. See the successful recovery: the third attempt succeeds and generates an execution artifact
6. Download the results: a comprehensive JSON report with retry analytics
- Deterministic failures: Second subtask always fails exactly twice
- Exponential backoff: Delays increase from 1s to 2s to 4s
- Real-time feedback: Live countdown timers and attempt tracking
- State persistence: All execution data saved to SQLite
- Comprehensive artifacts: Downloadable execution reports
**Tests**: `python -m pytest tests/` (unit tests), `python test_mrasr_complete.py` (integration)

**Debug logging**: set `DEBUG_MODE=true` in `.env`
**Repo layout:**
- `agents/` - Task execution agents (planning, execution, verification, recovery)
- `api/` - FastAPI router and endpoints
- `services/` - Background processing, SSE, retry logic
- `state/` - State management and SQLite persistence
- `schemas/` - Pydantic models and core types
- `frontend/src/` - React dashboard components
Built by David Shableski - Senior CS/Math major focused on agentic workflows and LLM evaluation
MIT License (see LICENSE file)