added ollama use cases examples #26

Open
adrianpuiu wants to merge 101 commits into kayba-ai:main from adrianpuiu:claude/init-project-setup-01VZqbXKGesDJWTzSb8VNNAn

@adrianpuiu

  1. 🔍 Code Review Agent (code_review_agent.py)
    Reviews code for security vulnerabilities and best practices:

  • SQL injection detection
  • Resource leak identification
  • Exception handling analysis
  • Mutable default arguments
  • XSS vulnerabilities
  • Race conditions

Learns: Team-specific coding patterns and security standards
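
To make the first item concrete, here is a minimal sketch of the kind of pattern a SQL injection review would flag, next to a parameterized alternative. It is not taken from code_review_agent.py; the `users` table and the function names are hypothetical and exist only for illustration.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is interpolated directly into the SQL text.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Safer: the driver binds the value, so it is never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def make_demo_db() -> sqlite3.Connection:
    # Hypothetical in-memory table used only for this demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn
```

With the input `x' OR '1'='1`, the unsafe version matches every row, while the parameterized version treats the whole string as a literal name and matches none.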

  2. 📊 Data Analysis Agent (data_analysis_agent.py)
    Analyzes data and generates actionable insights:

  • Sales trend analysis
  • Anomaly detection
  • User engagement metrics
  • System performance monitoring
  • Customer satisfaction patterns

Learns: What types of insights are valuable for different data domains

  3. 🗄️ SQL Query Generator (sql_query_agent.py)
    Translates natural language to SQL queries:

  • Complex joins and aggregations
  • Subqueries and window functions
  • Query optimization patterns
  • Database-specific syntax

Learns: Schema-specific patterns and business query requirements

  4. 🔧 Troubleshooting Assistant (troubleshooting_agent.py)
    Diagnoses system issues from logs and symptoms:

  • Memory leaks and resource exhaustion
  • Network timeouts and connectivity issues
  • Configuration problems
  • Performance bottlenecks

Learns: Environment-specific issues and resolution patterns

  5. 📝 Technical Writer Agent (technical_writer_agent.py)
    Converts technical content to clear documentation:

  • API documentation
  • README files
  • Changelog entries
  • Configuration guides
  • Tutorial introductions

Learns: Company documentation style and best practices

Development & DevOps (5 agents)
  • 🔍 Code Review Agent - Security vulnerabilities and best practices
  • 🧪 Test Case Generator ⭐ NEW - Comprehensive unit test generation
  • 🗄️ SQL Query Generator - Natural language to SQL
  • 📝 Git Commit Message Generator ⭐ NEW - Conventional commit messages
  • 🔧 Troubleshooting Assistant - System diagnostics

Operations & Support (2 agents)
  • 📧 Email/Ticket Classifier ⭐ NEW - Support ticket automation
  • 🐛 Bug Report Analyzer ⭐ NEW - Issue triage and severity

Data & Analytics (1 agent)
  • 📊 Data Analysis Agent - Data insights and patterns

Security (1 agent)
  • 🔐 Security Log Analyzer ⭐ NEW - Threat detection and response

Documentation (1 agent)
  • 📝 Technical Writer Agent - Code to documentation

📊 What Each New Agent Does

  1. 🧪 Test Case Generator

Learns:

  • Edge cases (empty input, null, boundary values)
  • Pytest patterns (@pytest.mark, fixtures)
  • Mocking strategies (unittest.mock, responses)
  • Team testing conventions

Training: 6 code samples covering various scenarios
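
As a rough picture of the output this agent aims for, the sketch below covers the edge cases listed above (empty input, None, a boundary value) and one mocked dependency. It is an illustration, not output from test_generator_agent.py: `average` and `fetch_and_average` are hypothetical functions, and the tests are plain functions that pytest would collect, written without pytest-specific features so they run with the standard library alone.

```python
from unittest import mock


def average(values):
    """Return the mean of a list of numbers, or 0.0 for an empty list."""
    if values is None:
        raise TypeError("values must not be None")
    if not values:
        return 0.0
    return sum(values) / len(values)


def fetch_and_average(client):
    """Average the numbers returned by a (possibly remote) client."""
    return average(client.get_values())


def test_empty_input_returns_zero():
    assert average([]) == 0.0


def test_none_is_rejected():
    try:
        average(None)
    except TypeError:
        return
    raise AssertionError("expected TypeError for None input")


def test_single_element_boundary():
    assert average([7]) == 7.0


def test_client_is_mocked():
    # The client is replaced by a Mock so no real service is contacted.
    client = mock.Mock()
    client.get_values.return_value = [2, 4]
    assert fetch_and_average(client) == 3.0
    client.get_values.assert_called_once_with()
```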

  2. 📧 Email/Ticket Classifier

Learns:

  • Priority: critical → low
  • Categories: bug, feature, billing, security
  • Department routing rules
  • False alarm detection (spam)

Training: 7 real support tickets

  3. 🐛 Bug Report Analyzer

Learns:

  • Severity: blocker → trivial
  • Component assignment (ui, api, backend)
  • Required information extraction
  • Duplicate detection patterns

Training: 6 bug reports with various severities

  4. 📝 Git Commit Message Generator

Learns:

  • Conventional commit format (feat/fix/docs/etc)
  • Scope detection from diff
  • Breaking change identification
  • 72-character limit compliance

Training: 6 code diffs with commits
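
The format and length rules above can be checked mechanically. The following is a minimal sketch of such a check, assuming the common Conventional Commits type list and a `type(scope)!: subject` header. The function name, the regular expression, and the exact type list are assumptions for illustration and are not taken from commit_message_agent.py.

```python
import re

# Common Conventional Commits types; a project may allow a different set.
COMMIT_TYPES = (
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "build", "ci", "chore", "revert",
)

# type, optional (scope), optional "!" for breaking changes, then ": subject".
HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<subject>\S.*)$"
)

MAX_HEADER_LEN = 72


def check_commit_header(header: str) -> list[str]:
    """Return a list of problems found in a commit header (empty if none)."""
    problems = []
    if len(header) > MAX_HEADER_LEN:
        problems.append(f"header is {len(header)} chars, limit is {MAX_HEADER_LEN}")
    if HEADER_RE.match(header) is None:
        problems.append("header does not match 'type(scope)!: subject'")
    return problems
```

For example, `check_commit_header("feat(api): add retry_prompt parameter")` reports no problems, while `check_commit_header("update stuff")` reports a format problem.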

  5. 🔐 Security Log Analyzer

Learns:

  • Threat types: brute force, SQL injection, data exfil
  • Severity: critical → none
  • False positive reduction
  • Attack pattern signatures

Training: 7 security log scenarios

🎯 Quick Start

Run an individual agent:

uv run python examples/ollama/test_generator_agent.py
uv run python examples/ollama/email_classifier_agent.py
uv run python examples/ollama/bug_report_agent.py
uv run python examples/ollama/commit_message_agent.py
uv run python examples/ollama/security_log_agent.py

Or run all 10 agents sequentially (20-30 minutes):

uv run python examples/ollama/run_all_demos.py

User and others added 30 commits November 5, 2025 03:14
Critical fixes:
- Add configurable retry_prompt parameter to Reflector class (English default)
- Add configurable retry_prompt parameter to Curator class (English default)
- Replace hardcoded Chinese retry prompts with configurable system
- All three roles (Generator, Reflector, Curator) now consistent
- Update .gitignore to exclude checkpoint and evaluation result JSON files

This completes the refactoring started in commit 087e2ed where we fixed
Generator's Chinese prompt. Now all three ACE roles use the same pattern.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Update 4 example files to use prompts_v2_1 instead of deprecated prompts_v2:
- examples/helicone/convex_training.py
- examples/advanced_prompts_v2.py
- examples/helicone/offline_training_replay.py
- examples/browser-use/ace_domain_checker.py

Note: compare_v1_v2_prompts.py and compare_v2_v2_1_prompts.py intentionally
keep prompts_v2 imports since they explicitly compare prompt versions.

Examples now demonstrate current best practices.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add comprehensive documentation for recent improvements:
- Configurable retry_prompt parameter (Generator, Reflector, Curator)
- Checkpoint saving during training (checkpoint_interval, checkpoint_dir)
- Prompt version guidance (v1.0 simple, v2.0 deprecated, v2.1 recommended)
- Feature detection utilities (ace/features.py)
- Updated test coverage section (mention integration tests)

Also update module structure to reflect:
- prompts_v2.py marked as DEPRECATED
- prompts_v2_1.py marked as RECOMMENDED
- New features.py module

CLAUDE.md now serves as complete developer reference.
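
For readers unfamiliar with the checkpoint options named above, a generic sketch of periodic checkpointing follows. Only the parameter names `checkpoint_interval` and `checkpoint_dir` come from this commit message; the function, the file naming scheme, and the placeholder learning step are assumptions and do not reflect the actual ACE training loop.

```python
import json
from pathlib import Path
from typing import Iterable, List


def train_with_checkpoints(
    samples: Iterable[str],
    checkpoint_interval: int,
    checkpoint_dir: Path,
) -> List[Path]:
    """Process samples and write a JSON checkpoint every N samples."""
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    learned: List[str] = []
    written: List[Path] = []
    for step, sample in enumerate(samples, start=1):
        # Placeholder for a real learning step.
        learned.append(sample.upper())
        if step % checkpoint_interval == 0:
            # Hypothetical naming scheme: checkpoint_<step>.json
            path = checkpoint_dir / f"checkpoint_{step}.json"
            path.write_text(json.dumps({"step": step, "learned": learned}))
            written.append(path)
    return written
```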

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Fix Python version requirement in SETUP_GUIDE.md (3.9 → 3.11)
- Fix async test decorator in test_litellm_client.py
- Export DataLoader from benchmarks/loaders for API consistency
- Update examples to use recommended prompts_v2_1 instead of deprecated prompts_v2
- Remove unnecessary sys.path manipulation from all example files

All 79 tests passing. Resolves critical documentation inconsistencies
and improves code quality across examples and test suite.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Simplified README quickstart from 55 lines to ~35 lines
- Added built-in SimpleEnvironment class for easy getting started
- Removed need for custom environment class in quickstart
- Made the quickstart more progressive: basic usage → learning
- Added SimpleEnvironment to ace exports
- Added links to full examples for users who want more

The new quickstart is much more approachable for beginners while
still showing the core value of ACE (learning from examples).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Updates to core ACE components:
- Enhanced delta operations and playbook functionality
- Improved prompts v2.1 with better role implementations
- Updated browser automation examples for domain checking

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
feat: Enhance browser-use demos and core ACE framework
- Added new demo section showcasing ACE vs baseline browser automation
- Includes performance metrics: 30% → 100% success rate, 38.8 → 6.9 avg steps
- Added demo results image with detailed comparison data
- Shows ACE's autonomous learning and optimization capabilities

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Fixes JSON serialization error when Sample objects are passed via kwargs
to LLM completion calls. The 'sample' parameter is used by ReplayGenerator
but cannot be serialized when LiteLLM attempts to log it to Opik tracing.

Changes:
- Generator._generate_impl(): Filter 'sample' from kwargs before llm.complete()
- Reflector._reflect_impl(): Filter 'sample' from kwargs before llm.complete()
- Curator.curate(): Filter 'sample' from kwargs before llm.complete()

This preserves ReplayGenerator functionality while preventing serialization
errors when Opik observability is enabled.

Based on LiteLLM best practices for handling custom metadata in kwargs.
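
The filtering described above can be sketched as follows. This is an illustration of the idea rather than the code in roles.py: `filter_llm_kwargs`, `INTERNAL_KWARGS`, and the stand-in `Sample` class are hypothetical names.

```python
import json
from typing import Any, Dict

# Keys used internally that should not be forwarded to the LLM client.
INTERNAL_KWARGS = frozenset({"sample"})


def filter_llm_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs without internal, non-serializable entries."""
    return {k: v for k, v in kwargs.items() if k not in INTERNAL_KWARGS}


class Sample:
    """Stand-in for a non-JSON-serializable sample object (not the ACE class)."""

    def __init__(self, question: str) -> None:
        self.question = question
```

Because the function returns a copy, the caller's original kwargs still contain `sample`, so a replay component can keep reading it while only the filtered copy is passed on to the completion call.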

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Allows manual triggering of tests via GitHub UI or CLI
- Helps diagnose why automatic workflow triggers stopped after Oct 22
- Updates workflow registration with GitHub Actions
- Add proper type casts and annotations in delta.py
- Fix missing Any import in adaptation.py
- Add proper Optional type handling for playbook.py
- Fix None return types in prompts_v2.py and prompts_v2_1.py
- Fix optional dependency type annotations across all modules
- Add TYPE_CHECKING guards for conditional imports
- Fix decorator signature inconsistencies in roles.py
- Resolve dict.get() type issues
- Fix Router type assignments in litellm_client.py

All 46 mypy errors have been addressed.
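
A TYPE_CHECKING guard, as mentioned in the list above, makes an import visible to the type checker without executing it at runtime. The sketch below uses `fractions.Fraction` purely as a stand-in for an optional dependency; the `describe` function is hypothetical and unrelated to the ACE modules.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Evaluated only by type checkers such as mypy, never at runtime.
    from fractions import Fraction  # stand-in for an optional dependency


def describe(value: "Optional[Fraction]") -> str:
    """Format a value whose type is only imported for type checking."""
    # The annotation is a string, so Fraction need not exist at runtime.
    if value is None:
        return "no value"
    return f"{value.numerator}/{value.denominator}"
```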
- Fix no-redef errors by declaring type annotations before assignments
- Add missing List import in prompts_v2_1.py
- Fix Dict[str, Any] type annotation for comparisons dict
- Add proper cast for int() in playbook.py
- Add Optional[Any] type annotation for router in litellm_client.py
- Use type: ignore[assignment] for conditional type assignments

All mypy errors should now be resolved.
- Check if OpikLogger is None before calling constructor
- Add type: ignore[misc] for the instantiation
- Ensures mypy passes with all optional dependency scenarios

mypy now reports: Success: no issues found in 16 source files
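
The None check described above follows a common optional-dependency pattern: resolve the class if the package is installed, otherwise fall back to None and skip construction. A minimal sketch, with hypothetical helper names (`load_optional`, `make_logger`) that are not part of the ACE codebase:

```python
import importlib
from typing import Any, Optional


def load_optional(module_name: str, attr: str) -> Optional[Any]:
    """Return module.attr if the module can be imported, else None."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None)


def make_logger(logger_cls: Optional[Any]) -> Optional[Any]:
    """Instantiate the logger class only when the dependency is available."""
    if logger_cls is None:
        return None
    return logger_cls()
```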
Set up automated code quality checks with pre-commit:
- Added pre-commit dependency to dev requirements
- Created .pre-commit-config.yaml with Black (formatter) and Mypy (type checker)
- Added Black and Mypy configuration to pyproject.toml
- Formatted all Python files with Black (42 files reformatted)

Pre-commit hooks now automatically:
- Format code with Black on every commit
- Type-check with Mypy (checking ace/ directory only)

This ensures consistent code style and catches type errors before they reach CI/CD.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Rename common.py → shared.py with enhanced docs
- Rename utils.py → debug.py for clarity
- Create form-filler/form_utils.py for consistency
- Update all imports across examples
- Add template function documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add workflow diagram to README showing ACE data flow
- Simplify folder structure documentation
- Enhance TEMPLATE.py with better error handling and output capture
- Fix method name in ace_form_filler.py (to_file → save_to_file)
- Reduce test domains to 2 for faster testing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- baseline-grocery-price-comparison.py: Full 3-store comparison (Migros, Coop, Aldi)
- test-baseline-grocery-price-comparison.py: Single-store test version (Migros only)

Features:
- Automated grocery shopping for 5 essential items across Swiss stores
- Price comparison with basket totals and item details
- Performance metrics tracking (steps, browser-use tokens)
- Regex parsing for structured agent output
- Console-only output following domain-checker demo pattern
- Anthropic Claude 4.5 model integration

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Changes:
- Pass only raw browser-use logs to reflector (no analysis/metrics)
- Clean up execution log collection (remove commentary)
- Increase max_tokens to 8192 for all ACE roles (Generator, Reflector, Curator)
- Fix AttributeError: bullet.helpful_count → bullet.helpful
- Prevents JSON truncation errors with large browser automation logs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Clean up online shopping demo by removing obsolete example files.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Remove old migros-specific demo files
- Add new consolidated ace-online-shopping.py and baseline-online-shopping.py demos
- Include results screenshot showing performance comparison

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
User and others added 29 commits November 19, 2025 18:51
- ACEBrowserUse → ACEAgent (actual class name)
- SimpleAgent → ACELiteLLM (actual class name)
- Add Out-of-Box Integrations section to README
- Remove broken OUT_OF_BOX_INTEGRATIONS.md link

Files updated:
- docs/INTEGRATION_GUIDE.md (2 fixes)
- docs/INTEGRATION_PATTERNS.md (1 fix + broken link)
- ace/integrations/base.py (2 fixes)
- ace/integrations/litellm.py (2 fixes)
- README.md (new section showcasing all 3 integrations)
Update all documentation to use "ACEAgent (browser-use)" naming for clarity.
This makes the purpose immediately clear alongside ACELiteLLM and ACELangChain.

Changes:
- README.md: Updated out-of-box integrations section
- INTEGRATION_GUIDE.md: Updated decision tree and references
- INTEGRATION_PATTERNS.md: Updated see also section
- ace/integrations/base.py: Updated module docstring
- ace/integrations/litellm.py: Updated class docstring references

No code changes, only documentation clarifications.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add navigation READMEs to improve example discoverability:
- examples/README.md: Central index of all examples
- examples/starter-templates/README.md: Copy-paste templates guide
- examples/prompts/README.md: Prompt comparison guide

Update documentation to link to examples:
- README.md: Add specific example links in Documentation section
- INTEGRATION_GUIDE.md: Add Runnable Examples section

This solves the discoverability problem - users can now easily find
the right example to adapt for their use case.

Changes:
- New: examples/README.md (navigation hub)
- New: examples/starter-templates/README.md
- New: examples/prompts/README.md
- Updated: README.md (Documentation section)
- Updated: docs/INTEGRATION_GUIDE.md (added Examples section)

Total: 3 new files, 2 updated files (~180 lines total)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Update seahorse_emoji_ace.py to use ACELiteLLM integration
- Remove deprecated starter templates (langchain, ollama)
- Update quickstart_litellm.py with modern ACELiteLLM approach
- Add new litellm/ and ollama/ example directories
- Align examples with v0.4+ integration patterns

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Delete 5 internal planning files (~2,500 lines)
  - TODO.md, DEMO_TODO.md, ACE_ROADMAP.md
  - FINAL_ACE_SUMMARY.md, ACE_V2_1_IMPROVEMENTS.md
  - Agents.md (duplicate of QUICK_START)

- Simplify core documentation (-549 lines net)
  - COMPLETE_GUIDE_TO_ACE: 360→196 lines (-46%)
  - QUICK_START: 296→207 lines (-30%)
  - SETUP_GUIDE: 464→246 lines (-47%)
  - TESTING_GUIDE: 629→413 lines (-34%)

- Merge INTEGRATION_PATTERNS into INTEGRATION_GUIDE
  - Add 8 detailed integration patterns
  - Remove duplicate section
  - Update cross-references

- Enhance API documentation
  - Add integrations section (ACELiteLLM, ACEAgent, ACELangChain)
  - Update prompts_v2 → prompts_v2_1 references
  - Mark v2.0 prompts as deprecated
  - Add version comparison table

- Fix broken cross-references
  - Update 3 links to deleted files
  - Verify all example file references

Total: -626 lines, cleaner structure, up-to-date content
- Delete RELEASE_NOTES.md (outdated, duplicates CHANGELOG)
- Delete generated artifacts (hn_expert.json, custom_agent_learned.json, ace_example_output.log)
- Delete research folders (comparison_analysis/, other_ace_repos/)
- Update .gitignore: add *.log pattern to prevent future commits
- Add ace/integrations/ module documentation (key pattern)
- Clarify dual architecture: Full ACE vs Integration Pattern
- Add TOON format context (16-62% token savings)
- Document pre-commit hooks (Black + Mypy auto-run)
- Add concrete benchmark command examples
- Streamline prompt version guidance (v1.0 vs v2.1)
- Remove deprecated v2.0 references
- Delete empty research folders (comparative_analysis, real_ace_analysis, true_ace_comparison)
- Delete example artifact files (learned playbooks, checkpoints)
- Delete internal dev guide (rework-demo-guide.md)
- Clean build artifacts (.pyc, __pycache__, .DS_Store)

All changes properly documented in CHANGELOG.md [Unreleased] section.
- ACELiteLLM and ACELangChain integrations
- Integration exports from ace package root
- Documentation cleanup and examples reorganization
- SimpleAgent renamed to ACELiteLLM
- Update Quick Start to use ACELiteLLM (17 lines vs 52)
- Remove internal cleanup items from CHANGELOG
- Focus on user-facing features only
Matches common LangChain prompt template patterns
- Remove remaining README.md from starter-templates directory
- Remove quickstart_litellm.py (replaced by litellm/ examples)
- Complete example restructuring aligned with v0.5+ patterns

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Update image reference to point to correct location in domain-checker folder.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Restore LM Studio starter template with updated configuration
- Add comprehensive README with setup and troubleshooting guide
- Include LM Studio integration examples for ACE framework

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add comprehensive, production-ready agent examples using local Ollama models:

1. Code Review Agent - Reviews code for bugs, security issues, and best practices
   - SQL injection, resource leaks, exception handling
   - Learns team-specific coding patterns

2. Data Analysis Agent - Analyzes data and generates actionable insights
   - Sales trends, anomaly detection, business metrics
   - Learns domain-specific analysis patterns

3. SQL Query Generator - Natural language to SQL translation
   - Complex joins, aggregations, subqueries
   - Learns database-specific query patterns

4. Troubleshooting Assistant - Diagnoses system issues
   - Memory leaks, performance issues, network problems
   - Learns environment-specific issues

5. Technical Writer Agent - Converts code to documentation
   - API docs, README files, changelogs
   - Learns company documentation style

Features:
- Each agent includes 6+ training samples
- Custom TaskEnvironment for domain-specific evaluation
- Before/after learning comparisons
- Real-world test cases
- Persistent playbooks for knowledge reuse
- Comprehensive README with setup and best practices

All examples use ACE learning to improve over time, demonstrating:
- Offline training with evaluation
- Learned strategy persistence
- Model recommendations (qwen2.5:7b, llama3.1:8b)
- Production-ready error handling

Run all demos: uv run python examples/ollama/run_all_demos.py
Add 5 additional production-ready agent examples using Ollama:

**New Agents:**

1. Test Case Generator (test_generator_agent.py) - Generates unit tests
   - Edge case detection
   - Pytest patterns
   - Mocking strategies
   - Learns team testing conventions

2. Email/Ticket Classifier (email_classifier_agent.py) - Support automation
   - Priority classification
   - Department routing
   - Intent recognition
   - Learns routing rules

3. Bug Report Analyzer (bug_report_agent.py) - Issue triage
   - Severity classification
   - Component assignment
   - Duplicate detection
   - Required information extraction

4. Git Commit Message Generator (commit_message_agent.py) - Conventional commits
   - Semantic versioning
   - Scope detection
   - Breaking change identification
   - Learns project conventions

5. Security Log Analyzer (security_log_agent.py) - Threat detection
   - Attack pattern recognition
   - False positive reduction
   - Incident severity
   - Response procedures

**Updates:**
- README now organized by category (Dev/Ops/Data/Security/Docs)
- run_all_demos.py updated to run all 10 agents
- All agents include 6-7 training samples
- Custom evaluation environments for each domain
- Real-world test cases demonstrating practical usage

**Total Agent Collection:**
- 10 production-ready agents across 5 domains
- Comprehensive examples for different industries
- ACE learning patterns for various use cases
- Complete with persistent playbooks

Run all: uv run python examples/ollama/run_all_demos.py
3 participants