
Commit 46a64b8

committed
add MinionS protocol example through Docker Compose with cost-efficient local-remote collaboration
1 parent 9ca7d6f commit 46a64b8

File tree

2 files changed: +201 -0 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ The demos support using OpenAI models instead of running models locally with Doc
  | [Spring AI](https://spring.io/projects/spring-ai) Brave Search | Single Agent | none | duckduckgo | [./spring-ai](./spring-ai) | [compose.yaml](./spring-ai/compose.yaml) |
  | [ADK](https://github.com/google/adk-python) Sock Store Agent | Multi-Agent | qwen3 | MongoDB, Brave, Curl | [./adk-sock-shop](./adk-sock-shop/) | [compose.yaml](./adk-sock-shop/compose.yaml) |
  | [Langchaingo](https://github.com/tmc/langchaingo) DuckDuckGo Search | Single Agent | gemma3 | duckduckgo | [./langchaingo](./langchaingo) | [compose.yaml](./langchaingo/compose.yaml) |
+ | [MinionS](https://github.com/HazyResearch/minions) Cost-Efficient Local-Remote Collaboration | Local-Remote Protocol | qwen3 (local), gpt-4o (remote) | | [./minions](./minions) | [docker-compose.minions.yml](https://github.com/HazyResearch/minions/blob/main/apps/minions-docker/docker-compose.minions.yml) |

  ## License

minions/README.md

Lines changed: 200 additions & 0 deletions
@@ -0,0 +1,200 @@
# 🧠 MinionS Protocol - Cost-Efficient Local-Remote LLM Collaboration

This example demonstrates the **MinionS protocol**, a groundbreaking approach for cost-efficient collaboration between small on-device models and large cloud models. Based on research from Stanford's Hazy Research lab, MinionS achieves a **5.7× cost reduction** while maintaining **97.9% of cloud model performance**.

> [!Tip]
> **Real Cost Savings**: In practice, tasks that consume ~30,000 tokens with remote-only processing use only ~7,500-15,000 tokens with MinionS - that's a **50-75% cost reduction**!

<p>
  <img src="https://github.com/HazyResearch/minions/raw/main/assets/Ollama_minionS_background.png"
       alt="MinionS Protocol Overview"
       width="600"
       style="border: 1px solid #ccc; border-radius: 8px;" />
</p>
## 🚀 Getting Started

### Requirements

+ **[Docker Desktop] 4.43.0+ or [Docker Engine]** installed.
+ **A laptop or workstation with a GPU** (e.g., a MacBook) for running open models locally. If you don't have a GPU, you can alternatively use **[Docker Offload]**.
+ If you're using [Docker Engine] on Linux or [Docker Desktop] on Windows, ensure that the [Docker Model Runner requirements] are met (specifically that GPU support is enabled) and the necessary drivers are installed.
+ If you're using Docker Engine on Linux, ensure you have [Docker Compose] 2.38.1 or later installed.
+ An [OpenAI API Key](https://platform.openai.com/api-keys) 🔑.
### Quick Start

1. **Clone the official MinionS repository and navigate to the Docker setup:**

   ```bash
   git clone https://github.com/HazyResearch/minions.git
   cd minions/apps/minions-docker
   ```

2. **Set your OpenAI API key:**

   ```bash
   export OPENAI_API_KEY=sk-your-key-here
   ```

3. **Customize the model for better accuracy (recommended):**

   Edit the `docker-compose.minions.yml` file to use qwen3 instead of llama3.2:

   ```yaml
   models:
     worker:
       model: ai/qwen3 # Changed from ai/llama3.2 for better accuracy (8B vs 3B params)
       context_size: 10000
   ```

4. **Launch the MinionS protocol:**

   ```bash
   docker compose -f docker-compose.minions.yml up --build
   ```

5. **Open your browser** and navigate to `http://localhost:8080` to access the interactive interface.
## 🧠 What is the MinionS Protocol?

The MinionS protocol enables **cost-efficient collaboration** between:

- **Local Model** (on-device): Handles document reading, context processing, and initial analysis
- **Remote Model** (cloud): Provides supervision, final reasoning, and quality assurance

### Key Innovation: Decomposition Strategy

Unlike simple chat protocols, MinionS uses a sophisticated **decompose-execute-aggregate** approach:

1. **Decompose**: The remote model breaks complex tasks into simple, parallel subtasks
2. **Execute**: The local model processes the subtasks in parallel on document chunks
3. **Aggregate**: The remote model synthesizes the results and provides the final answer
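The loop above can be sketched in a few lines of Python. This is an illustrative sketch, not the minions repo's actual API: `remote_model` and `local_model` are hypothetical stand-ins for the cloud (e.g. gpt-4o) and on-device (e.g. qwen3) LLM calls.

```python
from concurrent.futures import ThreadPoolExecutor

def minions_round(task, chunks, remote_model, local_model):
    """One MinionS round. remote_model/local_model are plain
    prompt -> text callables standing in for the real LLM clients."""
    # 1. Decompose: the remote model breaks the task into simple subtasks.
    subtasks = remote_model(f"Break this into simple subtasks:\n{task}").splitlines()
    # 2. Execute: the local model answers every (subtask, chunk) pair in
    #    parallel; only these short findings (never the full document)
    #    are sent back to the cloud.
    pairs = [(s, c) for s in subtasks for c in chunks]
    with ThreadPoolExecutor() as pool:
        findings = list(pool.map(
            lambda p: local_model(f"{p[0]}\n\nContext:\n{p[1]}"), pairs))
    # 3. Aggregate: the remote model synthesizes the final answer from
    #    the findings alone.
    return remote_model(f"Task: {task}\nFindings:\n" + "\n".join(findings))
```

The cost saving comes from step 2: the full document only ever passes through the local model, while the paid remote model sees just the task, the subtask list, and the short findings.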
## 📊 Cost Analysis & Performance

### Academic Research Results

Based on the [Stanford research paper](https://arxiv.org/pdf/2502.15964), MinionS demonstrates:

| Protocol | Cost Reduction | Performance Recovery | Use Case |
|----------|----------------|----------------------|----------|
| **MinionS (8B local)** | **5.7× cheaper** | **97.9%** of remote performance | Production ready |
| **MinionS (3B local)** | **6.0× cheaper** | **93.4%** of remote performance | Resource constrained |
| Minion (simple chat) | 30.4× cheaper | 87.0% of remote performance | Basic tasks |

### Real-World Token Usage

**Research Paper Analysis Example:**

- **Task**: "What are the three evaluation datasets used in the paper?"
- **Remote-only**: ~30,064 tokens
- **MinionS**: ~7,500-15,388 tokens
- **Savings**: 50-75% token reduction
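The 50-75% figure follows directly from the token counts above; here is a quick arithmetic check (counts taken from this example, treating remote tokens as the only paid tokens since local inference is free):

```python
remote_only_tokens = 30_064           # remote-only run (example above)
minions_tokens = (7_500, 15_388)      # best and worst MinionS runs

# Fraction of remote tokens saved per run.
savings = [1 - t / remote_only_tokens for t in minions_tokens]
print(f"remote-token savings: {savings[1]:.0%} to {savings[0]:.0%}")
# prints: remote-token savings: 49% to 75%
```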
## 🎯 Interactive Demo: Compare Remote vs MinionS

The MinionS interface includes a **toggle feature** that lets you compare:

### Remote-Only Mode

- Processes the entire document with the cloud model
- Higher token usage and cost
- Baseline performance

### MinionS Mode

- The local model reads and processes document chunks
- The remote model provides supervision and final answers
- Dramatically reduced cloud costs
- Maintained quality
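The chunking that drives this split can be sketched as below; the sizes are illustrative defaults for this sketch, not the values the minions app actually uses:

```python
def chunk_document(text: str, chunk_chars: int = 4000, overlap: int = 200) -> list[str]:
    """Split a document into slightly overlapping chunks so the local
    model can process them in parallel. chunk_chars/overlap are
    illustrative, not the app's real settings."""
    chunks, start = [], 0
    step = chunk_chars - overlap  # advance less than a full chunk
    while start < len(text):
        chunks.append(text[start:start + chunk_chars])
        start += step
    return chunks
```

The overlap means a fact that straddles a chunk boundary still appears whole in at least one chunk.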
## 🎮 Step-by-Step Demo

### Example: Research Paper Analysis

1. **Start the system** following the Quick Start guide above

2. **Load the MinionS research paper** as your document:
   - Download: https://arxiv.org/pdf/2502.15964
   - Upload it through the web interface

3. **Ask the example question**:

   ```
   Task: "What are the three evaluation datasets used in the paper?"
   Document Metadata: "Research Paper"
   ```

4. **Compare modes**:
   - **Remote-only**: Watch the token usage (~30k tokens)
   - **MinionS**: See the dramatic reduction (~7.5-15k tokens)

5. **Expected answer**: "The three evaluation datasets are FinanceBench, LongHealth, and QASPER"
### Model Customization

**Recommended**: Upgrade from llama3.2 (3B) to qwen3 (8B) for better accuracy:

```yaml
# In docker-compose.minions.yml
models:
  worker:
    model: ai/qwen3      # 8B parameters - better accuracy
    # model: ai/llama3.2 # 3B parameters - faster download
    context_size: 10000
```

**Trade-offs**:

- **qwen3**: Slightly slower to download, significantly better accuracy
- **llama3.2**: Faster to pull, adequate for simple tasks
## 🤝 When to Use MinionS

### ✅ Ideal Use Cases

- **Document Analysis**: Financial reports, medical records, research papers
- **Long-Context Tasks**: Multi-page document processing
- **Cost-Sensitive Applications**: High-volume document processing
- **Privacy-Conscious Workloads**: Keep sensitive data local while leveraging cloud intelligence
## 🧹 Cleanup

To stop and remove the containers:

```bash
cd minions/apps/minions-docker
docker compose -f docker-compose.minions.yml down -v
```
## 📚 Additional Resources

### Official MinionS Resources

- **Research Paper**: [Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models](https://arxiv.org/pdf/2502.15964)
- **GitHub Repository**: [HazyResearch/minions](https://github.com/HazyResearch/minions)
- **Docker Setup**: [minions-docker](https://github.com/HazyResearch/minions/tree/main/apps/minions-docker)
### Academic Citation

```bibtex
@article{narayan2025minions,
  title={Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models},
  author={Narayan, Avanika and Biderman, Dan and Eyuboglu, Sabri and May, Avner and Linderman, Scott and Zou, James and R{\'e}, Christopher},
  journal={arXiv preprint arXiv:2502.15964},
  year={2025}
}
```
## 🏆 Key Benefits Summary

- **💰 Cost Reduction**: 5.7× cheaper than remote-only processing
- **🎯 High Accuracy**: Maintains 97.9% of cloud model performance
- **🔧 Easy Customization**: Simple model swapping (llama3.2 → qwen3)

---
## 📎 Credits

- **Research**: [Stanford Hazy Research Lab](https://hazyresearch.stanford.edu/)
- **Authors**: Avanika Narayan, Dan Biderman, Sabri Eyuboglu, and team
- **Implementation**: [HazyResearch/minions](https://github.com/HazyResearch/minions)
- **Docker Integration**: Compose for Agents community

[Docker Compose]: https://github.com/docker/compose
[Docker Desktop]: https://www.docker.com/products/docker-desktop/
[Docker Engine]: https://docs.docker.com/engine/
[Docker Model Runner requirements]: https://docs.docker.com/ai/model-runner/
[Docker Offload]: https://www.docker.com/products/docker-offload/
