Commit 5217199: Add MinionS Protocol Example (#102)

add MinionS protocol example through Docker Compose with cost-efficient local-remote collaboration

1 parent: 9ca7d6f
2 files changed, +206 -0 lines
README.md (1 addition, 0 deletions)

@@ -51,6 +51,7 @@ The demos support using OpenAI models instead of running models locally with Doc
 | [Spring AI](https://spring.io/projects/spring-ai) Brave Search | Single Agent | none | duckduckgo | [./spring-ai](./spring-ai) | [compose.yaml](./spring-ai/compose.yaml) |
 | [ADK](https://github.com/google/adk-python) Sock Store Agent | Multi-Agent | qwen3 | MongoDb, Brave, Curl, | [./adk-sock-shop](./adk-sock-shop/) | [compose.yaml](./adk-sock-shop/compose.yaml) |
 | [Langchaingo](https://github.com/tmc/langchaingo) DuckDuckGo Search | Single Agent | gemma3 | duckduckgo | [./langchaingo](./langchaingo) | [compose.yaml](./langchaingo/compose.yaml) |
+| [MinionS](https://github.com/HazyResearch/minions) Cost-Efficient Local-Remote Collaboration | Local-Remote Protocol | qwen3 (local), gpt-4o (remote) | | [./minions](./minions) | [docker-compose.minions.yml](https://github.com/HazyResearch/minions/blob/main/apps/minions-docker/docker-compose.minions.yml) |

 ## License

minions/README.md (205 additions, 0 deletions)
# 🧠 MinionS Protocol - Cost-Efficient Local-Remote LLM Collaboration

This example demonstrates the **MinionS protocol**, an approach for cost-efficient collaboration between small on-device models and large cloud models. Based on research from Stanford's Hazy Research lab, MinionS achieves a **5.7× cost reduction** while maintaining **97.9% of cloud model performance**.
## 🚀 Getting Started

### Requirements

+ **[Docker Desktop] 4.43.0+ or [Docker Engine]** installed.
+ **A laptop or workstation with a GPU** (e.g., a MacBook) for running open models locally. If you don't have a GPU, you can alternatively use **[Docker Offload]**.
+ If you're using [Docker Engine] on Linux or [Docker Desktop] on Windows, ensure that the [Docker Model Runner requirements] are met (specifically that GPU support is enabled) and the necessary drivers are installed.
+ If you're using Docker Engine on Linux, ensure you have [Docker Compose] 2.38.1 or later installed.
+ An [OpenAI API Key](https://platform.openai.com/api-keys) 🔑.
### Quick Start

1. **Clone the official MinionS repository and navigate to the Docker setup:**

   ```bash
   git clone https://github.com/HazyResearch/minions.git
   cd minions/apps/minions-docker
   ```

2. **Set your OpenAI API key:**

   ```bash
   export OPENAI_API_KEY=sk-your-key-here
   ```

3. **Customize the model for better accuracy (recommended):**

   Edit the `docker-compose.minions.yml` file to use qwen3 instead of llama3.2:

   ```yaml
   models:
     worker:
       model: ai/qwen3 # Changed from ai/llama3.2 for better accuracy (8B vs 3B params)
       context_size: 10000
   ```

4. **Launch the MinionS protocol:**

   ```bash
   docker compose -f docker-compose.minions.yml up --build
   ```

5. **Open your browser** and navigate to `http://localhost:8080` to access the interactive interface.
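The web UI can take a moment to come up after `docker compose up`. As an optional convenience, here is a small standard-library Python sketch that polls the interface until it responds; the URL and timeout are assumptions based on the default setup above, not part of the minions tooling:

```python
# Hypothetical readiness check for the MinionS web UI (standard library only).
import time
import urllib.error
import urllib.request


def wait_until_ready(url: str, timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Poll `url` until it answers with HTTP 200, or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # service not accepting connections yet; retry
        time.sleep(interval)
    return False
```

For example, `wait_until_ready("http://localhost:8080")` returns `True` once the interface is serving.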
## 🧠 What is the MinionS Protocol?

The MinionS protocol enables **cost-efficient collaboration** between:

+ **Local Model** (on-device): Handles document reading, context processing, and initial analysis
+ **Remote Model** (cloud): Provides supervision, final reasoning, and quality assurance
### Key Innovation: Decomposition Strategy

Unlike simple chat protocols, MinionS uses a sophisticated **decompose-execute-aggregate** approach:

1. **Decompose**: The remote model breaks complex tasks into simple, parallel subtasks
2. **Execute**: The local model processes subtasks in parallel on document chunks
3. **Aggregate**: The remote model synthesizes results and provides final answers
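The three steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not the actual minions API: `call_remote`, `call_local`, and the fixed-size chunking are hypothetical stand-ins (the real implementation parses structured subtask output and manages parallelism):

```python
# Illustrative decompose-execute-aggregate loop; names are hypothetical.

def chunk(document: str, size: int = 2000) -> list[str]:
    """Split the document into fixed-size chunks for the local model."""
    return [document[i:i + size] for i in range(0, len(document), size)]


def minions_round(task, document, call_remote, call_local):
    # 1. Decompose: the remote model turns the task into simple subtasks.
    subtasks = call_remote(f"Break this task into simple subtasks: {task}")
    # 2. Execute: the local model answers each subtask per document chunk;
    #    only the short answers (never the full document) go upstream.
    local_answers = [
        call_local(f"{sub}\n\nContext:\n{piece}")
        for sub in subtasks
        for piece in chunk(document)
    ]
    # 3. Aggregate: the remote model synthesizes a final answer from the
    #    small set of local answers.
    return call_remote(f"Task: {task}\nSubtask answers: {local_answers}")
```

The cost saving comes from step 2: the long document is only ever read by the local model, so the remote model's token usage is limited to subtasks and short answers.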
## 📊 Cost Analysis & Performance

### Academic Research Results

Based on the [Stanford research paper](https://arxiv.org/pdf/2502.15964), MinionS demonstrates:

| Protocol | Cost Reduction | Performance Recovery | Use Case |
|----------|----------------|----------------------|----------|
| **MinionS (8B local)** | **5.7× cheaper** | **97.9%** of remote performance | Production ready |
| **MinionS (3B local)** | **6.0× cheaper** | **93.4%** of remote performance | Resource constrained |
| Minion (simple chat) | 30.4× cheaper | 87.0% of remote performance | Basic tasks |
### Real-World Token Usage

**Research Paper Analysis Example:**

+ **Task**: "What are the three evaluation datasets used in the paper?"
+ **Remote-only**: ~30,064 tokens
+ **MinionS**: ~7,500-15,388 tokens
+ **Savings**: 50-75% token reduction
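As a back-of-envelope check, the savings range follows directly from the token counts above (exact numbers vary by run):

```python
# Back-of-envelope check of the token savings quoted above.
remote_only = 30_064                       # tokens, remote-only baseline
minions_low, minions_high = 7_500, 15_388  # observed MinionS range

best = 1 - minions_low / remote_only       # highest observed reduction
worst = 1 - minions_high / remote_only     # lowest observed reduction

print(f"token reduction: {worst:.0%} to {best:.0%}")
```

This works out to roughly a 49-75% reduction, in line with the 50-75% figure quoted above.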
## 🎯 Interactive Demo: Compare Remote vs MinionS

The MinionS interface includes a **toggle feature** that lets you compare:

### Remote-Only Mode

+ Processes the entire document with the cloud model
+ Higher token usage and cost
+ Baseline performance

### MinionS Mode

+ Local model reads and processes document chunks
+ Remote model provides supervision and final answers
+ Dramatically reduced cloud costs
+ Maintained quality
## 🎮 Step-by-Step Demo

### Example: Research Paper Analysis

1. **Start the system** following the Quick Start guide above

2. **Load the MinionS research paper** as your document:
   - Download: <https://arxiv.org/pdf/2502.15964>
   - Upload it through the web interface

3. **Ask the example question**:

   ```text
   Task: "What are the three evaluation datasets used in the paper?"
   Document Metadata: "Research Paper"
   ```

4. **Compare modes**:
   - **Remote-only**: Watch the token usage (~30k tokens)
   - **MinionS**: See the dramatic reduction (~7.5-15k tokens)

5. **Expected answer**: "The three evaluation datasets are FinanceBench, LongHealth, and QASPER"
### Model Customization

**Recommended**: Upgrade from llama3.2 (3B) to qwen3 (8B) for better accuracy:

```yaml
# In docker-compose.minions.yml
models:
  worker:
    model: ai/qwen3      # 8B parameters - better accuracy
    # model: ai/llama3.2 # 3B parameters - faster download
    context_size: 10000
```

**Trade-offs**:

+ **qwen3**: Slightly slower to download, significantly better accuracy
+ **llama3.2**: Faster to pull, adequate for simple tasks
## 🤝 When to Use MinionS

### ✅ Ideal Use Cases

+ **Document Analysis**: Financial reports, medical records, research papers
+ **Long-Context Tasks**: Multi-page document processing
+ **Cost-Sensitive Applications**: High-volume document processing
+ **Privacy-Conscious**: Keep sensitive data local while leveraging cloud intelligence
## 🧹 Cleanup

To stop the stack and remove its containers and volumes:

```bash
cd minions/apps/minions-docker
docker compose -f docker-compose.minions.yml down -v
```
## 📚 Additional Resources

### Official MinionS Resources

+ **Research Paper**: [Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models](https://arxiv.org/pdf/2502.15964)
+ **GitHub Repository**: [HazyResearch/minions](https://github.com/HazyResearch/minions)
+ **Docker Setup**: [minions-docker](https://github.com/HazyResearch/minions/tree/main/apps/minions-docker)

### Academic Citation

```bibtex
@article{narayan2025minions,
  title={Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models},
  author={Narayan, Avanika and Biderman, Dan and Eyuboglu, Sabri and May, Avner and Linderman, Scott and Zou, James and R{\'e}, Christopher},
  journal={arXiv preprint arXiv:2502.15964},
  year={2025}
}
```
## 🏆 Key Benefits Summary

+ **💰 Cost Reduction**: 5.7× cheaper than remote-only processing
+ **🎯 High Accuracy**: Maintains 97.9% of cloud model performance
+ **🔧 Easy Customization**: Simple model swapping (llama3.2 → qwen3)
---

## 📎 Credits

+ **Research**: [Stanford Hazy Research Lab](https://hazyresearch.stanford.edu/)
+ **Authors**: Avanika Narayan, Dan Biderman, Sabri Eyuboglu, and team
+ **Implementation**: [HazyResearch/minions](https://github.com/HazyResearch/minions)
+ **Docker Integration**: Compose for Agents community
[Docker Compose]: https://github.com/docker/compose
[Docker Desktop]: https://www.docker.com/products/docker-desktop/
[Docker Engine]: https://docs.docker.com/engine/
[Docker Model Runner requirements]: https://docs.docker.com/ai/model-runner/
[Docker Offload]: https://www.docker.com/products/docker-offload/
