BROKE Cluster - Frequently Asked Questions
Comprehensive Q&A for understanding BROKE's positioning and technical decisions
Architecture & Positioning
Q: How does BROKE compare to LLM routing frameworks like RouteLLM/OptiLLM?
A: BROKE and RouteLLM solve different problems with complementary approaches:
RouteLLM/OptiLLM (Cloud-to-Cloud Cost Optimization):
BROKE Cluster (Local-First with Strategic Cloud Integration):
Theoretical Cost Comparison:
When to choose what:
Q: What happens when the team can't converge on a solution locally?
A: That's outside BROKE's core responsibility - it's handled at the use-case level:
BROKE's Scope: Intelligent routing of individual queries to optimal local models with quality guidance. BROKE doesn't manage team workflows or decide when to escalate to external resources.
Where Escalation Happens: In your use-case implementation (e.g., CrewAI setup, custom workflows):
BROKE's Contribution: Provides quality indicators (🟢🟡🟠🔴) that help teams and frameworks make informed decisions about when local solutions might be insufficient.
Example in Practice:
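A minimal sketch, assuming a hypothetical `broke_client.route()` call that returns the answer together with BROKE's quality indicator (all names and thresholds here are invented for illustration):

```python
# Hypothetical escalation hook at the use-case level -- not BROKE's API.
# BROKE supplies the quality indicator; the workflow decides what to do with it.

ESCALATE = {"🟠", "🔴"}  # indicators the team treats as "local may be insufficient"

def solve(task: str, broke_client, cloud_client) -> str:
    answer, quality = broke_client.route(task)  # hypothetical: (text, indicator)
    if quality in ESCALATE:
        # Use-case policy, e.g. inside a CrewAI workflow: fall back to the cloud.
        return cloud_client.complete(task)      # hypothetical cloud client
    return answer
```

The point is the division of labour: BROKE supplies the signal, the workflow owns the escalation policy.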
Philosophy: BROKE focuses on being the best possible local router. Escalation strategies are use-case specific and should remain flexible.
Q: How does BROKE handle multi-agent frameworks like CrewAI?
A: BROKE is designed as the intelligent backend for agent orchestration:
Universal Compatibility:
Agent-Optimized Features:
Example Agent Assignment Flexibility:
BROKE's Value: Intelligent routing reduces need for manual agent-to-model assignments while providing transparency for agent decision-making.
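As a sketch, assuming BROKE fronts an OpenAI-compatible endpoint (the protocol the MLX Knife server already speaks), any agent framework that talks that protocol can treat it as a drop-in backend; the base URL and the `auto` routing alias below are assumptions:

```python
# Sketch: pointing an OpenAI-compatible client at a local BROKE/MLX Knife
# endpoint. The base URL and the "auto" routing alias are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="unused-locally",  # local servers typically ignore the key
)

resp = client.chat.completions.create(
    model="auto",  # hypothetical alias: let the router pick the tier
    messages=[{"role": "user", "content": "Draft release notes for v1.5.8."}],
)
print(resp.choices[0].message.content)
```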
Technical Architecture
Q: How does BROKE's multi-dimensional complexity work?
A: Evolution from 1D to 3D complexity understanding:
Previous Approach (Linear 0.0-1.0 scale):
Current Approach (Multi-dimensional ML):
Quality Temperature System:
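A rough sketch of how a multi-dimensional score might collapse into the quality-temperature indicators; the dimension names and thresholds are invented for illustration and are not BROKE's actual classifier:

```python
# Illustrative only -- dimensions and thresholds are assumptions.
from dataclasses import dataclass

@dataclass
class Complexity:
    reasoning: float  # 0.0-1.0: multi-step reasoning required
    domain: float     # 0.0-1.0: specialised knowledge required
    context: float    # 0.0-1.0: context length / scope pressure

    def temperature(self) -> str:
        """Collapse the vector to a quality-temperature indicator."""
        peak = max(self.reasoning, self.domain, self.context)
        if peak < 0.3:
            return "🟢"   # any capable local model should handle it
        if peak < 0.6:
            return "🟡"
        if peak < 0.85:
            return "🟠"
        return "🔴"       # local result may be insufficient

print(Complexity(reasoning=0.7, domain=0.2, context=0.4).temperature())  # 🟠
```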
Benefits:
Q: What is N:1 Model Tier Support?
A: Flexible model assignment within quality tiers:
Traditional 1:1 Mapping:
BROKE's N:1 Flexible Tiers:
Intelligent Selection Within Tier:
Configuration-Driven:
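A minimal sketch of such a configuration; the tier labels, model names, and least-loaded selection rule are all assumptions:

```python
# Hypothetical tier configuration: several models (N) share one tier (1).
MODEL_TIERS = {
    "fast":     ["phi-3-mini", "qwen2.5-3b-instruct"],
    "balanced": ["mistral-7b-instruct", "llama-3-8b-instruct"],
    "quality":  ["llama-3-70b-instruct"],
}

def select_model(tier: str, node_load: dict[str, float]) -> str:
    """Pick the least-loaded model within the requested tier."""
    return min(MODEL_TIERS[tier], key=lambda m: node_load.get(m, 0.0))

# Example: a "balanced" request lands on whichever 7-8B model is idler.
print(select_model("balanced", {"mistral-7b-instruct": 0.8}))
# -> llama-3-8b-instruct (its load defaults to 0.0)
```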
Q: How does Universal Data Collection work?
A: MultiPL-E compatible training data from all interactions:
Data Format (compatible with standard benchmarks):
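One plausible shape for such a record, with field names modelled on common benchmark layouts rather than a confirmed BROKE schema:

```python
# Sketch of a MultiPL-E-style training record, appended as JSON Lines.
import json

record = {
    "name": "session_2025_08_14_0042",   # hypothetical task identifier
    "language": "py",                     # MultiPL-E-style language tag
    "prompt": 'def add(a, b):\n    """Return the sum of a and b."""\n',
    "completions": ["    return a + b\n"],
    "source": "interactive",              # vs. e.g. a `make benchmark` run
}

with open("training_data.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```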
Collection Sources:
Benchmark runs (make benchmark)
Benefits:
Implementation & Usage
Q: What's the current status and roadmap?
A: Functional alpha, currently in an architectural research phase:
Current Status (v1.5.8-beta):
Next Steps (v1.5.9):
Vision (v1.6.0+):
Q: What hardware do I need to get started?
A: Minimum viable setup for testing:
Minimal Setup (1-3 users):
Recommended Team Setup (5-10 users):
Future-Proof Investment:
Q: I want to try this TODAY. What can I do?
A: Start with MLX Knife - the production-ready foundation component:
Quick Start:
```bash
git clone https://github.com/mzau/mlx-knife
cd mlx-knife
pip install -r requirements.txt
python -m mlx_knife list                 # list locally available models
python -m mlx_knife server --port 8000   # start the local API server
open simple_chat.html                    # or navigate to it in your browser
```
What you get:
Next Steps:
```bash
mlxk run mistral-7b-instruct "your prompt"   # one-shot prompt against a model
mlxk health mistral-7b-instruct              # check model health
```
Reality Check: This is the single-node experience that BROKE Cluster will intelligently route across multiple nodes. MLX Knife gives you that experience today while BROKE's multi-node orchestration is in alpha development.
Q: How do I contribute or get involved?
A: Multiple ways to engage:
For Users:
For Alpha Testers:
For Contributors:
For Researchers:
Comparisons & Context
Q: How is this different from Ollama?
A: Ollama is great for single machines. BROKE focuses on multi-node routing and agent orchestration:
Ollama (Excellent single-node management):
BROKE (Multi-node intelligent orchestration):
Platform Strategy:
Currently optimized for MLX/Apple Silicon, but the architecture is designed to adapt to evolving hardware platforms.
Complementary Usage:
Q: Will local LLMs be competitive with cloud models by the time BROKE reaches production?
A: This is the fundamental uncertainty driving our development approach:
The Core Question:
Will local hardware advances (Apple M-series, unified memory, MLX optimization) keep pace with cloud model improvements through the 2026-2027 timeframe?
Optimistic Scenario (BROKE team's working hypothesis):
Pessimistic Scenario (realistic possibility):
Pragmatic Approach:
We're building the infrastructure for whatever hardware delivers, rather than betting on specific performance outcomes. Key principles:
Reality Check: Software will likely be ready before we know what hardware can truly deliver. We're building the routing and orchestration layer for whatever the local/cloud balance becomes.
Q: Why focus on Apple Silicon / MLX specifically?
A: Strategic platform choice, but architecture is hardware-agnostic:
See detailed hardware strategy discussion above. In short: unified memory advantages today, but BROKE adapts to whatever platform delivers best price/performance for teams.
Q: Will it support NVIDIA GPUs?
A: Future platform support depends on unified memory evolution:
Current Focus: Apple Silicon + MLX (proven unified memory advantage)
Future Candidates: NVIDIA (when unified memory matures), AMD (strong potential)
Decision Criteria: Whatever delivers best price/performance for small teams
Q: What about Kubernetes/Docker?
A: We're building something simpler:
Think "Kubernetes ideas without Kubernetes complexity." BROKE focuses on intelligent routing rather than container orchestration, targeting small teams (5-10 users) rather than enterprise-scale deployments.
Philosophy: Right-sized complexity for the problem - team coordination, not data center management.
Q: Can I use this for production?
A: Not yet - this is alpha software, with production targeted for 2026 H2:
Current Status: Functional alpha with architectural research ongoing
Target Timeline: Production-ready features expected 2026 H2
Dependencies: Apple Silicon hardware evolution + LLM performance improvements
Use Today For: Research, development, alpha testing with realistic expectations
Q: Why "BROKE"?
A: Because after buying all these Macs, we're broke! 🦫
As for what BROKE stands for... is it "BROKE Runs On Keen Efficiency"? "Building Robust Orchestrated Knowledge Engines"? "Beaver's Recursive Optimization of Keen Excellence"? The community is still debating! 😄
The beaver mascot represents building dams to keep your data from flowing to the cloud.
Contact & Community
Questions not covered here?
Stay Updated:
Last Updated: 2025-08-14 - Living document, updated as project evolves