
Commit ad68482

chore: remove model speed cookbook (#3058)
1 parent 4cee64d commit ad68482

File tree (6 files changed: +0 additions, −407 deletions)

docs/conf.py
docs/cookbooks/basic_concepts/index.rst
docs/cookbooks/basic_concepts/model_speed_comparison.ipynb
docs/key_modules/models.md
docs/mintlify/docs.json
docs/mintlify/key_modules/models.mdx

docs/conf.py

Lines changed: 0 additions & 1 deletion
@@ -97,7 +97,6 @@
     "cookbooks/embodied_agents": "cookbooks/advanced_features/embodied_agents",
     "cookbooks/critic_agents_and_tree_search": "cookbooks/advanced_features/critic_agents_and_tree_search",
     "cookbooks/agents_society": "cookbooks/basic_concepts/create_your_first_agents_society",
-    "cookbooks/model_speed_comparison": "cookbooks/basic_concepts/model_speed_comparison",
     "cookbooks/agents_message": "cookbooks/basic_concepts/agents_message",
     "cookbooks/agents_with_tools": "cookbooks/advanced_features/agents_with_tools",
     "cookbooks/agents_with_memory": "cookbooks/advanced_features/agents_with_memory",

docs/cookbooks/basic_concepts/index.rst

Lines changed: 0 additions & 1 deletion
@@ -14,4 +14,3 @@ Basic Concepts
    create_your_first_agents_society
    agents_message
    agents_prompting
-   model_speed_comparison

docs/cookbooks/basic_concepts/model_speed_comparison.ipynb

Lines changed: 0 additions & 294 deletions
This file was deleted.

docs/key_modules/models.md

Lines changed: 0 additions & 55 deletions
@@ -501,61 +501,6 @@ CAMEL-AI makes it easy to integrate local open-source models as part of your age
   Explore the full <b>CAMEL-AI Examples</b> library for advanced workflows, tool integrations, and multi-agent demos.
 </Card>
 
-
-## Model Speed and Performance
-
-<CardGroup cols={2}>
-  <Card
-    title="Why Model Speed Matters"
-    icon="gauge-simple-max"
-  >
-    For interactive AI applications, response speed can make or break the user experience. CAMEL-AI benchmarks tokens processed per second (TPS) across a range of supported models—helping you choose the right balance of power and performance.
-  </Card>
-  <Card
-    title="Benchmark Insights"
-    icon="ranking-star"
-  >
-    We ran side-by-side tests in <a href="../cookbooks/model_speed_comparison.ipynb" target="_blank"><b>this notebook</b></a> comparing top models from OpenAI (GPT-4o Mini, GPT-4o, O1 Preview) and SambaNova (Llama series), measuring output speed in tokens per second.
-  </Card>
-</CardGroup>
-
-<Note type="info">
-  <b>Key Findings:</b>
-  <ul>
-    <li>
-      <b>Small models = blazing speed:</b> SambaNova’s Llama 8B and OpenAI GPT-4o Mini deliver the fastest responses.
-    </li>
-    <li>
-      <b>Bigger models = higher quality, slower output:</b> Llama 405B (SambaNova) and similar large models trade off speed for more nuanced reasoning.
-    </li>
-    <li>
-      <b>OpenAI models = consistent speed:</b> Most OpenAI models maintain stable throughput across use cases.
-    </li>
-    <li>
-      <b>Llama 8B (SambaNova) = top performer:</b> Outpaces others in raw tokens/sec.
-    </li>
-  </ul>
-</Note>
-
-<div align="center" style={{ margin: "1.5em 0" }}>
-  <img
-    src="https://i.postimg.cc/4xByytyZ/model-speed.png"
-    alt="Model Speed Comparison: Tokens per second for various AI models"
-    style={{ maxWidth: 520, borderRadius: 16, boxShadow: "0 2px 8px #0002" }}
-  />
-  <div style={{ fontSize: 13, color: "#666", marginTop: 8 }}>
-    <i>Model Speed Comparison &mdash; tokens per second across CAMEL-supported models</i>
-  </div>
-</div>
-
-<Card title="Local Inference: vLLM vs. SGLang" icon="terminal">
-  We compared local inference speeds between <b>vLLM</b> and <b>SGLang</b> on the same hardware. SGLang (meta-llama/Llama-3.2-1B-Instruct) hit <b>220.98 tokens/sec</b>, while vLLM peaked at <b>107.2 tokens/sec</b>.<br/>
-  <br/>
-  <b>Bottom line:</b> For maximum speed in local environments, SGLang currently leads.
-</Card>
-
----
-
 ## Next Steps
 
 You’ve now seen how to connect, configure, and optimize models with CAMEL-AI.
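
The section removed above reports a single metric, tokens per second (TPS): completion tokens divided by wall-clock generation time. For reference, a minimal sketch of that measurement, assuming the `openai` Python SDK against any OpenAI-compatible endpoint (an illustrative reconstruction, not the deleted notebook's actual code):

```python
# Illustrative TPS probe; not the deleted notebook's code.
# Assumes `pip install openai` and OPENAI_API_KEY set in the environment.
import time

from openai import OpenAI

client = OpenAI()


def tokens_per_second(model: str, prompt: str) -> float:
    """Time one completion; divide completion tokens by elapsed seconds."""
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    elapsed = time.perf_counter() - start
    return resp.usage.completion_tokens / elapsed


print(f"{tokens_per_second('gpt-4o-mini', 'Summarize what TPS measures.'):.1f} tok/s")
```

Note that timing one non-streaming call folds time-to-first-token into the average, so a probe like this reads below an engine's peak decode rate; streaming and discarding the first token would isolate decode throughput.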

docs/mintlify/docs.json

Lines changed: 0 additions & 1 deletion
@@ -99,7 +99,6 @@
         "cookbooks/basic_concepts/agents_prompting",
         "cookbooks/basic_concepts/create_your_first_agent",
         "cookbooks/basic_concepts/create_your_first_agents_society",
-        "cookbooks/basic_concepts/model_speed_comparison"
       ]
     },
     {

docs/mintlify/key_modules/models.mdx

Lines changed: 0 additions & 55 deletions
@@ -501,61 +501,6 @@ CAMEL-AI makes it easy to integrate local open-source models as part of your age
   Explore the full <b>CAMEL-AI Examples</b> library for advanced workflows, tool integrations, and multi-agent demos.
 </Card>
 
-
-## Model Speed and Performance
-
-<CardGroup cols={2}>
-  <Card
-    title="Why Model Speed Matters"
-    icon="gauge-simple-max"
-  >
-    For interactive AI applications, response speed can make or break the user experience. CAMEL-AI benchmarks tokens processed per second (TPS) across a range of supported models—helping you choose the right balance of power and performance.
-  </Card>
-  <Card
-    title="Benchmark Insights"
-    icon="ranking-star"
-  >
-    We ran side-by-side tests in <a href="../cookbooks/model_speed_comparison.ipynb" target="_blank"><b>this notebook</b></a> comparing top models from OpenAI (GPT-4o Mini, GPT-4o, O1 Preview) and SambaNova (Llama series), measuring output speed in tokens per second.
-  </Card>
-</CardGroup>
-
-<Note type="info">
-  <b>Key Findings:</b>
-  <ul>
-    <li>
-      <b>Small models = blazing speed:</b> SambaNova’s Llama 8B and OpenAI GPT-4o Mini deliver the fastest responses.
-    </li>
-    <li>
-      <b>Bigger models = higher quality, slower output:</b> Llama 405B (SambaNova) and similar large models trade off speed for more nuanced reasoning.
-    </li>
-    <li>
-      <b>OpenAI models = consistent speed:</b> Most OpenAI models maintain stable throughput across use cases.
-    </li>
-    <li>
-      <b>Llama 8B (SambaNova) = top performer:</b> Outpaces others in raw tokens/sec.
-    </li>
-  </ul>
-</Note>
-
-<div align="center" style={{ margin: "1.5em 0" }}>
-  <img
-    src="https://i.postimg.cc/4xByytyZ/model-speed.png"
-    alt="Model Speed Comparison: Tokens per second for various AI models"
-    style={{ maxWidth: 520, borderRadius: 16, boxShadow: "0 2px 8px #0002" }}
-  />
-  <div style={{ fontSize: 13, color: "#666", marginTop: 8 }}>
-    <i>Model Speed Comparison &mdash; tokens per second across CAMEL-supported models</i>
-  </div>
-</div>
-
-<Card title="Local Inference: vLLM vs. SGLang" icon="terminal">
-  We compared local inference speeds between <b>vLLM</b> and <b>SGLang</b> on the same hardware. SGLang (meta-llama/Llama-3.2-1B-Instruct) hit <b>220.98 tokens/sec</b>, while vLLM peaked at <b>107.2 tokens/sec</b>.<br/>
-  <br/>
-  <b>Bottom line:</b> For maximum speed in local environments, SGLang currently leads.
-</Card>
-
----
-
 ## Next Steps
 
 You’ve now seen how to connect, configure, and optimize models with CAMEL-AI.
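
The deleted vLLM/SGLang card compares two local engines that both expose OpenAI-compatible HTTP servers, so the same TPS probe can be pointed at each in turn. A sketch under that assumption; the ports and launch commands in the comments are typical defaults, not taken from this repo, so verify them against each project's current docs:

```python
# Sketch: compare local decode speed of two OpenAI-compatible servers.
# Assumed launch commands (typical entrypoints; check current docs):
#   vLLM:   vllm serve meta-llama/Llama-3.2-1B-Instruct --port 8000
#   SGLang: python -m sglang.launch_server --model-path meta-llama/Llama-3.2-1B-Instruct --port 30000
import time

from openai import OpenAI

ENDPOINTS = {  # ports are assumptions; match whatever you launched with
    "vLLM": "http://localhost:8000/v1",
    "SGLang": "http://localhost:30000/v1",
}

for name, base_url in ENDPOINTS.items():
    client = OpenAI(base_url=base_url, api_key="EMPTY")  # local servers accept a dummy key
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.2-1B-Instruct",
        messages=[{"role": "user", "content": "Write a haiku about llamas."}],
        max_tokens=256,
    )
    tps = resp.usage.completion_tokens / (time.perf_counter() - start)
    print(f"{name}: {tps:.1f} tokens/sec")
```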

0 commit comments
