
Commit ad68482

chore: remove model speed cookbook (#3058)
1 parent 4cee64d commit ad68482

File tree (6 files changed: +0 additions, −407 deletions)

docs/conf.py
docs/cookbooks/basic_concepts/index.rst
docs/cookbooks/basic_concepts/model_speed_comparison.ipynb
docs/key_modules/models.md
docs/mintlify/docs.json
docs/mintlify/key_modules/models.mdx

docs/conf.py

Lines changed: 0 additions & 1 deletion
@@ -97,7 +97,6 @@
     "cookbooks/embodied_agents": "cookbooks/advanced_features/embodied_agents",
     "cookbooks/critic_agents_and_tree_search": "cookbooks/advanced_features/critic_agents_and_tree_search",
     "cookbooks/agents_society": "cookbooks/basic_concepts/create_your_first_agents_society",
-    "cookbooks/model_speed_comparison": "cookbooks/basic_concepts/model_speed_comparison",
     "cookbooks/agents_message": "cookbooks/basic_concepts/agents_message",
     "cookbooks/agents_with_tools": "cookbooks/advanced_features/agents_with_tools",
     "cookbooks/agents_with_memory": "cookbooks/advanced_features/agents_with_memory",

docs/cookbooks/basic_concepts/index.rst

Lines changed: 0 additions & 1 deletion
@@ -14,4 +14,3 @@ Basic Concepts
    create_your_first_agents_society
    agents_message
    agents_prompting
-   model_speed_comparison

docs/cookbooks/basic_concepts/model_speed_comparison.ipynb

Lines changed: 0 additions & 294 deletions
This file was deleted.

docs/key_modules/models.md

Lines changed: 0 additions & 55 deletions
@@ -501,61 +501,6 @@ CAMEL-AI makes it easy to integrate local open-source models as part of your age
   Explore the full <b>CAMEL-AI Examples</b> library for advanced workflows, tool integrations, and multi-agent demos.
 </Card>
 
-
-## Model Speed and Performance
-
-<CardGroup cols={2}>
-  <Card
-    title="Why Model Speed Matters"
-    icon="gauge-simple-max"
-  >
-    For interactive AI applications, response speed can make or break the user experience. CAMEL-AI benchmarks tokens processed per second (TPS) across a range of supported models—helping you choose the right balance of power and performance.
-  </Card>
-  <Card
-    title="Benchmark Insights"
-    icon="ranking-star"
-  >
-    We ran side-by-side tests in <a href="../cookbooks/model_speed_comparison.ipynb" target="_blank"><b>this notebook</b></a> comparing top models from OpenAI (GPT-4o Mini, GPT-4o, O1 Preview) and SambaNova (Llama series), measuring output speed in tokens per second.
-  </Card>
-</CardGroup>
-
-<Note type="info">
-  <b>Key Findings:</b>
-  <ul>
-    <li>
-      <b>Small models = blazing speed:</b> SambaNova’s Llama 8B and OpenAI GPT-4o Mini deliver the fastest responses.
-    </li>
-    <li>
-      <b>Bigger models = higher quality, slower output:</b> Llama 405B (SambaNova) and similar large models trade off speed for more nuanced reasoning.
-    </li>
-    <li>
-      <b>OpenAI models = consistent speed:</b> Most OpenAI models maintain stable throughput across use cases.
-    </li>
-    <li>
-      <b>Llama 8B (SambaNova) = top performer:</b> Outpaces others in raw tokens/sec.
-    </li>
-  </ul>
-</Note>
-
-<div align="center" style={{ margin: "1.5em 0" }}>
-  <img
-    src="https://i.postimg.cc/4xByytyZ/model-speed.png"
-    alt="Model Speed Comparison: Tokens per second for various AI models"
-    style={{ maxWidth: 520, borderRadius: 16, boxShadow: "0 2px 8px #0002" }}
-  />
-  <div style={{ fontSize: 13, color: "#666", marginTop: 8 }}>
-    <i>Model Speed Comparison &mdash; tokens per second across CAMEL-supported models</i>
-  </div>
-</div>
-
-<Card title="Local Inference: vLLM vs. SGLang" icon="terminal">
-  We compared local inference speeds between <b>vLLM</b> and <b>SGLang</b> on the same hardware. SGLang (meta-llama/Llama-3.2-1B-Instruct) hit <b>220.98 tokens/sec</b>, while vLLM peaked at <b>107.2 tokens/sec</b>.<br/>
-  <br/>
-  <b>Bottom line:</b> For maximum speed in local environments, SGLang currently leads.
-</Card>
-
----
-
 ## Next Steps
 
 You’ve now seen how to connect, configure, and optimize models with CAMEL-AI.
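
The section removed above reports a single metric, tokens per second (TPS): completion tokens divided by wall-clock generation time. For reference, a minimal sketch of that measurement, assuming the `openai` Python SDK against any OpenAI-compatible endpoint (an illustrative reconstruction, not the deleted notebook's actual code):

```python
# Illustrative TPS probe; not the deleted notebook's code.
# Assumes `pip install openai` and OPENAI_API_KEY set in the environment.
import time

from openai import OpenAI

client = OpenAI()


def tokens_per_second(model: str, prompt: str) -> float:
    """Time one completion; divide completion tokens by elapsed seconds."""
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    elapsed = time.perf_counter() - start
    return resp.usage.completion_tokens / elapsed


print(f"{tokens_per_second('gpt-4o-mini', 'Summarize what TPS measures.'):.1f} tok/s")
```

Note that timing one non-streaming call folds time-to-first-token into the average, so a probe like this reads below an engine's peak decode rate; streaming and discarding the first token would isolate decode throughput.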

docs/mintlify/docs.json

Lines changed: 0 additions & 1 deletion
@@ -99,7 +99,6 @@
         "cookbooks/basic_concepts/agents_prompting",
         "cookbooks/basic_concepts/create_your_first_agent",
         "cookbooks/basic_concepts/create_your_first_agents_society",
-        "cookbooks/basic_concepts/model_speed_comparison"
       ]
     },
     {

docs/mintlify/key_modules/models.mdx

Lines changed: 0 additions & 55 deletions
@@ -501,61 +501,6 @@ CAMEL-AI makes it easy to integrate local open-source models as part of your age
   Explore the full <b>CAMEL-AI Examples</b> library for advanced workflows, tool integrations, and multi-agent demos.
 </Card>
 
-
-## Model Speed and Performance
-
-<CardGroup cols={2}>
-  <Card
-    title="Why Model Speed Matters"
-    icon="gauge-simple-max"
-  >
-    For interactive AI applications, response speed can make or break the user experience. CAMEL-AI benchmarks tokens processed per second (TPS) across a range of supported models—helping you choose the right balance of power and performance.
-  </Card>
-  <Card
-    title="Benchmark Insights"
-    icon="ranking-star"
-  >
-    We ran side-by-side tests in <a href="../cookbooks/model_speed_comparison.ipynb" target="_blank"><b>this notebook</b></a> comparing top models from OpenAI (GPT-4o Mini, GPT-4o, O1 Preview) and SambaNova (Llama series), measuring output speed in tokens per second.
-  </Card>
-</CardGroup>
-
-<Note type="info">
-  <b>Key Findings:</b>
-  <ul>
-    <li>
-      <b>Small models = blazing speed:</b> SambaNova’s Llama 8B and OpenAI GPT-4o Mini deliver the fastest responses.
-    </li>
-    <li>
-      <b>Bigger models = higher quality, slower output:</b> Llama 405B (SambaNova) and similar large models trade off speed for more nuanced reasoning.
-    </li>
-    <li>
-      <b>OpenAI models = consistent speed:</b> Most OpenAI models maintain stable throughput across use cases.
-    </li>
-    <li>
-      <b>Llama 8B (SambaNova) = top performer:</b> Outpaces others in raw tokens/sec.
-    </li>
-  </ul>
-</Note>
-
-<div align="center" style={{ margin: "1.5em 0" }}>
-  <img
-    src="https://i.postimg.cc/4xByytyZ/model-speed.png"
-    alt="Model Speed Comparison: Tokens per second for various AI models"
-    style={{ maxWidth: 520, borderRadius: 16, boxShadow: "0 2px 8px #0002" }}
-  />
-  <div style={{ fontSize: 13, color: "#666", marginTop: 8 }}>
-    <i>Model Speed Comparison &mdash; tokens per second across CAMEL-supported models</i>
-  </div>
-</div>
-
-<Card title="Local Inference: vLLM vs. SGLang" icon="terminal">
-  We compared local inference speeds between <b>vLLM</b> and <b>SGLang</b> on the same hardware. SGLang (meta-llama/Llama-3.2-1B-Instruct) hit <b>220.98 tokens/sec</b>, while vLLM peaked at <b>107.2 tokens/sec</b>.<br/>
-  <br/>
-  <b>Bottom line:</b> For maximum speed in local environments, SGLang currently leads.
-</Card>
-
----
-
 ## Next Steps
 
 You’ve now seen how to connect, configure, and optimize models with CAMEL-AI.
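
The deleted vLLM/SGLang card compares two local engines that both expose OpenAI-compatible HTTP servers, so the same TPS probe can be pointed at each in turn. A sketch under that assumption; the ports and launch commands in the comments are typical defaults, not taken from this repo, so verify them against each project's current docs:

```python
# Sketch: compare local decode speed of two OpenAI-compatible servers.
# Assumed launch commands (typical entrypoints; check current docs):
#   vLLM:   vllm serve meta-llama/Llama-3.2-1B-Instruct --port 8000
#   SGLang: python -m sglang.launch_server --model-path meta-llama/Llama-3.2-1B-Instruct --port 30000
import time

from openai import OpenAI

ENDPOINTS = {  # ports are assumptions; match whatever you launched with
    "vLLM": "http://localhost:8000/v1",
    "SGLang": "http://localhost:30000/v1",
}

for name, base_url in ENDPOINTS.items():
    client = OpenAI(base_url=base_url, api_key="EMPTY")  # local servers accept a dummy key
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.2-1B-Instruct",
        messages=[{"role": "user", "content": "Write a haiku about llamas."}],
        max_tokens=256,
    )
    tps = resp.usage.completion_tokens / (time.perf_counter() - start)
    print(f"{name}: {tps:.1f} tokens/sec")
```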

0 commit comments
