add latency + LMArena score stats

dan-ince-aai · dan-ince-aai · commit f634782cf863 · 2025-10-24T16:06:52.000+01:00
diff --git a/fern/pages/07-llm-gateway/llm-gateway.mdx b/fern/pages/07-llm-gateway/llm-gateway.mdx
@@ -20,34 +20,34 @@ The LLM Gateway provides access to 15+ models across major AI providers with sup
 
 ### Anthropic Claude
 
-| Model                 | Parameter                    | Description                                            |
-| --------------------- | ---------------------------- | ------------------------------------------------------ |
-| **Claude 4.5 Sonnet** | `claude-sonnet-4-5-20250929` | Claude's best model for complex agents and coding      |
-| **Claude 4 Sonnet**   | `claude-sonnet-4-20250514`   | High-performance model                                 |
-| **Claude 4 Opus**     | `claude-opus-4-20250514`     | Claude's previous flagship model                       |
-| **Claude 4.5 Haiku**  | `claude-haiku-4-5-20251001`  | Claude's fastest and most intelligent Haiku model      |
-| **Claude 3.5 Haiku**  | `claude-3-5-haiku-20241022`  | Claude's fastest model                                 |
-| **Claude 3.0 Haiku**  | `claude-3-haiku-20240307`    | Fast and compact model for near-instant responsiveness |
+| Model                 | Parameter                    | Latency per 10,000 tokens | LMArena Score | Description                                            |
+| --------------------- | ---------------------------- | ------------------------- | ------------- | ------------------------------------------------------ |
+| **Claude 4.5 Sonnet** | `claude-sonnet-4-5-20250929` | 10.1s                     | 1438          | Claude's best model for complex agents and coding      |
+| **Claude 4 Sonnet**   | `claude-sonnet-4-20250514`   | 7.1s                      | 1389          | High-performance model                                 |
+| **Claude 4 Opus**     | `claude-opus-4-20250514`     | 15.4s                     | 1411          | Claude's previous flagship model                       |
+| **Claude 4.5 Haiku**  | `claude-haiku-4-5-20251001`  | 4.6s                      | 1397          | Claude's fastest and most intelligent Haiku model      |
+| **Claude 3.5 Haiku**  | `claude-3-5-haiku-20241022`  | 5.4s                      | 1320          | Fast and efficient model with strong performance       |
+| **Claude 3.0 Haiku**  | `claude-3-haiku-20240307`    | 4.8s                      | 1260          | Fast and compact model for near-instant responsiveness |
 
 ### OpenAI GPT
 
-| Model            | Parameter           | Description                                                      |
-| ---------------- | ------------------- | ---------------------------------------------------------------- |
-| **GPT-5**        | `gpt-5`             | OpenAI's best model for coding and agentic tasks across domains  |
-| **GPT-5 nano**   | `gpt-5-nano`        | OpenAI's fastest, most cost-efficient version of GPT-5           |
-| **GPT-5 mini**   | `gpt-5-mini`        | A faster, cost-efficient version of GPT-5 for well-defined tasks |
-| **GPT-4.1**      | `gpt-4.1`           | OpenAI's smartest non-reasoning model                            |
-| **ChatGPT-4o**   | `chatgpt-4o-latest` | GPT-4o model used in ChatGPT                                     |
-| **gpt-oss-120b** | `gpt-oss-120b`      | OpenAI's most powerful open-weight model                         |
-| **gpt-oss-20b**  | `gpt-oss-20b`       | Medium-sized open-weight model for low latency                   |
+| Model            | Parameter           | Latency per 10,000 tokens | LMArena Score | Description                                                      |
+| ---------------- | ------------------- | ------------------------- | ------------- | ---------------------------------------------------------------- |
+| **GPT-5**        | `gpt-5`             | 18.9s                     | 1425          | OpenAI's best model for coding and agentic tasks across domains  |
+| **GPT-5 nano**   | `gpt-5-nano`        | 11.2s                     | 1337          | OpenAI's fastest, most cost-efficient version of GPT-5           |
+| **GPT-5 mini**   | `gpt-5-mini`        | 21.9s                     | 1395          | A faster, cost-efficient version of GPT-5 for well-defined tasks |
+| **GPT-4.1**      | `gpt-4.1`           | 12.6s                     | 1411          | OpenAI's smartest non-reasoning model                            |
+| **ChatGPT-4o**   | `chatgpt-4o-latest` | 8.0s                      | 1440          | GPT-4o model used in ChatGPT                                     |
+| **gpt-oss-120b** | `gpt-oss-120b`      | 10.5s                     | 1348          | OpenAI's most powerful open-weight model                         |
+| **gpt-oss-20b**  | `gpt-oss-20b`       | 4.2s                      | 1317          | Medium-sized open-weight model for low latency                   |
 
 ### Google Gemini
 
-| Model                     | Parameter               | Description                                                                           |
-| ------------------------- | ----------------------- | ------------------------------------------------------------------------------------- |
-| **Gemini 2.5 Pro**        | `gemini-2.5-pro`        | Gemini's state-of-the-art thinking model, capable of reasoning over complex problems  |
-| **Gemini 2.5 Flash**      | `gemini-2.5-flash`      | Gemini's best model in terms of price-performance, offering well-rounded capabilities |
-| **Gemini 2.5 Flash-Lite** | `gemini-2.5-flash-lite` | Gemini's fastest flash model optimized for cost-efficiency and high throughput        |
+| Model                     | Parameter               | Latency per 10,000 tokens | LMArena Score | Description                                                                           |
+| ------------------------- | ----------------------- | ------------------------- | ------------- | ------------------------------------------------------------------------------------- |
+| **Gemini 2.5 Pro**        | `gemini-2.5-pro`        | 13.9s                     | 1451          | Gemini's state-of-the-art thinking model, capable of reasoning over complex problems  |
+| **Gemini 2.5 Flash**      | `gemini-2.5-flash`      | 8.3s                      | 1404          | Gemini's best model in terms of price-performance, offering well-rounded capabilities |
+| **Gemini 2.5 Flash-Lite** | `gemini-2.5-flash-lite` | 1.6s                      | 1374          | Gemini's fastest flash model optimized for cost-efficiency and high throughput        |
 
 Unsure which model to choose?