Commit f634782

add latency + LMarena score stats
1 parent 812d33c commit f634782

1 file changed: +22 -22 lines changed
fern/pages/07-llm-gateway/llm-gateway.mdx

Lines changed: 22 additions & 22 deletions
@@ -20,34 +20,34 @@ The LLM Gateway provides access to 15+ models across major AI providers with sup
 
 ### Anthropic Claude
 
-| Model | Parameter | Description |
-| --------------------- | ---------------------------- | ------------------------------------------------------ |
-| **Claude 4.5 Sonnet** | `claude-sonnet-4-5-20250929` | Claude's best model for complex agents and coding |
-| **Claude 4 Sonnet** | `claude-sonnet-4-20250514` | High-performance model |
-| **Claude 4 Opus** | `claude-opus-4-20250514` | Claude's previous flagship model |
-| **Claude 4.5 Haiku** | `claude-haiku-4-5-20251001` | Claude's fastest and most intelligent Haiku model |
-| **Claude 3.5 Haiku** | `claude-3-5-haiku-20241022` | Claude's fastest model |
-| **Claude 3.0 Haiku** | `claude-3-haiku-20240307` | Fast and compact model for near-instant responsiveness |
+| Model | Parameter | Latency per 10,000 tokens | LMArena Score | Description |
+| --------------------- | ---------------------------- | ------------------------- | ------------- | ------------------------------------------------------ |
+| **Claude 4.5 Sonnet** | `claude-sonnet-4-5-20250929` | 10.1s | 1438 | Claude's best model for complex agents and coding |
+| **Claude 4 Sonnet** | `claude-sonnet-4-20250514` | 7.1s | 1389 | High-performance model |
+| **Claude 4 Opus** | `claude-opus-4-20250514` | 15.4s | 1411 | Claude's previous flagship model |
+| **Claude 4.5 Haiku** | `claude-haiku-4-5-20251001` | 4.6s | 1397 | Claude's fastest and most intelligent Haiku model |
+| **Claude 3.5 Haiku** | `claude-3-5-haiku-20241022` | 5.4s | 1320 | Fast and efficient model with strong performance |
+| **Claude 3.0 Haiku** | `claude-3-haiku-20240307` | 4.8s | 1260 | Fast and compact model for near-instant responsiveness |
 
 ### OpenAI GPT
 
-| Model | Parameter | Description |
-| ---------------- | ------------------- | ---------------------------------------------------------------- |
-| **GPT-5** | `gpt-5` | OpenAI's best model for coding and agentic tasks across domains |
-| **GPT-5 nano** | `gpt-5-nano` | OpenAI's fastest, most cost-efficient version of GPT-5 |
-| **GPT-5 mini** | `gpt-5-mini` | A faster, cost-efficient version of GPT-5 for well-defined tasks |
-| **GPT-4.1** | `gpt-4.1` | OpenAI's smartest non-reasoning model |
-| **ChatGPT-4o** | `chatgpt-4o-latest` | GPT-4o model used in ChatGPT |
-| **gpt-oss-120b** | `gpt-oss-120b` | OpenAI's most powerful open-weight model |
-| **gpt-oss-20b** | `gpt-oss-20b` | Medium-sized open-weight model for low latency |
+| Model | Parameter | Latency per 10,000 tokens | LMArena Score | Description |
+| ---------------- | ------------------- | ------------------------- | ------------- | ---------------------------------------------------------------- |
+| **GPT-5** | `gpt-5` | 18.9s | 1425 | OpenAI's best model for coding and agentic tasks across domains |
+| **GPT-5 nano** | `gpt-5-nano` | 11.2s | 1337 | OpenAI's fastest, most cost-efficient version of GPT-5 |
+| **GPT-5 mini** | `gpt-5-mini` | 21.9s | 1395 | A faster, cost-efficient version of GPT-5 for well-defined tasks |
+| **GPT-4.1** | `gpt-4.1` | 12.6s | 1411 | OpenAI's smartest non-reasoning model |
+| **ChatGPT-4o** | `chatgpt-4o-latest` | 8.0s | 1440 | GPT-4o model used in ChatGPT |
+| **gpt-oss-120b** | `gpt-oss-120b` | 10.5s | 1348 | OpenAI's most powerful open-weight model |
+| **gpt-oss-20b** | `gpt-oss-20b` | 4.2s | 1317 | Medium-sized open-weight model for low latency |
 
 ### Google Gemini
 
-| Model | Parameter | Description |
-| ------------------------- | ----------------------- | ------------------------------------------------------------------------------------- |
-| **Gemini 2.5 Pro** | `gemini-2.5-pro` | Gemini's state-of-the-art thinking model, capable of reasoning over complex problems |
-| **Gemini 2.5 Flash** | `gemini-2.5-flash` | Gemini's best model in terms of price-performance, offering well-rounded capabilities |
-| **Gemini 2.5 Flash-Lite** | `gemini-2.5-flash-lite` | Gemini's fastest flash model optimized for cost-efficiency and high throughput |
+| Model | Parameter | Latency per 10,000 tokens | LMArena Score | Description |
+| ------------------------- | ----------------------- | ------------------------- | ------------- | ------------------------------------------------------------------------------------- |
+| **Gemini 2.5 Pro** | `gemini-2.5-pro` | 13.9s | 1451 | Gemini's state-of-the-art thinking model, capable of reasoning over complex problems |
+| **Gemini 2.5 Flash** | `gemini-2.5-flash` | 8.3s | 1404 | Gemini's best model in terms of price-performance, offering well-rounded capabilities |
+| **Gemini 2.5 Flash-Lite** | `gemini-2.5-flash-lite` | 1.6s | 1374 | Gemini's fastest flash model optimized for cost-efficiency and high throughput |
 
 Unsure which model to choose?
 
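The two columns added in this commit can be read together as a trade-off: lower latency versus higher LMArena score. The sketch below is illustrative only and is not part of the commit or of any LLM Gateway API. It copies the `Parameter`, latency, and score values from the added rows, and the names `ModelStats`, `MODELS`, and `best_within_budget` are hypothetical. It assumes that the preferred model is the highest-scoring one whose listed latency fits a given budget.

```python
from typing import NamedTuple, Optional, Sequence


class ModelStats(NamedTuple):
    """One table row: values copied from the rows added in this diff."""
    parameter: str
    latency_s: float  # "Latency per 10,000 tokens" column, in seconds
    lmarena: int      # "LMArena Score" column


# Hypothetical in-memory copy of the three tables.
MODELS = [
    ModelStats("claude-sonnet-4-5-20250929", 10.1, 1438),
    ModelStats("claude-sonnet-4-20250514", 7.1, 1389),
    ModelStats("claude-opus-4-20250514", 15.4, 1411),
    ModelStats("claude-haiku-4-5-20251001", 4.6, 1397),
    ModelStats("claude-3-5-haiku-20241022", 5.4, 1320),
    ModelStats("claude-3-haiku-20240307", 4.8, 1260),
    ModelStats("gpt-5", 18.9, 1425),
    ModelStats("gpt-5-nano", 11.2, 1337),
    ModelStats("gpt-5-mini", 21.9, 1395),
    ModelStats("gpt-4.1", 12.6, 1411),
    ModelStats("chatgpt-4o-latest", 8.0, 1440),
    ModelStats("gpt-oss-120b", 10.5, 1348),
    ModelStats("gpt-oss-20b", 4.2, 1317),
    ModelStats("gemini-2.5-pro", 13.9, 1451),
    ModelStats("gemini-2.5-flash", 8.3, 1404),
    ModelStats("gemini-2.5-flash-lite", 1.6, 1374),
]


def best_within_budget(
    models: Sequence[ModelStats], max_latency_s: float
) -> Optional[ModelStats]:
    """Return the highest-scoring model at or under the latency budget.

    Ties on score are broken in favor of the lower latency.
    Returns None when no model fits the budget.
    """
    eligible = [m for m in models if m.latency_s <= max_latency_s]
    if not eligible:
        return None
    return max(eligible, key=lambda m: (m.lmarena, -m.latency_s))


if __name__ == "__main__":
    pick = best_within_budget(MODELS, max_latency_s=5.0)
    print(pick.parameter if pick else "no model fits")  # claude-haiku-4-5-20251001
```

With a 10-second budget the same function returns `chatgpt-4o-latest`. Both figures are snapshots from this commit, so a real selection would need to refresh them rather than hard-code the list.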
