@@ -20,34 +20,34 @@ The LLM Gateway provides access to 15+ models across major AI providers with sup
 
 ### Anthropic Claude
 
-| Model | Parameter | Description |
-| --------------------- | ---------------------------- | ------------------------------------------------------ |
-| **Claude 4.5 Sonnet** | `claude-sonnet-4-5-20250929` | Claude's best model for complex agents and coding |
-| **Claude 4 Sonnet** | `claude-sonnet-4-20250514` | High-performance model |
-| **Claude 4 Opus** | `claude-opus-4-20250514` | Claude's previous flagship model |
-| **Claude 4.5 Haiku** | `claude-haiku-4-5-20251001` | Claude's fastest and most intelligent Haiku model |
-| **Claude 3.5 Haiku** | `claude-3-5-haiku-20241022` | Claude's fastest model |
-| **Claude 3.0 Haiku** | `claude-3-haiku-20240307` | Fast and compact model for near-instant responsiveness |
+| Model | Parameter | Latency per 10,000 tokens | LMArena Score | Description |
+| --------------------- | ---------------------------- | ------------------------- | ------------- | ------------------------------------------------------ |
+| **Claude 4.5 Sonnet** | `claude-sonnet-4-5-20250929` | 10.1s | 1438 | Claude's best model for complex agents and coding |
+| **Claude 4 Sonnet** | `claude-sonnet-4-20250514` | 7.1s | 1389 | High-performance model |
+| **Claude 4 Opus** | `claude-opus-4-20250514` | 15.4s | 1411 | Claude's previous flagship model |
+| **Claude 4.5 Haiku** | `claude-haiku-4-5-20251001` | 4.6s | 1397 | Claude's fastest and most intelligent Haiku model |
+| **Claude 3.5 Haiku** | `claude-3-5-haiku-20241022` | 5.4s | 1320 | Fast and efficient model with strong performance |
+| **Claude 3.0 Haiku** | `claude-3-haiku-20240307` | 4.8s | 1260 | Fast and compact model for near-instant responsiveness |
 
 ### OpenAI GPT
 
-| Model | Parameter | Description |
-| ---------------- | ------------------- | ---------------------------------------------------------------- |
-| **GPT-5** | `gpt-5` | OpenAI's best model for coding and agentic tasks across domains |
-| **GPT-5 nano** | `gpt-5-nano` | OpenAI's fastest, most cost-efficient version of GPT-5 |
-| **GPT-5 mini** | `gpt-5-mini` | A faster, cost-efficient version of GPT-5 for well-defined tasks |
-| **GPT-4.1** | `gpt-4.1` | OpenAI's smartest non-reasoning model |
-| **ChatGPT-4o** | `chatgpt-4o-latest` | GPT-4o model used in ChatGPT |
-| **gpt-oss-120b** | `gpt-oss-120b` | OpenAI's most powerful open-weight model |
-| **gpt-oss-20b** | `gpt-oss-20b` | Medium-sized open-weight model for low latency |
+| Model | Parameter | Latency per 10,000 tokens | LMArena Score | Description |
+| ---------------- | ------------------- | ------------------------- | ------------- | ---------------------------------------------------------------- |
+| **GPT-5** | `gpt-5` | 18.9s | 1425 | OpenAI's best model for coding and agentic tasks across domains |
+| **GPT-5 nano** | `gpt-5-nano` | 11.2s | 1337 | OpenAI's fastest, most cost-efficient version of GPT-5 |
+| **GPT-5 mini** | `gpt-5-mini` | 21.9s | 1395 | A faster, cost-efficient version of GPT-5 for well-defined tasks |
+| **GPT-4.1** | `gpt-4.1` | 12.6s | 1411 | OpenAI's smartest non-reasoning model |
+| **ChatGPT-4o** | `chatgpt-4o-latest` | 8.0s | 1440 | GPT-4o model used in ChatGPT |
+| **gpt-oss-120b** | `gpt-oss-120b` | 10.5s | 1348 | OpenAI's most powerful open-weight model |
+| **gpt-oss-20b** | `gpt-oss-20b` | 4.2s | 1317 | Medium-sized open-weight model for low latency |
 
 ### Google Gemini
 
-| Model | Parameter | Description |
-| ------------------------- | ----------------------- | ------------------------------------------------------------------------------------- |
-| **Gemini 2.5 Pro** | `gemini-2.5-pro` | Gemini's state-of-the-art thinking model, capable of reasoning over complex problems |
-| **Gemini 2.5 Flash** | `gemini-2.5-flash` | Gemini's best model in terms of price-performance, offering well-rounded capabilities |
-| **Gemini 2.5 Flash-Lite** | `gemini-2.5-flash-lite` | Gemini's fastest flash model optimized for cost-efficiency and high throughput |
+| Model | Parameter | Latency per 10,000 tokens | LMArena Score | Description |
+| ------------------------- | ----------------------- | ------------------------- | ------------- | ------------------------------------------------------------------------------------- |
+| **Gemini 2.5 Pro** | `gemini-2.5-pro` | 13.9s | 1451 | Gemini's state-of-the-art thinking model, capable of reasoning over complex problems |
+| **Gemini 2.5 Flash** | `gemini-2.5-flash` | 8.3s | 1404 | Gemini's best model in terms of price-performance, offering well-rounded capabilities |
+| **Gemini 2.5 Flash-Lite** | `gemini-2.5-flash-lite` | 1.6s | 1374 | Gemini's fastest flash model optimized for cost-efficiency and high throughput |
 
 Unsure which model to choose?
 