# All LLMs LLMDex tracks
80 models from 14 providers, sorted by recency within each provider. Click any row for the full spec sheet. Output prices are USD per million output tokens; MMLU scores are listed where available.
## OpenAI (11 models)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **GPT-5.5** · OpenAI's mid-cycle GPT-5 refresh, with improved reasoning, tool use, and multimodal grounding over the 2025 launch. | Proprietary | 400K | – | – |
| **o4** · OpenAI's late-2025 standalone reasoning model, an evolution of o3 with deeper chain-of-thought and stronger multimodal reasoning. | Proprietary | 200K | – | – |
| **GPT-5** · OpenAI's unified flagship, combining GPT-line breadth with built-in reasoning and replacing both GPT-4o and the o-series for most users. | Proprietary | 400K | 91.4 | $10.00 |
| **GPT-5 mini** · GPT-5's mid-tier sibling: most of the quality at a fraction of the price, ideal for high-volume production workloads. | Proprietary | 400K | – | $2.00 |
| **GPT-5 nano** · OpenAI's smallest GPT-5 variant, built for ultra-low-cost classification, routing, and high-volume inference. | Proprietary | 400K | – | $0.40 |
| **o4-mini** · The smaller, faster, cheaper member of OpenAI's reasoning-model family, with a strong latency/cost balance on hard tasks. | Proprietary | 200K | – | $4.40 |
| **o3** · OpenAI's 2025 flagship reasoning model; it set the bar for hard math, GPQA, and agent benchmarks that year. | Proprietary | 200K | – | $8.00 |
| **GPT-4.1** · OpenAI's 2025 GPT-4.x refresh: long-context, fast, and still widely deployed even after GPT-5. | Proprietary | 1M | 86.2 | $8.00 |
| **o3-mini** · A smaller, faster reasoning model, popular as the budget thinking-model option throughout 2025. | Proprietary | 200K | – | $4.40 |
| **GPT-4o mini** · GPT-4o's small sibling; it defined the cheap mid-tier slot for most of 2024–2025. | Proprietary | 128K | 82.0 | $0.60 |
| **GPT-4o** · OpenAI's first natively multimodal model: voice, vision, and text in one network. | Proprietary | 128K | 88.7 | $10.00 |
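The output-price column can be turned into a rough per-request estimate. A minimal sketch, assuming that column is USD per one million output tokens (input pricing is not shown in these tables, so only the output side is estimated; prices copied from the rows above):

```python
def output_cost_usd(output_tokens: int, out_price_per_1m: float) -> float:
    """Estimated output-side cost of one request, in USD."""
    return output_tokens / 1_000_000 * out_price_per_1m

# Example prices from the table above (USD per 1M output tokens).
GPT5_OUT = 10.00
GPT5_MINI_OUT = 2.00

# Cost of a 2,000-token answer:
print(f"GPT-5:      ${output_cost_usd(2_000, GPT5_OUT):.4f}")       # $0.0200
print(f"GPT-5 mini: ${output_cost_usd(2_000, GPT5_MINI_OUT):.4f}")  # $0.0040
```

Real bills also include input tokens (and, for reasoning models, hidden thinking tokens billed as output), so treat this as a lower bound.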
## Anthropic (8 models)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **Claude Opus 4.7** · Anthropic's mid-2026 flagship, ahead on SWE-bench, agent reliability, and writing quality. | Proprietary | 500K | – | – |
| **Claude Sonnet 4.6** · Anthropic's mid-tier 4.6 release, the workhorse behind most production Anthropic deployments. | Proprietary | 200K | – | – |
| **Claude Haiku 4** · Anthropic's smallest 4-tier model, fast and cheap with the family's signature tone. | Proprietary | 200K | – | – |
| **Claude Opus 4** · Anthropic's mid-2025 flagship, the model that established Claude's lead on coding agents and SWE-bench. | Proprietary | 200K | – | $75.00 |
| **Claude Sonnet 4** · The mid-2025 mid-tier Claude, predecessor workhorse to Sonnet 4.6 and still common in production. | Proprietary | 200K | – | $15.00 |
| **Claude 3.7 Sonnet** · The first Claude with an extended-thinking mode; it ushered the reasoning-model paradigm into Anthropic's lineup. | Proprietary | 200K | – | $15.00 |
| **Claude 3.5 Haiku** · The late-2024 small Claude, fast and cheap with surprisingly strong code quality. | Proprietary | 200K | – | $4.00 |
| **Claude 3.5 Sonnet** · Anthropic's late-2024 mid-tier; it set the bar on coding, agents, and tool use through 2025. | Proprietary | 200K | – | $15.00 |
## Google (8 models)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **Gemini 3 Pro** · Google's late-2025 flagship; it set new benchmarks on long-context, vision, and reasoning at competitive pricing. | Proprietary | 1.0M | 91.8 | – |
| **Gemini 3 Flash** · Google's high-speed, low-cost mid-tier with the same massive context window, popular for high-volume RAG. | Proprietary | 1.0M | – | – |
| **Gemini 2.5 Flash** · The mid-2025 fast tier; it set the bar for cost-efficient long-context generation. | Proprietary | 1.0M | – | $0.30 |
| **Gemini 2.5 Pro** · Google's mid-2025 flagship, the model that brought Gemini decisively back to parity with the OpenAI and Anthropic frontier. | Proprietary | 2.1M | 86.0 | $10.00 |
| **Gemini 2.0 Flash** · The early-2025 fast Gemini, the first model with full 1M-token context at the Flash price point. | Proprietary | 1.0M | – | $0.40 |
| **Gemma 2 2B** · Google's 2B Gemma, built for laptop and phone inference under tight memory budgets. | Open | 8.2K | – | – |
| **Gemma 2 9B** · Google's mid-2024 open-weight 9B: strong quality for its size, friendly license. | Open | 8.2K | 71.3 | – |
| **Gemma 2 27B** · The larger Gemma 2, competitive with Llama 70B on some benchmarks at half the size. | Open | 8.2K | 75.2 | – |
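The Context column matters in practice mainly for one question: will a given document fit? A rough sketch of that check, assuming binary token counts behind the displayed figures (so 128K means 131,072 tokens and 8.2K means 8,192) and the common rule of thumb of roughly 4 characters per token for English text; the reply budget and model entries below are illustrative, not exact tokenizer counts:

```python
# Token capacities inferred from the tables above (binary K/M assumed).
CONTEXTS = {
    "Gemini 3 Pro": 1_048_576,  # shown as 1.0M
    "GPT-4o":       131_072,    # shown as 128K
    "Gemma 2 9B":   8_192,      # shown as 8.2K
}

def fits(model: str, text: str, reply_budget: int = 4_096) -> bool:
    """True if the text plus room for a reply fits the model's window.

    Uses the ~4 chars/token heuristic; a real tokenizer gives exact counts.
    """
    est_tokens = len(text) // 4
    return est_tokens + reply_budget <= CONTEXTS[model]

doc = "x" * 40_000  # ~10,000 estimated tokens
print(fits("Gemma 2 9B", doc))  # False: 10,000 + 4,096 > 8,192
print(fits("GPT-4o", doc))      # True
```

For anything close to the limit, count tokens with the model's actual tokenizer rather than the heuristic; the 4-chars-per-token figure drifts badly on code and non-English text.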
## Meta (7 models)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **Llama 4 405B** · Meta's flagship open-weight model, a sparse MoE design competitive with closed frontier flagships. | Open | 256K | – | – |
| **Llama 4 70B** · Meta's mid-tier Llama 4, the practical workhorse for self-hosted deployments. | Open | 128K | – | – |
| **Llama 4 8B** · Meta's small Llama 4, built for on-device and edge inference. | Open | 128K | – | – |
| **Llama 3.3 70B** · Meta's late-2024 70B refresh, much improved over 3.1 with better instruction-following and tool use. | Open | 128K | 86.0 | – |
| **Llama 3.2 90B Vision** · Meta's first open-weight vision-language model, at 90B parameters. | Open | 128K | – | – |
| **Llama 3.2 3B** · The tiny Llama for mobile and edge; it runs comfortably on a phone after quantization. | Open | 128K | – | – |
| **Llama 3.1 405B** · The first open-weight model to match GPT-4-class quality on standard benchmarks. | Open | 128K | 88.6 | – |
## DeepSeek (3 models)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **DeepSeek-R1** · The first open-weight reasoning model to match o1, and the release that proved RL-from-scratch reasoning training was reproducible. | Open | 128K | – | $2.19 |
| **DeepSeek-V3** · DeepSeek's flagship 671B-parameter MoE: frontier-level quality at a tiny fraction of frontier prices. | Open | 128K | 88.5 | $1.10 |
| **DeepSeek-Coder-V2** · DeepSeek's code-specialized model, strong across a broad set of programming languages and FIM tasks. | Open | 128K | – | – |
## Alibaba (6 models)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **Qwen3-72B** · Alibaba's flagship open-weight Qwen3, strong on multilingual, code, and math tasks; Apache-2.0 licensed. | Open | 128K | 84.0 | – |
| **Qwen3-32B** · Alibaba's mid-size Qwen3, the sweet spot for self-hosting on modest hardware budgets. | Open | 128K | – | – |
| **Qwen2.5-Coder-32B** · An open-weight code specialist, frequently the top open option for self-hosted code completion. | Open | 128K | – | – |
| **Qwen2.5-72B** · The previous-generation Qwen flagship, still widely deployed for stability. | Open | 128K | 86.0 | – |
| **Qwen2.5-7B** · The small Qwen, a practical default for laptop and edge inference. | Open | 128K | – | – |
| **Qwen2-VL-72B** · A top open-weight vision-language model, strong on document understanding and chart analysis. | Open | 128K | – | – |
## Mistral (9 models)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **Codestral 2** · Mistral's code-specialized model, with fast inline completion and strong fill-in-the-middle support. | Open | 256K | – | $0.90 |
| **Mistral Medium** · Mistral's mid-tier balanced model, production-ready at competitive pricing. | Proprietary | 128K | – | $2.00 |
| **Pixtral Large** · Mistral's 124B vision-language model, strong on dense-text document tasks. | Open | 128K | – | – |
| **Ministral 8B** · Mistral's 8B edge model, designed specifically for on-device and on-prem deployment. | Open | 128K | – | – |
| **Mistral Small** · Mistral's small-tier API model, fast and cheap for routing and high-volume tasks. | Open | 128K | – | $0.60 |
| **Pixtral 12B** · Mistral's 12B multimodal model, the company's first vision-capable Apache-2.0 release. | Open | 128K | – | – |
| **Mistral Large 2** · Mistral's flagship API model, strong on code and reasoning, with EU-friendly hosting. | Open | 128K | 84.0 | $6.00 |
| **Mistral Nemo** · A 12B model co-built with Nvidia, with strong small-model multilingual performance. | Open | 128K | – | – |
| **Mixtral 8×22B** · Mistral's largest open-weight MoE, Apache-2.0 and still widely deployed. | Open | 64K | 77.8 | – |
## xAI (3 models)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **Grok 4** · xAI's mid-2025 flagship: top scores on Humanity's Last Exam at launch, with native real-time X integration. | Proprietary | 256K | – | $15.00 |
| **Grok 3** · xAI's first frontier-tier release, which established the company's Colossus-trained model line. | Proprietary | 128K | – | $15.00 |
| **Grok 2** · xAI's first widely available model, free on X for Premium subscribers and competitive with the GPT-4 mid-tier of its era. | Proprietary | 128K | – | – |
## Microsoft (2 models)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **Phi-4** · Microsoft's 14B model, with exceptional quality-per-parameter via curated synthetic training data. | Open | 16K | 84.8 | – |
| **Phi-3.5 Medium** · The 14B Phi-3.5, predecessor to Phi-4 with strong benchmark efficiency for its size. | Open | 128K | 78.9 | – |
## Cohere (5 models)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **Aya Expanse 32B** · Cohere's massively multilingual open-weight model, strong across 23 languages. | Open | 128K | – | – |
| **Aya Expanse 8B** · The smaller Aya Expanse: multilingual on a single-GPU budget. | Open | 8.2K | – | – |
| **Command R+ (08-2024)** · Cohere's flagship, optimized for RAG and tool use in enterprise settings. | Open | 128K | 75.7 | $10.00 |
| **Command R (08-2024)** · The refreshed Command R, with improved tool use, JSON mode, and Asian-language support. | Open | 128K | – | $0.60 |
| **Command R** · Cohere's mid-tier RAG-optimized model, affordable and reliable on retrieval workloads. | Open | 128K | – | $1.50 |
## AI21 (2 models)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **Jamba 1.5 Large** · AI21's hybrid SSM-Transformer with a 256K context window, strong on long-document tasks. | Open | 256K | – | $8.00 |
| **Jamba 1.5 Mini** · The smaller hybrid SSM-Transformer, fast and efficient at long contexts. | Open | 256K | – | $0.40 |
## Perplexity (2 models)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **Sonar Pro** · Perplexity's premium answer model: deeper search, more sources, longer answers. | Proprietary | 200K | – | $15.00 |
| **Sonar Large** · Perplexity's flagship answer-engine model with built-in web-search grounding. | Proprietary | 127K | – | $1.00 |
## Nvidia (1 model)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **Nemotron-4 340B** · Nvidia's 340B open-weight model, useful as a synthetic-data generator and benchmark. | Open | 4.1K | 81.1 | – |
## Other (13 models)

| Model | Type | Context | MMLU | Out $/1M |
|---|---|---|---|---|
| **GLM-4.5** · Zhipu AI's flagship, a strong open-weight Chinese model with broad commercial deployment. | Open | 128K | – | – |
| **Reka Flash 3** · Reka's 21B reasoning model, Apache-2.0 with extended-thinking support. | Open | 32K | – | – |
| **Granite 3.1 8B** · IBM's enterprise-tuned open-weight model, Apache-2.0 with extensive code training. | Open | 128K | – | – |
| **Granite 3.1 2B** · IBM's smallest Granite: Apache-2.0, edge-friendly, enterprise-supported. | Open | 128K | – | – |
| **Falcon 3 10B** · TII's 2024 open-weight refresh: Apache-2.0, multilingual, and competitive at 10B size. | Open | 32K | – | – |
| **Amazon Nova Pro** · Amazon's mid-tier multimodal model, with competitive pricing and deep AWS integration. | Proprietary | 300K | – | $3.20 |
| **Amazon Nova Lite** · Amazon's cheap multimodal tier, just pennies per million tokens for basic tasks. | Proprietary | 300K | – | $0.24 |
| **Amazon Nova Micro** · Amazon's text-only ultra-cheap tier, best for high-volume routing and classification. | Proprietary | 128K | – | $0.14 |
| **OLMo 2 13B** · Allen AI's fully open language model: Apache-2.0, with a reproducible training pipeline. | Open | 4.1K | – | – |
| **SmolLM2 1.7B** · Hugging Face's tiny model line; it punches above its weight on a strict on-device budget. | Open | 8.2K | – | – |
| **Yi-Lightning** · 01.AI's API-tier, Chinese-leaning model, strong on Chinese benchmarks at competitive pricing. | Proprietary | 16K | – | $0.14 |
| **Molmo 72B** · Allen AI's vision-language model, open everything: weights, data, and training code. | Open | 4K | – | – |
| **DBRX** · Databricks' 132B MoE, a notable 2024 open-weight release tuned for enterprise. | Open | 32K | 73.7 | – |
Don't see a model you expected? Email add@llmdex.com or open an issue; new launches are usually added within seven days. Read our methodology.