Open weights · Meta · small · text
Llama 4 8B
Meta's small Llama 4, built for on-device and edge inference.
Quick facts
- Released: Apr 2025
- Context: 128K tokens
- Output / 1M: Pricing not published
- License: Llama 4 Community License
About Llama 4 8B
Llama 4 8B is the size class that fits on consumer hardware and edge devices after quantization. It's the practical default for laptop, mobile, and small-VM inference, with the broadest tooling support (Ollama, llama.cpp, LM Studio).
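To make the "fits after quantization" claim concrete, here is a rough back-of-the-envelope sketch of weight memory at different precisions. It assumes a nominal 8 billion parameters (taken from the model name) and counts weights only; KV cache, activations, and file-format overhead are ignored, so real usage will be higher. The function name `weightMemoryGiB` is illustrative, not part of any library.

```typescript
// Rough weight-memory estimate: parameters x bits per weight / 8 = bytes.
// Assumption: weights only; KV cache, activations, and format overhead are ignored.
const GIB = 1024 ** 3;

function weightMemoryGiB(paramCount: number, bitsPerWeight: number): number {
  const bytes = (paramCount * bitsPerWeight) / 8;
  return bytes / GIB;
}

// Assumption: nominal 8B parameters, inferred from the model name.
const PARAMS = 8e9;

for (const bits of [16, 8, 4]) {
  console.log(`${bits}-bit: ~${weightMemoryGiB(PARAMS, bits).toFixed(1)} GiB`);
}
```

Under these assumptions the weights need roughly 15 GiB at 16-bit and under 4 GiB at 4-bit, which is why 4-bit builds are the usual choice for laptops and small VMs.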
Benchmarks
Published scores from Meta's model card or independent leaderboards. We do not publish numbers we cannot source; see methodology.
- HumanEval (Python coding pass@1): no score published
- MMLU (broad academic knowledge): no score published
- GPQA (graduate-level reasoning): no score published
- SWE-bench (real software-engineering tasks): no score published
Benchmark scores are not yet available. We only publish numbers we can source from official model cards or independent leaderboards; see methodology.
Capabilities
Strengths
- Runs on consumer laptops
- Broad tooling support
- Permissive for most uses, though under a custom license rather than Apache 2.0
Tracked weaknesses
- Quality limited by size
- Custom license
Pricing
Per-million-token rates as published by Meta.
Per-token pricing not yet published for Llama 4 8B. Check the official provider site for current tiers.
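Once a provider does publish per-million-token rates, the cost of a request can be estimated with simple arithmetic. The sketch below is illustrative only: the `Rates` shape, the `estimateCostUSD` helper, and the numbers in `placeholderRates` are made up for the example and are not Meta's or any provider's prices.

```typescript
// Per-million-token pricing: cost = (tokens / 1,000,000) x rate, summed for input and output.
interface Rates {
  inputPerMillionUSD: number;
  outputPerMillionUSD: number;
}

function estimateCostUSD(inputTokens: number, outputTokens: number, rates: Rates): number {
  const inputCost = (inputTokens / 1_000_000) * rates.inputPerMillionUSD;
  const outputCost = (outputTokens / 1_000_000) * rates.outputPerMillionUSD;
  return inputCost + outputCost;
}

// PLACEHOLDER rates for illustration only; substitute your provider's published tiers.
const placeholderRates: Rates = { inputPerMillionUSD: 1, outputPerMillionUSD: 2 };

console.log(estimateCostUSD(500_000, 250_000, placeholderRates));
```

Self-hosted deployments have no per-token fee, so this only applies when calling the model through a hosted API.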
Call Llama 4 8B from your code
Drop-in snippet using the OpenAI Node SDK against an OpenAI-compatible endpoint. Set your provider's API key and base URL in the environment and run.
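typescript
import OpenAI from "openai";

// OpenAI itself does not serve this model, so point the client at an
// OpenAI-compatible provider or local server via its key and base URL.
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  // If OPENAI_BASE_URL is unset, the SDK falls back to OpenAI's own endpoint.
  baseURL: process.env.OPENAI_BASE_URL,
});

// Top-level await requires an ES module context.
const completion = await client.chat.completions.create({
  // The exact model identifier depends on your provider.
  model: "llama-4-8b",
  messages: [
    { role: "user", content: "What's the time complexity of quicksort?" },
  ],
});

console.log(completion.choices[0].message.content);
Best for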
Tasks where Llama 4 8B ranks among LLMDex's top picks.
Frequently asked
How much does Llama 4 8B cost?
Meta has not published per-token API pricing for Llama 4 8B at the time of writing. Check the official site for current pricing tiers, or compare against alternative models on LLMDex.
What is Llama 4 8B's context window?
Llama 4 8B supports a context window of 128K tokens.
Is Llama 4 8B open source?
Llama 4 8B ships with open weights under the Llama 4 Community License. You can self-host it, fine-tune it, and (subject to the license terms) deploy it commercially.
When was Llama 4 8B released?
Llama 4 8B was released on Apr 5, 2025 by Meta.