LLM·Dex
Open weights · Meta

Llama 3.2 3B

Tiny Llama for mobile and edge, runs comfortably on a phone after quantization.


Quick facts

  • Released: Sep 2024
  • Context: 128K tokens
  • Output / 1M: Pricing not published
  • License: Llama 3 Community License

About Llama 3.2 3B

Llama 3.2 3B is the smallest member of the Llama 3.2 family, designed specifically for on-device inference on phones and embedded systems. It has strong tooling support via llama.cpp and MLX, and it was one of the first small models to support a full 128K-token context window.

For mobile chat assistants and laptop developer tools, Llama 3.2 3B remains a popular default, particularly when the next step up in size, Phi-4, doesn't fit the deployment constraints.
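That 128K-token window is easy to overshoot when stuffing documents into an on-device prompt. A minimal pre-flight budget check is sketched below; the 4-characters-per-token ratio is a rough heuristic for English text, not Llama's actual BPE tokenizer, and the helper names are illustrative:

```typescript
// Assumed heuristic: ~4 characters per token for English prose.
// For exact counts, run the model's real tokenizer instead.
const CONTEXT_WINDOW = 128_000;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Returns true if the prompt, plus room reserved for the reply,
// fits inside the 128K-token context window.
function fitsInContext(prompt: string, reservedForOutput = 1_024): boolean {
  return estimateTokens(prompt) + reservedForOutput <= CONTEXT_WINDOW;
}

console.log(fitsInContext("What's the time complexity of quicksort?")); // true
console.log(fitsInContext("x".repeat(600_000))); // false: ~150K estimated tokens
```

A fixed output reservation keeps the check conservative; tighten it if your app caps `max_tokens` lower.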

Benchmarks

Published scores from Meta's model card or independent leaderboards. We do not publish numbers we cannot source; see methodology.

  • HumanEval: Python coding pass@1
  • MMLU: Broad academic knowledge
  • GPQA: Graduate-level reasoning
  • SWE-bench: Real software-engineering tasks
Benchmark scores are not yet available. We only publish numbers we can source from official model cards or independent leaderboards; see methodology.

Capabilities

Strengths

  • Runs on phones
  • 128K context for its size

Tracked weaknesses

  • Output quality limited by its small size

Pricing

Per-million-token rates as published by Meta.

Per-token pricing not yet published for Llama 3.2 3B. Check the official provider site for current tiers.

Call Llama 3.2 3B from your code

Drop-in snippet using the OpenAI SDK against an OpenAI-compatible provider. Set your API key in the environment and run.

typescript
import OpenAI from "openai";

const client = new OpenAI({
  // Use OPENAI_API_KEY for OpenAI, or your provider's key + baseURL.
  apiKey: process.env.OPENAI_API_KEY,
});

const completion = await client.chat.completions.create({
  // Model IDs vary by provider; check your provider's catalog.
  model: "llama-3-2-3b",
  messages: [
    { role: "user", content: "What's the time complexity of quicksort?" },
  ],
});

console.log(completion.choices[0].message.content);

Frequently asked

  • How much does Llama 3.2 3B cost?
    Meta has not published per-token API pricing for Llama 3.2 3B at the time of writing. Check the official site for current pricing tiers, or compare against alternative models on LLMDex.
  • What is Llama 3.2 3B's context window?
    Llama 3.2 3B supports a context window of 128K tokens.
  • Is Llama 3.2 3B open source?
    Llama 3.2 3B ships with open weights under the Llama 3 Community License. You can self-host it, fine-tune it, and (subject to the license terms) deploy it commercially.
  • When was Llama 3.2 3B released?
    Llama 3.2 3B was released on Sep 25, 2024 by Meta.
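Self-hosting the open weights typically means serving them behind llama.cpp's `llama-server`, which exposes an OpenAI-style chat endpoint. A minimal sketch follows; the local URL and the model name are assumptions for a typical setup, so adjust them to match your server:

```typescript
// Assumed local llama.cpp server with its OpenAI-compatible API enabled.
const LOCAL_URL = "http://localhost:8080/v1/chat/completions";

// Build the request body separately so it can be inspected or logged.
function buildChatRequest(prompt: string) {
  return {
    model: "llama-3.2-3b-instruct", // whatever name your server registered
    messages: [{ role: "user" as const, content: prompt }],
  };
}

async function chatLocal(prompt: string): Promise<string> {
  const res = await fetch(LOCAL_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest(prompt)),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Because the endpoint mirrors the hosted API shape, the earlier OpenAI SDK snippet also works against it by pointing the client's `baseURL` at the local server.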