LLM·Dex
Open weights · Alibaba · open · text

Qwen3-32B

Alibaba's mid-size Qwen3 model, a sweet spot for self-hosting on modest hardware budgets.


Quick facts

Released
Apr 2025
Context
128K tokens
Output / 1M
Pricing not published
License
Apache-2.0

About Qwen3-32B

Qwen3-32B is the size sweet spot for self-hosting: it fits on 2× 24GB GPUs after quantization, clears the quality bar for production use, and inherits the Qwen3 line's strong multilingual training.
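The "fits on 2× 24GB GPUs after quantization" claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: the parameter count (about 32.8 billion) is an assumption to verify against the model card, and `headroomFraction` is a hypothetical allowance for KV cache and activations, not a measured figure.

```typescript
// Assumed parameter count for Qwen3-32B (about 32.8 billion); check the model card.
const PARAM_COUNT = 32.8e9;

// Approximate weight memory in GB (10^9 bytes) at a given precision.
function weightMemoryGB(paramCount: number, bitsPerWeight: number): number {
  return (paramCount * bitsPerWeight) / 8 / 1e9;
}

// True when the weights fit in total VRAM after reserving a headroom fraction.
// headroomFraction is a hypothetical allowance for KV cache and activations.
function fitsInVram(
  weightsGB: number,
  totalVramGB: number,
  headroomFraction = 0.2,
): boolean {
  return weightsGB <= totalVramGB * (1 - headroomFraction);
}

const totalVramGB = 2 * 24; // two 24GB GPUs
for (const bits of [16, 8, 4]) {
  const gb = weightMemoryGB(PARAM_COUNT, bits);
  console.log(
    `${bits}-bit: ${gb.toFixed(1)} GB weights, fits: ${fitsInVram(gb, totalVramGB)}`,
  );
}
```

Under these assumptions, 16-bit weights (about 65.6 GB) exceed the 48 GB available, while 8-bit (about 32.8 GB) and 4-bit (about 16.4 GB) weights fit. Real headroom depends on context length, batch size, and the serving stack.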

For teams wanting open-weight quality without committing to the cost of running 70B-class models, Qwen3-32B has become the de facto default in Chinese, multilingual, and cost-sensitive English deployments.

Benchmarks

Published scores from Alibaba's model card or independent leaderboards.

HumanEval: Python coding pass@1 (no score yet)
MMLU: Broad academic knowledge (no score yet)
GPQA: Graduate-level reasoning (no score yet)
SWE-bench: Real software-engineering tasks (no score yet)

Benchmark scores not yet available. We only publish numbers we can source from official model cards or independent leaderboards; see methodology.

Capabilities

Strengths

  • Apache-2.0
  • Fits modest hardware budgets

Tracked weaknesses

  • Trails 72B on hardest tasks

Pricing

Per-million-token rates as published by Alibaba.

Per-token pricing not yet published for Qwen3-32B. Check the official provider site for current tiers.

Call Qwen3-32B from your code

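Drop-in snippet using the OpenAI SDK against an OpenAI-compatible endpoint, such as Alibaba Cloud's hosted API or a self-hosted server. Set your API key and base URL in the environment and run.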

typescript
import OpenAI from "openai";

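const client = new OpenAI({
  // Qwen3-32B is not served by OpenAI itself: supply your provider's key and
  // the base URL of its OpenAI-compatible endpoint.
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: process.env.OPENAI_BASE_URL,
});

const completion = await client.chat.completions.create({
  // Model ids are provider-specific; confirm the exact id your endpoint expects.
  model: "qwen-3-32b",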
  messages: [
    { role: "user", content: "What's the time complexity of quicksort?" },
  ],
});

console.log(completion.choices[0].message.content);
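Because the page positions Qwen3-32B as a self-hosting model, the same chat-completions request can also be sent to a local server. The following is a minimal sketch, not a definitive setup: it assumes an OpenAI-compatible server (for example vLLM) listening at `http://localhost:8000/v1` and registering the weights as `Qwen/Qwen3-32B`. Both values, and the helper names `buildChatRequest` and `askLocal`, are hypothetical and should be adjusted to your deployment.

```typescript
// Hypothetical values: adjust both to match your own deployment.
const LOCAL_BASE_URL = "http://localhost:8000/v1"; // assumed local OpenAI-compatible server
const LOCAL_MODEL_ID = "Qwen/Qwen3-32B"; // assumed id the server registers the weights under

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  max_tokens: number;
}

// Build the JSON body for POST {baseURL}/chat/completions.
function buildChatRequest(prompt: string, maxTokens = 512): ChatRequest {
  return {
    model: LOCAL_MODEL_ID,
    messages: [{ role: "user", content: prompt }],
    max_tokens: maxTokens,
  };
}

// Send the request with the built-in fetch (Node 18+); throws on a non-2xx response.
async function askLocal(prompt: string): Promise<string> {
  const response = await fetch(`${LOCAL_BASE_URL}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest(prompt)),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = (await response.json()) as {
    choices: { message: { content: string } }[];
  };
  return data.choices[0].message.content;
}
```

Calling `await askLocal("What's the time complexity of quicksort?")` returns the model's reply once such a server is running. Keeping the request builder separate from the network call makes the payload easy to inspect or unit-test offline.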

Frequently asked

  • How much does Qwen3-32B cost?
    Alibaba has not published per-token API pricing for Qwen3-32B at the time of writing. Check the official site for current pricing tiers, or compare against alternative models on LLMDex.
  • What is Qwen3-32B's context window?
    Qwen3-32B supports a context window of 128K tokens.
  • Is Qwen3-32B open source?
    Qwen3-32B ships with open weights under the Apache-2.0 license. You can self-host it, fine-tune it, and (subject to the license terms) deploy it commercially.
  • When was Qwen3-32B released?
    Qwen3-32B was released on Apr 29, 2025 by Alibaba.