HappycapyGuide

By Connie · Last reviewed: April 2026 — pricing & tools verified · This article contains affiliate links. We may earn a commission at no extra cost to you if you sign up through our links.


Mistral Small 4: The Open-Source Model That Unifies Reasoning, Vision and Coding

March 16, 2026  ·  Happycapy Editorial

TL;DR
Mistral released Small 4 on March 16, 2026 — a 119B Mixture-of-Experts model under Apache 2.0. It replaces four separate Mistral models (Magistral for reasoning, Pixtral for vision, Devstral for coding, Small 3 for general chat) with a single checkpoint. Only 6.5B parameters activate per token. 256K context. 40% lower latency than Small 3. Self-host for free.

Open-source AI just had a milestone release. Mistral Small 4 is the first model in its performance class to unify deep reasoning, vision understanding, and agentic coding under a single Apache 2.0 license — with no restrictions on commercial use, no royalties, and no vendor lock-in.

The model's architecture is a 119-billion-parameter Mixture-of-Experts (MoE) system. But the headline figure is misleading in the best way: only 6.5 billion parameters activate per token, routed through 4 of the model's 128 experts per inference. The result is GPT-4o-class performance at a fraction of the compute cost.
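To put those figures in perspective, here is a quick back-of-the-envelope check using only the numbers above:

```python
# Back-of-the-envelope check of the sparsity figures quoted above.
total_params = 119e9     # total MoE parameters
active_params = 6.5e9    # parameters activated per token
experts_total = 128
experts_active = 4

active_fraction = active_params / total_params    # share of weights used per token
expert_fraction = experts_active / experts_total  # share of experts routed per token

print(f"~{active_fraction:.1%} of weights active per token")  # ~5.5%
print(f"{expert_fraction:.1%} of experts active per token")   # 3.1%
```

The gap between the two fractions hints that part of the active budget sits in always-on shared layers (attention, embeddings) rather than in the routed experts; that split is an inference from the arithmetic, not a figure Mistral has published.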

What Makes It Different

Previously, teams using Mistral had to maintain separate models for different tasks: Magistral for reasoning, Pixtral for vision analysis, Devstral for code generation, and Small 3 for general chat. Small 4 collapses all four into one unified checkpoint. The model exposes a reasoning_effort parameter that lets developers dial reasoning depth from fast (no chain-of-thought) to deep (extended internal monologue) without switching models.

This matters for agentic workflows. An agent that needs to read a screenshot, reason about it, and write code can now do all three in a single model call, with full context continuity across modalities.
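The article doesn't publish the request schema, but as a sketch of what that single call might look like (assuming an OpenAI-compatible chat message format, a hypothetical mistral-small-4 model identifier, and reasoning_effort as a top-level request field, all of which are unconfirmed assumptions):

```python
# Sketch of one multimodal, reasoning-tuned request payload.
# The schema, model name, and field placement are assumptions,
# not confirmed API details.
import json

payload = {
    "model": "mistral-small-4",   # hypothetical identifier
    "reasoning_effort": "deep",   # "fast" ... "deep", per the article
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Read this dashboard screenshot and write a "
                         "Python script that reproduces the chart."},
                {"type": "image_url",
                 "image_url": {"url": "data:image/png;base64,<...>"}},
            ],
        }
    ],
}

print(json.dumps(payload, indent=2))
```

The point of the sketch is the shape, not the field names: image input, reasoning depth, and a coding task travel together in one request, so the model keeps full context across all three.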

Benchmark Comparison

| Model | GPQA Score | Params (Active / Total) | Context | License |
|---|---|---|---|---|
| Mistral Small 4 | 0.70 | 6.5B / 119B MoE | 256K | Apache 2.0 |
| GPT-4o | 0.74 | ~200B (est.) | 128K | Proprietary |
| Claude Sonnet 4.6 | 0.72 | Undisclosed | 200K | Proprietary |
| Gemma 4 31B | 0.80 | 31B dense | 256K | Gemma License |
| Llama 4 Maverick | 0.74 | 17B / 400B MoE | 1M | Llama License |

Mistral Small 4 posts a GPQA score of 0.70, landing in the same tier as Claude Sonnet 4.6 and within striking distance of GPT-4o. It trails Gemma 4 31B, which leads this weight class. But Gemma 4's license restricts large-scale commercial use without Google approval — Apache 2.0 carries no such restriction.

Speed and Efficiency

Compared to Mistral Small 3, Small 4 delivers a 40% reduction in end-to-end latency and 3x higher throughput on the same hardware. This is a result of the MoE routing: most tokens route through fast, lightweight experts, with deep reasoning experts activating only when the reasoning_effort parameter demands it.

For production deployments, Mistral estimates costs will fall between Mistral Small 3.1 ($0.10–$0.20/M tokens) and Mistral Medium 3.1 ($0.40/M tokens) via the Mistral API. Self-hosted costs depend only on your hardware.
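To see what that price band means at scale, here is a rough sketch using the quoted per-million-token rates (the 500M tokens/month volume is purely illustrative):

```python
# Rough monthly API-cost comparison at an illustrative volume.
tokens_per_month = 500e6  # assumed pipeline volume, not from the article

# Per-million-token prices quoted in the article (USD)
small_3_1_low, small_3_1_high = 0.10, 0.20
medium_3_1 = 0.40

def monthly_cost(price_per_million: float) -> float:
    """Monthly spend at the assumed volume for a given $/M-token rate."""
    return tokens_per_month / 1e6 * price_per_million

print(f"Small 3.1 band:  ${monthly_cost(small_3_1_low):,.0f}-"
      f"${monthly_cost(small_3_1_high):,.0f}")
print(f"Medium 3.1:      ${monthly_cost(medium_3_1):,.0f}")
```

If Small 4's API price lands between those bands as Mistral estimates, a pipeline at this volume would spend somewhere in the low hundreds of dollars per month; self-hosting trades that per-token cost for a fixed hardware cost.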

Want multi-model access with zero setup?
Happycapy Pro gives you Claude, GPT, Gemini, and Mistral — one subscription, one interface. No API keys, no infrastructure.
Try Happycapy Free →


When to Use Mistral Small 4 vs Proprietary Models

| Use Case | Mistral Small 4 | GPT-4o / Claude Sonnet |
|---|---|---|
| Data sovereignty required (GDPR, HIPAA) | Best choice: fully self-hosted | Data leaves your infrastructure |
| Fine-tuning for custom domain | Apache 2.0 permits full fine-tuning | Not permitted without enterprise agreements |
| High-volume agentic pipelines | 3x throughput vs Small 3; self-hosted = zero per-token cost | Per-token cost accumulates at scale |
| Latest safety guardrails and alignment | Good, but open weights can be uncensored | Anthropic / OpenAI manage alignment continuously |
| Cutting-edge benchmark performance | 0.70 GPQA, competitive but not top | GPT-5.4 series leads overall |

What This Means for the Open-Source AI Landscape

Mistral Small 4 is the clearest evidence yet that the gap between open-source and proprietary models has collapsed at the "small" tier. Two years ago, open-source models required 70B+ parameters to match GPT-3.5 class performance. Small 4 activates 6.5B parameters to match GPT-4o-class performance.

The model also puts pressure on Meta's Llama ecosystem and Google's Gemma 4. Llama 4 Maverick offers 1M context and an Apache-compatible license but requires 4x the hardware. Gemma 4 31B posts higher GPQA scores but carries license restrictions. Mistral Small 4 occupies a unique position: genuinely permissive, genuinely capable, genuinely efficient.

For teams building AI-native applications in 2026, Mistral Small 4 is the new default starting point for open-source deployments.

Access the best AI models — including Mistral — in one place
Happycapy Pro ($17/month) gives you Claude Opus, GPT-5.4, Gemini 3.1 Pro, and more with no API keys required. Compare models side-by-side on your own tasks.
Start Free on Happycapy →

Frequently Asked Questions

What is Mistral Small 4?
Mistral Small 4 is a 119-billion-parameter Mixture-of-Experts model released March 16, 2026 under the Apache 2.0 license. It activates only 6.5B parameters per token and unifies reasoning, vision, and agentic coding in a single model checkpoint.

Is Mistral Small 4 free to use commercially?
Yes. The Apache 2.0 license permits commercial use, modification, redistribution, and self-hosting with no licensing fees. This makes it the most permissive frontier-class model available in March 2026.

How does Mistral Small 4 compare to GPT-4o?
Mistral Small 4 scores 0.70 on GPQA vs GPT-4o's 0.74. It delivers 40% lower latency and 3x higher throughput on the same hardware. GPT-4o has stronger safety guardrails and is managed by OpenAI; Small 4 gives you full control of the weights.

What context window does Mistral Small 4 support?
256K tokens for full deployments, and 128K tokens for edge and other constrained deployments. This enables long-document analysis and extended multi-turn agentic workflows without truncation.

Sources: Mistral AI — Introducing Mistral Small 4 (March 16, 2026) · VentureBeat — Mistral's Small 4 Consolidates Reasoning, Vision and Coding · Medium — Mistral Small 4: The Open-Source Model (March 2026)
