By Connie · This article contains affiliate links. We may earn a commission at no extra cost to you if you sign up through our links.
Microsoft Breaks From OpenAI: 3 In-House AI Models Launched, Suleyman Reveals 'Humanist Superintelligence' Plan
Microsoft launched three proprietary AI models on April 2, 2026 — MAI-Transcribe-1 (speech-to-text, 25 languages), MAI-Voice-1 (60s audio in 1s), and MAI-Image-2 (2x faster image generation). In a Verge interview, Mustafa Suleyman revealed that a renegotiated OpenAI contract 'unlocked Microsoft's ability to pursue superintelligence' and that he had been planning this pivot for nine months. Full model breakdown, pricing, benchmarks, and what it means for the future of Copilot.
On April 2, 2026, Microsoft launched three proprietary in-house AI models under its MAI (Microsoft AI) brand: MAI-Transcribe-1 (speech-to-text, 25 languages, $0.36/hr), MAI-Voice-1 (text-to-speech, 1-second latency, $22/1M chars), and MAI-Image-2 (image generation, 2x speed, rolling to Bing/PowerPoint). In a Verge interview published the same day, Microsoft AI CEO Mustafa Suleyman revealed a renegotiated OpenAI contract gives Microsoft the right to pursue superintelligence independently. Suleyman calls the vision "humanist superintelligence" — AI that exceeds human intelligence but stays under human control.
The Three MAI Models: Full Breakdown
Microsoft's MAI Superintelligence team, formed six months ago under Mustafa Suleyman, shipped its first wave of foundational models on April 2, 2026. These are fully in-house — not fine-tuned OpenAI models. They are available immediately on Microsoft Foundry and the new MAI Playground.
MAI-Transcribe-1, Microsoft's speech-to-text model, supports 25 languages and achieves a 3.8% average Word Error Rate (WER) on the FLEURS benchmark, beating OpenAI's Whisper-large-v3 across all tested languages. It is also 2.5x faster than Microsoft's previous Azure Fast transcription offering.
MAI-Voice-1 generates 60 seconds of natural-sounding audio in just one second of compute time. It supports custom voice creation from short audio samples and maintains speaker identity across long-form content — a key capability for enterprise audio workflows like call center responses, narration, and voice cloning.
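The quoted generation rate works out to 60x real time, which makes long-form jobs fast to estimate. A back-of-envelope sketch (the audiobook length is an illustrative figure, not from Microsoft):

```python
# Back-of-envelope: MAI-Voice-1's quoted rate of 60 s of audio per
# 1 s of compute, applied to a hypothetical long-form narration job.

REALTIME_FACTOR = 60  # 60x real time, per the launch specs

audiobook_hours = 10  # illustrative example workload
audio_minutes = audiobook_hours * 60
compute_minutes = audio_minutes / REALTIME_FACTOR
print(compute_minutes)  # 10.0
```

At that rate, a 10-hour audiobook would take roughly 10 minutes of compute, which is what makes call-center and narration workloads practical.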
MAI-Image-2 debuted on the MAI Playground in March 2026 and is now rolling out to Microsoft Foundry, Bing Image Creator, and PowerPoint Designer. It delivers at least 2x faster generation than its predecessor and targets both creative professionals and enterprise users building visual content at scale.
The Suleyman Interview: Nine Months of Preparation
The model launches landed alongside a major Verge interview with Mustafa Suleyman, published April 2, 2026. In it, Suleyman revealed that Microsoft's strategic pivot toward independent frontier AI had been in motion for nine months — far longer than the public announcement of the MAI team restructuring in mid-March 2026 suggested.
"Renegotiating Microsoft's contract with OpenAI is the thing that officially 'unlocked [Microsoft's] ability to pursue superintelligence,' but I'd been planning even before the ink was dry."
— Mustafa Suleyman, The Verge, April 2, 2026

The key detail: Microsoft's previous contract with OpenAI had contained provisions that barred Microsoft from independently pursuing AGI. The renegotiated agreement, which runs through 2032, explicitly grants Microsoft the legal right to use OpenAI's intellectual property to train competing models and pursue superintelligence on its own timeline.
This is a structural transformation of the Microsoft-OpenAI relationship. For the past four years, Microsoft functioned as OpenAI's infrastructure and distribution partner — it provided compute via Azure, distributed GPT models through Copilot and Azure OpenAI Service, and in exchange received access to leading models. That arrangement is now layered with a parallel track where Microsoft is actively building to compete.
What "Humanist Superintelligence" Actually Means
Suleyman's framing of "humanist superintelligence" is a deliberate contrast to Sam Altman's "superintelligence within a few thousand days" framing and Elon Musk's "Truth-seeking Grok" positioning. Here is how Microsoft's vision differs from its competitors:
| Company | Superintelligence framing | Control philosophy | Primary focus |
|---|---|---|---|
| Microsoft (Suleyman) | Humanist superintelligence — exceeds humans, stays controllable | Will abandon any system that risks "running away" | Enterprise: healthcare, productivity, clean energy |
| OpenAI (Altman) | "Superintelligence within a few thousand days" | Alignment research (RLHF) | Consumer + enterprise, AGI mission |
| Anthropic (Amodei) | Transformative AI with safety constraints first | Constitutional AI, Responsible Scaling Policy | Safety-first, enterprise B2B |
| xAI (Musk) | Truth-seeking AGI "to understand the universe" | Minimal restriction, open reasoning | Consumer, science, SpaceX integration |
Suleyman's explicit willingness to "abandon any AI system that risks running away" is a significant commitment from one of the most powerful AI executives in the world. He is essentially building a public accountability mechanism into Microsoft's AI strategy — a hedge against both catastrophic outcomes and regulatory backlash.
Microsoft's enterprise focus (healthcare diagnostics, productivity tools, clean energy science) is not just a strategic choice — it is a defensible moat. Enterprise AI buyers care about compliance, data governance, and auditability above raw benchmark scores. Microsoft's existing relationships with 300,000+ enterprise customers, Azure infrastructure, and regulatory credibility give MAI models a distribution advantage that pure-AI-labs cannot easily replicate.
Microsoft vs. OpenAI: Partner or Competitor?
The relationship is now both simultaneously. Microsoft remains OpenAI's largest investor and infrastructure partner — Azure runs OpenAI's training clusters, and the partnership extends to 2032. But the new contract terms create a dynamic where Microsoft is legally authorized to train competing frontier models and potentially release a competing product to ChatGPT.
For context: Microsoft's Copilot product (its consumer AI) currently runs on GPT-5.4. The MAI models launched today are the first proprietary Microsoft models — they cover transcription, voice, and image generation, which are adjacent to but do not yet directly compete with GPT's core text/reasoning capabilities. The next phase — if MAI builds a large language model — would be a direct competitive move.
The market implications for enterprise buyers are significant. Azure OpenAI Service customers now have a Microsoft-native alternative for specific workloads (transcription, voice, image generation) that may be more cost-effective and have tighter data governance guarantees than routing data through the OpenAI API.
MAI vs. Competitors: Model-by-Model Comparison
| Model | Microsoft MAI | Closest Competitor | MAI Advantage |
|---|---|---|---|
| Speech-to-text | MAI-Transcribe-1 · 3.8% WER · $0.36/hr | OpenAI Whisper-large-v3 · higher WER · similar pricing | Better accuracy, 2.5× faster |
| Text-to-speech | MAI-Voice-1 · 60s/1s · $22/1M chars | ElevenLabs · $22/1M chars · lower latency options | Comparable price, superior Azure integration |
| Image generation | MAI-Image-2 · $33/1M img tokens | DALL-E 3 via OpenAI · similar price band | 2× faster, native Bing/PowerPoint distribution |
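For buyers comparing these list prices, the arithmetic is straightforward. A minimal sketch using the per-unit prices quoted above; the workload volumes are hypothetical examples, not Microsoft figures:

```python
# Illustrative cost arithmetic from the MAI list prices quoted in this
# article. Workload volumes below are hypothetical, for comparison only.

MAI_TRANSCRIBE_PER_HOUR = 0.36   # $/audio-hour (MAI-Transcribe-1)
MAI_VOICE_PER_MCHAR = 22.0       # $/1M characters (MAI-Voice-1)
MAI_IMAGE_PER_MTOKEN = 33.0      # $/1M image output tokens (MAI-Image-2)

def monthly_cost(audio_hours: float, tts_chars: int, image_tokens: int) -> float:
    """Estimate a month's spend across the three MAI models."""
    return (
        audio_hours * MAI_TRANSCRIBE_PER_HOUR
        + (tts_chars / 1_000_000) * MAI_VOICE_PER_MCHAR
        + (image_tokens / 1_000_000) * MAI_IMAGE_PER_MTOKEN
    )

# Example: 500 h of call audio, 10M TTS characters, 2M image output tokens
print(round(monthly_cost(500, 10_000_000, 2_000_000), 2))  # 466.0
```

Under those example volumes, transcription is the smallest line item ($180) despite being the highest-volume workload, which is where the $0.36/hr price point matters most.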
What This Means for Copilot and Azure Customers
For the 300,000+ enterprise customers using Microsoft 365 Copilot, these launches signal a multi-year roadmap where Microsoft increasingly substitutes in-house models for OpenAI models in specific tasks — starting with modalities (voice, image, transcription) and eventually moving toward reasoning and text generation.
- PowerPoint Designer will gain MAI-Image-2 for faster AI-generated visuals in presentations
- Microsoft Teams transcription will migrate to MAI-Transcribe-1, with better accuracy and lower cost
- Azure AI Foundry customers can access all three MAI models via API today
- Bing Image Creator transitions to MAI-Image-2, enabling Microsoft to control the full image generation pipeline for its search product
Frequently Asked Questions
What are MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2?
Three proprietary AI foundational models launched by Microsoft on April 2, 2026. MAI-Transcribe-1 is speech-to-text (25 languages, 3.8% WER, $0.36/hr). MAI-Voice-1 is text-to-speech (60s audio in 1s, custom voice, $22/1M chars). MAI-Image-2 is image generation (2x faster, rolling to Foundry/Bing/PowerPoint, $5/1M text input, $33/1M image output).
How does Microsoft's renegotiated OpenAI contract work?
The new agreement runs through 2032 and grants Microsoft the legal right to use OpenAI's IP to develop competing frontier models and pursue AGI independently. Previously, Microsoft had forfeited these rights in exchange for access to OpenAI models. Microsoft is now both a partner (Azure runs OpenAI infrastructure, Copilot uses GPT models) and a potential competitor (MAI team building independent frontier AI).
What is "humanist superintelligence"?
Mustafa Suleyman's term for AI systems that intellectually match or exceed humans but remain strictly under human control and aligned with human interests. He explicitly states Microsoft will abandon any system that risks "running away" or operating autonomously without containment. The enterprise focus covers healthcare diagnostics, productivity tools, and clean energy science.
Is Microsoft still partnered with OpenAI?
Yes, through 2032. Copilot and Azure OpenAI Service continue to use GPT models. But the renegotiated contract gives Microsoft new rights to build competing frontier models. The relationship has evolved from pure dependency to a partner-competitor dynamic — similar to how Google and Samsung co-develop Android while Samsung also develops its own Exynos chips.