Jensen Huang Says AGI Is Here. A New Benchmark Says It Scored 0%.
Nvidia's CEO told the world AGI exists. Days later, a benchmark tested every frontier model on novel tasks — and the best AI scored 0.37% versus humans at 100%.
On March 23, Nvidia CEO Jensen Huang declared on Lex Fridman's podcast that artificial general intelligence has been achieved. On March 26 — three days later — the ARC Prize Foundation released ARC-AGI-3, a benchmark testing true generalization on 135 novel environments. Humans scored 100%. Gemini 3.1 Pro scored 0.37%. GPT-5.4 scored 0.26%. Grok-4.20 scored exactly 0%. The entire dispute hinges on what "AGI" actually means.
What Jensen Huang Actually Said
On March 23, 2026, Nvidia CEO Jensen Huang appeared on Lex Fridman's podcast and declared that artificial general intelligence has already been achieved.
Huang's definition of AGI is functional and business-oriented: an AI that can perform complex multi-step workflows, write production-grade code, and potentially run a tech company to a $1 billion valuation — without constant human oversight at every step. Under that definition, Huang argues, systems like Claude Code, GPT-5.4 with tools, and Grok-4.20 with multi-agent orchestration already qualify.
The claim went viral immediately. CNBC, Forbes, Yahoo Finance, and Fortune all covered the statement within hours. Many in the AI research community were less enthusiastic.
Then ARC-AGI-3 Dropped
Three days after Huang's declaration, François Chollet — the creator of the original ARC-AGI benchmark and co-founder of the ARC Prize Foundation — published ARC-AGI-3. The timing was not a coincidence.
ARC-AGI-3 presents AI systems with 135 novel interactive environments. These are not problems the AI has seen during training, and no instructions are provided. The model must explore, reason, and solve — the way a human encountering a genuinely new situation does. The scoring metric, called Relative Human Action Efficiency (RHAE), penalizes inefficiency as well as failure: an AI that takes ten times as many actions as a human earns only 1% for that environment.
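The article's "ten times as many actions earns 1%" example implies a superlinear efficiency penalty. As a rough illustration only — assuming RHAE squares the human-to-AI action ratio, which is an inference from that single example and not the foundation's published formula — a per-environment score could be sketched like this:

```python
def rhae_score(human_actions: int, ai_actions: int) -> float:
    """Per-environment score as a fraction of human efficiency.

    Assumed form: (human_actions / ai_actions) ** 2, capped at 1.0.
    The exponent of 2 is inferred from the article's example
    (10x as many actions -> 1%); the real RHAE formula may differ.
    """
    if ai_actions <= 0:  # environment never solved
        return 0.0
    return min(1.0, (human_actions / ai_actions) ** 2)

def benchmark_rhae(per_env) -> float:
    """Average per-environment scores into a benchmark percentage.

    per_env: list of (human_actions, ai_actions); ai_actions is None
    when the model never solved that environment.
    """
    scores = [rhae_score(h, a) if a is not None else 0.0
              for h, a in per_env]
    return 100 * sum(scores) / len(scores)

# A model 10x less efficient than a human on one environment:
ten_x = rhae_score(5, 50)   # roughly 0.01, i.e. 1% for that environment
```

Under this assumed formula, a model that solves half the environments but needs ten times as many actions on each would still land near 0.5% overall, which is the regime the frontier-model scores in the table below occupy.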
To prevent models from gaming the benchmark by training on its data, 110 of the 135 environments are kept private. Only 25 are publicly accessible.
ARC-AGI-3 Scores: Every Frontier Model vs. Humans
| System | Score (RHAE) | Gap to Human | Notes |
|---|---|---|---|
| Humans | 100% | — | Baseline: generalize to novel situations naturally |
| Google Gemini 3.1 Pro | 0.37% | -99.63% | Highest scoring AI model tested |
| OpenAI GPT-5.4 | 0.26% | -99.74% | Huang's implicit benchmark for "AGI" |
| Anthropic Claude Opus 4.6 | 0.25% | -99.75% | Anthropic's current flagship model |
| xAI Grok-4.20 | 0% | -100% | Zero on every novel environment tested |
Grok-4.20's zero score is particularly striking because the model performs well on standard benchmarks that test memorized knowledge. ARC-AGI-3 specifically strips away all prior training advantages. The zero score indicates Grok-4.20 was unable to generalize at all to environments it had never encountered — it could not even begin to explore a novel problem space in a way that matched human action efficiency.
Two Definitions of AGI — and Why They Cannot Both Be Right
The fundamental issue is that "AGI" means entirely different things to Huang and Chollet, and the gap between their definitions reveals the biggest fault line in the AI industry.
Huang's definition is pragmatic and market-oriented: AGI means an AI that can execute complex workflows autonomously and create commercial value at scale. Under this framing, current AI already qualifies — Claude Code can write production software, GPT-5.4 can synthesize research, and multi-agent systems are running real business processes. For a CEO of a company that sells AI infrastructure, this definition conveniently validates the product category he is selling.
Chollet's definition is technical and cognitive: AGI is a system that can generalize to genuinely novel situations without prior training, the way any human naturally can. The key word is "generalize." Current AI systems are extraordinarily capable at tasks within their training distribution but fail almost completely when confronted with problems that require true novelty — a capability humans have and machines demonstrably do not.
The Corporate Incentive Problem
It is worth noting that Jensen Huang has a direct financial interest in AGI being declared achieved. Nvidia's stock price, revenue projections, and the entire justification for hundreds of billions of dollars in AI infrastructure spending rest on the premise that AI is rapidly approaching human-level capability. A CEO of the world's most valuable semiconductor company declaring AGI arrived validates every data center his company has ever sold.
Yahoo Finance and Fortune both noted this conflict of interest in their coverage of the debate. The Forbes coverage was more neutral — but all three publications ran the ARC-AGI-3 refutation story within 24 hours of Huang's original claim gaining traction.
AGI Definitions: The Spectrum of Views
| Who | AGI Definition | AGI Status (Mar 2026) |
|---|---|---|
| Jensen Huang (Nvidia) | AI that can run a business, execute complex workflows | ACHIEVED |
| François Chollet (ARC Prize) | Generalize to novel situations without prior training | NOT ACHIEVED (0.37% best) |
| Demis Hassabis (Google DeepMind) | AI performing at human expert level across all cognitive tasks | Approaching in narrow domains only |
| Dario Amodei (Anthropic) | Vast knowledge, reason and act across complex scientific domains | Within reach (2026–2027 est.) |
| Yann LeCun (Meta) | Human-level common sense and physical world understanding | Far from achieved (missing world model) |
What This Means for AI Users Right Now
For the people actually using AI — not debating its philosophical definitions — the practical answer is clear: today's best models are remarkably capable within their training distribution and genuinely poor at tasks requiring true novelty. You can use Claude, GPT-5.4, and Gemini to write code, summarize documents, analyze data, draft content, and reason through complex problems. These are real, useful capabilities.
What current AI cannot reliably do is encounter a brand-new problem type it has never seen, devise a strategy for exploring it from scratch, and solve it efficiently — the way any human can on any given Tuesday. That 99.63-point gap between Gemini's best score and human performance on ARC-AGI-3 is the honest picture of where AI capability stands.
The $2 million ARC Prize is open. No AI system has come close. The prize money is safe for now.