HappycapyGuide

By Connie · This article contains affiliate links. We may earn a commission at no extra cost to you if you sign up through our links.

AI Hardware · April 4, 2026 · 8 min read

xAI Colossus 2: World's First 1.5 GW AI Supercluster Is Now Training Grok 5

Elon Musk's xAI has completed the expansion of Colossus 2 to 1.5 gigawatts — making it the largest AI training cluster ever built by a wide margin. Here's what's inside, what it's training, and what it means for the broader compute race.

TL;DR

  • Colossus 2 reached 1.5 GW in April 2026 — world's first at this scale
  • 850,000 NVIDIA GPUs across three buildings in Memphis, Tennessee
  • Primary purpose: training Grok 5 (est. 6 trillion parameters, MoE)
  • Power draw exceeds San Francisco's entire city peak load
  • OpenAI and Anthropic are targeting comparable scale in 2027+

From 1 GW to 1.5 GW: The Timeline

Colossus 2 went live in January 2026 as the world's first gigawatt-scale AI training cluster. At launch, xAI's facility housed around 600,000 NVIDIA GPUs and drew roughly three times as much power as any previous AI data center.

At the January launch event, Elon Musk announced the roadmap publicly: "Upgrading to 1.5 GW in April." That target has now been hit. The expansion added approximately 250,000 additional GPUs and a third warehouse building — internally codenamed "MACROHARDRR" — to the existing Memphis campus.

The long-term roadmap points to nearly 2 GW of total capacity once all infrastructure work is complete, though xAI has not yet given a specific date for that milestone.

Inside the 1.5 GW Cluster

Spec              Colossus 2 (April 2026)
Total GPU count   ~850,000 NVIDIA GPUs
Power capacity    1.5 GW (target); ~350 MW cooling (independent estimate)
Location          Memphis, Tennessee
Buildings         3 warehouses + adjacent land ("MACROHARDRR")
Power vs. city    Exceeds San Francisco peak load
Primary workload  Grok 5 training + inference
Long-term target  ~2 GW (no date set)
Launched          January 2026 (1 GW), April 2026 (1.5 GW)

Note: Independent satellite imagery analysis by Tom's Hardware suggested cooling infrastructure may currently support closer to 350 MW of actual workload, despite the official 1 GW+ claims. xAI maintains the cluster is fully operational at stated capacity.

What Colossus 2 Is Training: Grok 5

The primary purpose of the 1.5 GW expansion is training Grok 5 — xAI's next flagship model, speculated to use a Mixture-of-Experts architecture with approximately 6 trillion total parameters. If accurate, that would make it one of the largest models ever trained by raw parameter count, rivaling GPT-5.5 "Spud" and the still-unannounced Claude Mythos.

xAI has not confirmed Grok 5's parameter count publicly, but multiple sources cite the 6T figure based on internal hiring documentation and infrastructure planning documents that surfaced earlier in 2026.
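
To put the rumored figure in perspective, here is a back-of-envelope sketch of the training math using the standard ~6·N·D FLOPs approximation, where N is active parameters and D is training tokens. Every constant below (active-parameter fraction, token count, per-GPU throughput, utilization) is an illustrative assumption, not a confirmed xAI figure:

```python
# Back-of-envelope training math for a hypothetical ~6T-parameter MoE.
# Every constant here is an illustrative assumption, not a confirmed xAI figure.

total_params  = 6e12      # rumored total parameter count (unconfirmed)
active_frac   = 0.10      # assumed share of parameters active per token (MoE)
tokens        = 60e12     # assumed pretraining token count

active_params = total_params * active_frac
train_flops   = 6 * active_params * tokens      # standard ~6*N*D estimate

gpus       = 850_000
peak_flops = 2e15         # assumed per-GPU peak at low precision (~2 PFLOP/s)
mfu        = 0.35         # assumed model FLOPs utilization

effective_flops = gpus * peak_flops * mfu
days = train_flops / effective_flops / 86_400   # seconds per day

print(f"Training compute:     {train_flops:.1e} FLOPs")
print(f"Effective throughput: {effective_flops:.1e} FLOP/s")
print(f"Idealized wall-clock: {days:.1f} days")
```

On these assumptions the idealized math lands in days rather than months. Real frontier runs take far longer because of checkpoint restarts, hardware failures, ablations, and data pipeline work, but the raw capacity is what makes a 6T-parameter run plausible at all.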

Model           Developer        Est. Params        Status            Compute Source
Grok 5          xAI              ~6T (MoE)          Training          Colossus 2
GPT-5.5 "Spud"  OpenAI           1–10T (MoE)        Pretraining done  Stargate Abilene
Claude Mythos   Anthropic        ~10T (speculated)  Early access      Undisclosed
Gemini 4 Ultra  Google DeepMind  Undisclosed        Rumored H2 2026   TPU v7 pods

The Compute Race: How xAI Compares

Before Colossus 2, the largest AI training clusters operated at 100–200 MW. OpenAI's Stargate Abilene — the cluster training GPT-5.5 — is estimated at around 500 MW, making xAI's facility three times larger by power consumption.

Cluster                         Operator            Capacity               Timeline
Colossus 2                      xAI                 1.5 GW                 April 2026 (live)
Stargate Abilene                OpenAI / Oracle     ~500 MW                2026 (live)
Microsoft AI Campus             Microsoft / OpenAI  ~400 MW                2026 (partial)
Google TPU v7 Pods              Google DeepMind     ~600 MW (distributed)  2026
OpenAI / Anthropic GW clusters  Various             1 GW target            2027 or later

The scale advantage is significant: more compute translates directly into the ability to train larger models faster, run more ablation experiments, and keep inference costs lower through economies of scale. If xAI can fully utilize its 1.5 GW build-out, it will have a structural training advantage that competitors won't match for at least 12–18 months.
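
As a rough illustration of that gap, the capacity figures above can be converted into relative time-to-train for a fixed compute budget. This sketch assumes training throughput scales roughly linearly with power at comparable hardware generations, which glosses over real differences in chips, interconnect, and utilization:

```python
# Relative training throughput implied by power capacity alone.
# Assumes comparable hardware generations and roughly linear scaling with
# power; this ignores chip, interconnect, and utilization differences.

clusters_mw = {
    "Colossus 2 (xAI)":          1500,
    "Google TPU v7 pods":         600,
    "Stargate Abilene (OpenAI)":  500,
    "Microsoft AI Campus":        400,
}

reference = clusters_mw["Colossus 2 (xAI)"]
for name, mw in sorted(clusters_mw.items(), key=lambda kv: -kv[1]):
    # A fixed-FLOP run takes proportionally longer on a smaller power budget.
    slowdown = reference / mw
    print(f"{name:28} {mw:5} MW   {slowdown:.1f}x Colossus 2 wall-clock")
```

On these assumptions, a run that finishes in one month on Colossus 2 would take roughly three months at Stargate Abilene's estimated capacity.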

The Cooling Controversy

Not everyone accepts xAI's power claims at face value. In February 2026, Tom's Hardware published an analysis of satellite imagery showing that Colossus 2's visible cooling towers and infrastructure appeared consistent with approximately 350 MW of actual computational load — not the claimed 1 GW+.

xAI disputed the analysis, arguing that the thermal signature alone is not sufficient to determine compute density, particularly given the use of liquid cooling and high-density rack configurations that reduce the visible heat-rejection footprint per GPU.
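
The dispute largely comes down to arithmetic. If the visible cooling plant can reject roughly 350 MW of heat, the supportable GPU count follows from assumptions about facility overhead (PUE) and per-GPU draw; the sketch below uses illustrative values for both, since neither figure is public:

```python
# How many GPUs could ~350 MW of heat rejection actually support?
# PUE and per-GPU draw below are illustrative assumptions, not public figures.

cooling_mw = 350          # Tom's Hardware satellite-imagery estimate
pue        = 1.2          # assumed facility overhead (liquid-cooled plant)
kw_per_gpu = 1.2          # assumed draw per GPU incl. CPU/network share

it_load_mw   = cooling_mw / pue                    # power available to IT gear
implied_gpus = it_load_mw * 1000 / kw_per_gpu      # MW -> kW, then per GPU

full_fleet_mw = 850_000 * kw_per_gpu * pue / 1000  # cooling needed for all GPUs

print(f"Supportable IT load:   {it_load_mw:.0f} MW")
print(f"Implied active GPUs:   {implied_gpus:,.0f} (vs. ~850,000 installed)")
print(f"Full fleet would need: {full_fleet_mw:,.0f} MW of heat rejection")
```

Under these assumptions, the visible cooling would support roughly a quarter of the installed fleet at full load, while running all 850,000 GPUs flat out would require on the order of 1.2 GW of heat rejection. Both readings are internally consistent: the satellite estimate with a phased ramp-up, and xAI's figure with contracted rather than utilized capacity.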

The controversy highlights a recurring pattern in AI infrastructure: companies announce capacity in power-contracted terms while actual computational utilization ramps up over months. The distinction matters when evaluating competitive claims about training timelines for models like Grok 5.

What This Means for the AI Industry

Colossus 2's 1.5 GW milestone represents more than a flex — it reflects a fundamental shift in how AI competition is structured. The frontier of AI development is now gated by infrastructure, not algorithms. The labs that win in the next two to three years are the ones that can build and fill gigawatt-scale clusters first.

  • For enterprises: Grok 5, once released, will likely be the most capable real-time model available via API, with pricing competitive with Grok 4.20.
  • For NVIDIA: 850,000 GPUs represents one of the largest single-customer purchases in GPU history — a significant tailwind for H200 and next-gen Blackwell Ultra demand.
  • For energy: AI data centers are now a material factor in regional grid planning. Memphis's infrastructure has been significantly upgraded to accommodate Colossus 2's load.
  • For OpenAI and Anthropic: Both companies are behind on raw compute scale, which may explain the urgency around Stargate and AWS expansion deals announced in Q1 2026.

Frequently Asked Questions

What is xAI Colossus 2?

Colossus 2 is xAI's AI supercomputer in Memphis, Tennessee. It became the world's first gigawatt-scale cluster in January 2026 and expanded to 1.5 GW in April 2026, with approximately 850,000 NVIDIA GPUs.

What is Colossus 2 training?

Primarily Grok 5 — xAI's next flagship model, speculated to have ~6 trillion parameters in a MoE architecture. The cluster also handles inference and other xAI research workloads.

How does Colossus 2 compare to OpenAI's Stargate?

Colossus 2 at 1.5 GW is roughly three times larger by power than Stargate Abilene (~500 MW). Both OpenAI and Anthropic are targeting comparable scale but not until 2027 or later.

Is the 1.5 GW figure accurate?

xAI claims 1.5 GW contracted capacity. Independent satellite analysis suggests current cooling infrastructure supports ~350 MW of actual load, with the gap explained by phased ramp-up and liquid cooling configurations.
