By Connie · Last reviewed: April 2026 — pricing & tools verified · This article contains affiliate links. We may earn a commission at no extra cost to you if you sign up through our links.
How Accurate Are Google AI Overviews? NYT Investigation Reveals the Truth (2026)
April 7, 2026 · 8 min read
Google AI Overviews are accurate 91% of the time after the Gemini 3 upgrade, yet Google's own internal benchmark finds Gemini 3 generates incorrect information 28% of the time. And 56% of correct answers are ungrounded (the cited sources don't support the claim). The New York Times published a detailed investigation on April 7, 2026. Here's what the numbers actually mean.
When Google added AI Overviews to its search results, it promised a faster way to get answers. On April 7, 2026, The New York Times published the most thorough public investigation into how accurate those answers actually are — and the findings are more complicated than Google's marketing suggests.
The short version: AI Overviews are right most of the time, wrong often enough to matter, and frequently point to sources that don't actually back up what they say. At five trillion searches per year, “wrong 9% of the time” translates to hundreds of millions of incorrect answers per day.
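To sanity-check that arithmetic: the total depends on how often AI Overviews actually appear, a figure Google doesn't publish. A back-of-the-envelope sketch in Python, assuming (purely for illustration) that an Overview triggers on roughly one in five searches:

```python
# Back-of-the-envelope estimate of wrong AI Overview answers per day.
# The 20% trigger rate is an assumption for illustration only; Google does
# not publish how often AI Overviews actually appear in results.
searches_per_year = 5e12          # Google's stated ~5 trillion searches/year
overview_trigger_rate = 0.20      # assumed fraction of searches showing an Overview
error_rate = 0.09                 # 9% wrong, per the Oumi/SimpleQA figure

overviews_per_day = searches_per_year / 365 * overview_trigger_rate
wrong_per_day = overviews_per_day * error_rate
print(f"{wrong_per_day:,.0f} incorrect AI Overviews per day")  # ≈ 247 million
```

Even under a conservative trigger-rate assumption, the error volume lands in the hundreds of millions per day.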
The Numbers: What the NYT Investigation Found
The NYT investigation used data from Oumi, an AI research startup that benchmarked AI Overviews against SimpleQA, an industry-recognized test of factual accuracy.
- With Gemini 2 (October 2025): AI Overviews were accurate 85% of the time.
- With Gemini 3 (February 2026): Accuracy improved to 91%.
- Google's internal benchmark: The company's own analysis found Gemini 3 generates incorrect information 28% of the time.
The discrepancy between 91% external accuracy and 28% internal error rate reflects different methodologies and what counts as “wrong.” Oumi uses a strict factual Q&A benchmark. Google's 28% figure likely captures a broader range of errors including partial answers and unclear attributions.
Google disputes Oumi's benchmark, arguing that SimpleQA contains its own errors and doesn't reflect the distribution of real-world search queries.
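Oumi's grading pipeline isn't public, but a SimpleQA-style benchmark reduces to checking short factual answers against gold labels. A minimal sketch of that scoring loop (real evaluations typically use an LLM judge rather than the naive string match shown here):

```python
# Minimal sketch of SimpleQA-style accuracy scoring. Illustrative only:
# Oumi's actual grading pipeline is not public, and production benchmarks
# grade with an LLM judge, not exact string matching.
def normalize(text: str) -> str:
    return " ".join(text.lower().strip().split())

def score(predictions: list[str], gold_answers: list[str]) -> float:
    """Fraction of predictions that match the gold answer."""
    correct = sum(
        normalize(pred) == normalize(gold)
        for pred, gold in zip(predictions, gold_answers)
    )
    return correct / len(gold_answers)

gold = ["May 11, 1986"]   # Bob Marley Museum opening, per the NYT table
preds = ["1987"]          # the documented AI Overview claim
print(score(preds, gold))  # 0.0
```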
The “Ungrounded” Problem Is Worse Than Inaccuracy
The most significant finding in the investigation is not the raw accuracy rate — it is the “ungrounded” rate. An AI Overview answer is ungrounded when it appears correct but the sources Google links to do not actually support the claim.
In October 2025, 37% of correct AI Overview answers were ungrounded. By February 2026, that number had risen to 56% — even as the model became more accurate overall.
This means: more than half of all “correct” AI Overviews cannot be verified by clicking the source. The answer may be right, but Google cannot prove it with the sources it cites.
The practical implication for users is severe. The entire point of citing sources is to allow verification. If over half of cited sources don't support the claim, the citation is decorative rather than evidentiary.
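It's worth being precise about what the 56% measures: the ungrounded rate is conditioned on correctness, i.e. of the answers judged correct, the fraction whose citations fail to support them. A small sketch with toy annotation data:

```python
# Sketch: the ungrounded rate is computed over *correct* answers only.
# Each record is (is_correct, is_grounded), e.g. from human annotation.
# The data below is toy data shaped to match the February 2026 picture
# (roughly 91% accurate, ~56% of correct answers ungrounded).
def ungrounded_rate(records: list[tuple[bool, bool]]) -> float:
    correct = [grounded for is_correct, grounded in records if is_correct]
    ungrounded = sum(1 for grounded in correct if not grounded)
    return ungrounded / len(correct)

records = [(True, True)] * 4 + [(True, False)] * 5 + [(False, False)]
print(f"{ungrounded_rate(records):.0%}")  # 56%
```

Note that an answer can score as "correct" and "ungrounded" at the same time, which is exactly why the two rates moved in opposite directions.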
The Source Quality Problem
Across 5,380 sources analyzed in the Oumi study, Facebook and Reddit were the second and fourth most-cited domains in AI Overviews. When AI Overviews were inaccurate, they cited Facebook 7% of the time — compared to 5% when accurate.
The pattern reveals a core challenge: AI Overviews are trained to find and synthesize information that appears authoritative, but user-generated social media content can rank highly in Google's index despite being unreliable. The AI cannot always distinguish between a peer-reviewed source and a Facebook post.
Specific Errors Documented by the NYT
| Query | AI Overview Claim | Reality |
|---|---|---|
| Bob Marley Museum opening date | Converted in 1987 | Opened May 11, 1986 |
| Hulk Hogan death | “No credible reports” of death | Contradicted by news articles below the AI summary |
| River bordering Goldsboro, NC (west side) | Neuse River | Little River (Neuse is southwest, not west) |
| Current year identification | Stated “2026 is next year” while also saying current year is 2026 | Basic logical contradiction |
Google displays a disclaimer below each overview: “AI can make mistakes, so double-check responses.” The company acknowledges errors are possible, but has not changed the prominent placement of AI Overviews above organic search results.
Health Queries: The Highest Stakes Category
The accuracy problem is most consequential in medical searches. A January 2026 Guardian investigation — cited in the NYT article — found that AI Overviews provided dangerous health advice in 44% of medical queries.
- Pancreatic cancer patients were advised to avoid high-fat foods — contradicting standard medical care guidance.
- Liver function test interpretation was significantly misleading.
- General medication queries returned information that differed from FDA-approved guidance.
In response, Google removed AI Overviews from a subset of specific health queries in January 2026. The company has not disclosed which queries were removed or how the decision was made.
Happycapy gives you access to Claude Opus 4.6, GPT-5.4, and Gemini — with full context on where each answer comes from. Free to start.
Try Happycapy Free
What This Means for SEO and Content Strategy
Google AI Overviews are already consuming traffic that used to go to publishers. HubSpot reported losing approximately 140 million visits in a single year as AI Overviews and zero-click search gutted its SEO traffic. The accuracy findings add a new dimension to this picture.
If AI Overviews are wrong 9–28% of the time, and the sources they cite are ungrounded in more than half of cases, the traffic loss is compounded by a credibility problem. Users who click through to verify an answer will often find that the cited sources don't actually confirm what Google said.
For content creators and publishers, the strategic response is clear: write content that AI systems can cite accurately and with verifiable sourcing. This is the core principle behind Generative Engine Optimization (GEO) — structuring content so that AI systems can extract, attribute, and cite it correctly.
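The NYT piece doesn't prescribe specific tactics. One common GEO practice, sketched below, is publishing machine-readable claim-and-source metadata. Schema.org's ClaimReview type is a real vocabulary for this, though whether any given AI system consumes it is not guaranteed, and the URL here is hypothetical:

```python
import json

# Sketch: machine-readable sourcing via schema.org markup. ClaimReview is
# a real schema.org type; whether AI systems actually honor it when
# composing an overview is an open question, not a documented guarantee.
claim_markup = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    "claimReviewed": "The Bob Marley Museum opened on May 11, 1986.",
    "itemReviewed": {"@type": "Claim", "datePublished": "1986-05-11"},
    "reviewRating": {"@type": "Rating", "ratingValue": 5, "bestRating": 5},
    "url": "https://example.com/bob-marley-museum",  # hypothetical URL
}

# Embed as JSON-LD in the page <head> so crawlers can attribute the claim.
print(f'<script type="application/ld+json">{json.dumps(claim_markup)}</script>')
```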
How to Protect Yourself as a User
- Always click through on critical queries. Medical, legal, financial, or time-sensitive factual queries should never be accepted from AI Overviews alone.
- Check the actual sources. If a source is a Facebook post or Reddit thread, treat it as anecdotal, not authoritative.
- Use AI tools designed for research. Dedicated AI research tools like Happycapy (with Perplexity-style search + Claude/GPT-5.4) provide more transparent sourcing than AI Overviews.
- Be especially skeptical for health queries. Google itself has removed AI Overviews from some medical search results after documented harms.
Google's Response
Google disputes the Oumi benchmark methodology, arguing it contains errors and doesn't reflect real-world query distribution. The company maintains that AI Overviews are more accurate than the base model because they query Google Search before generating responses — a retrieval-augmented approach that grounds answers in indexed content.
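In outline, the retrieval-augmented pattern Google describes looks roughly like the sketch below. This is a generic RAG skeleton, not Google's actual pipeline; `search_index` and `llm` are hypothetical stand-ins:

```python
# Generic retrieval-augmented generation (RAG) sketch -- the pattern Google
# describes, not its actual pipeline. `search_index` and `llm` are stand-ins.
def answer_with_retrieval(query: str, search_index, llm) -> dict:
    # 1. Retrieve: pull top-ranked documents for the query from the index.
    docs = search_index.search(query, top_k=5)

    # 2. Generate: condition the model on the retrieved passages.
    context = "\n\n".join(d.snippet for d in docs)
    prompt = (
        f"Answer using ONLY the sources below. Cite which source supports "
        f"each claim.\n\nSources:\n{context}\n\nQuestion: {query}"
    )
    answer = llm.generate(prompt)

    # 3. Attribute: the "ungrounded" failure mode is exactly when `answer`
    #    contains claims that the cited `docs` do not actually support.
    return {"answer": answer, "sources": [d.url for d in docs]}
```

Nothing in this loop guarantees the generated answer contains only claims the retrieved documents support, which is where the ungrounded problem enters.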
Google has not committed to a specific accuracy target, a regular public accuracy audit, or changes to how AI Overviews handle the ungrounded sourcing problem. The company's position is that the disclaimer (“AI can make mistakes, so double-check responses”) is sufficient disclosure.
Independent experts cited in the NYT investigation disagree. Pratik Verma of Okahu noted that Google's technology “is comparable to other leading AI systems” — which is itself an indictment, given that all leading AI systems hallucinate at meaningful rates.
Happycapy combines Claude Opus 4.6, GPT-5.4, and Gemini 3.1 in one workspace. Run deep research tasks, compare answers across models, and verify sources — no AI Overview guesswork.
Try Happycapy Free
Frequently Asked Questions
How accurate are Google AI Overviews?
External analysis (Oumi, using SimpleQA benchmark) puts accuracy at 91% after the Gemini 3 upgrade. Google's own internal analysis found AI Overviews are wrong 28% of the time. The gap reflects different methodologies. Both figures are material given that Google processes over 5 trillion searches per year.
What does “ungrounded” mean in the context of AI Overviews?
An “ungrounded” AI Overview answer is one that appears correct but whose cited sources do not fully support the claim. As of February 2026, over 56% of technically correct AI Overviews were ungrounded — meaning the answer may be right, but the evidence trail is broken.
Has Google removed AI Overviews for any topics?
Yes. Following a January 2026 Guardian investigation that documented dangerous health misinformation — including incorrect guidance for cancer and liver disease patients — Google removed AI Overviews from certain health queries. The full list of affected queries has not been made public.
Should I trust Google AI Overviews?
AI Overviews are useful for low-stakes general queries. For medical, legal, financial, or factually critical questions, always click through to verify the source, and check that the source actually supports the claim. More than half of correct AI Overview answers cite sources that don't fully back them up.