HappycapyGuide

This article contains affiliate links. We may earn a commission at no extra cost to you if you sign up through our links.

Legal & Policy

Britannica and Merriam-Webster Sue OpenAI: What 100,000 Scraped Articles Mean for ChatGPT Users

March 30, 2026  ·  Happycapy Guide

TL;DR
On March 13, 2026, Encyclopaedia Britannica and Merriam-Webster filed a federal copyright lawsuit against OpenAI in Manhattan, alleging ChatGPT was trained on nearly 100,000 reference articles without permission and sometimes reproduces them word-for-word. OpenAI is defending with a fair use argument. The outcome could determine whether AI companies owe licensing fees to publishers, and it could reshape how much the AI tools you rely on cost to run.
~100K: Britannica articles allegedly scraped for training
$0: licensing fees paid to publishers (alleged)
2: publishers named in the lawsuit (Britannica + Merriam-Webster)
3+: active AI copyright suits in 2026 across U.S. courts

What Happened

On March 13, 2026, Encyclopaedia Britannica and its wholly owned subsidiary Merriam-Webster filed a lawsuit in the U.S. District Court for the Southern District of New York. The complaint names OpenAI as the sole defendant and contains two core claims: copyright infringement and trademark infringement.

On copyright, Britannica alleges that OpenAI scraped and used nearly 100,000 of its online articles and Merriam-Webster dictionary entries as training data for ChatGPT without obtaining a license. The complaint cites examples in which ChatGPT outputs closely mirror — and in some cases reproduce verbatim — passages from Britannica's encyclopedic entries. The publishers argue this directly harms their web traffic and paid subscription revenue by giving users access to the content without visiting the source.

On trademark, the lawsuit alleges that ChatGPT occasionally fabricates information and attributes it to Britannica — a form of AI “hallucination” that Britannica says violates the Lanham Act by suggesting unauthorized endorsement and damaging the publisher's reputation for factual accuracy.

This is not Britannica's first AI lawsuit. In 2025, Britannica filed a similar suit against Perplexity AI, which remains ongoing. The March 2026 OpenAI complaint follows the same legal strategy but targets a far larger defendant with deeper pockets and a more prominent public profile.

OpenAI's Defense: Fair Use

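OpenAI's position is consistent with its defense in the New York Times lawsuit filed in 2023 and in several other copyright cases still working through the courts. The company argues that training on publicly available content qualifies as transformative fair use under U.S. copyright law, that AI systems learn statistical patterns rather than “copying” works in a traditional sense, and that requiring training-data licenses would make large-scale AI development prohibitively expensive.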

U.S. courts have not yet ruled definitively on fair use as applied to AI training. The New York Times case is expected to produce the first major ruling, and that decision is likely to shape how Britannica's case and the other pending suits are argued and decided.

Why This Matters for Everyday AI Users

If OpenAI loses, your AI tools get more expensive

If courts rule that training on copyrighted material requires a license, OpenAI and every other AI company using similar data could owe damages for past training as well as licensing or royalty payments going forward. Those costs would likely reach consumers through higher subscription prices, or force a fundamental change in what data future models can be trained on.

The verbatim reproduction concern is real

Britannica's complaint specifically calls out instances where ChatGPT produces near-verbatim text from Britannica articles. If you use ChatGPT to research topics covered by major reference works, the outputs you are reading may contain copyrighted prose. This matters most for professional use cases where the provenance of text is important — journalism, academic research, legal filings.

Trademark hallucinations are a separate legal risk

The trademark claim is less commonly discussed but arguably more immediately damaging to Britannica. When ChatGPT generates a false fact and then, asked “where did you get this?”, points to Britannica or Merriam-Webster, it undermines those brands' long-standing reputation for accuracy. This type of misattribution is hard to prevent at scale and harder still to detect.

AI Copyright Lawsuit Tracker (2023–2026)

| Plaintiff | Defendant | Filed | Core Claim | Status |
| --- | --- | --- | --- | --- |
| The New York Times | OpenAI + Microsoft | Dec 2023 | Training on millions of NYT articles; verbatim reproduction | Pre-trial discovery |
| Getty Images | Stability AI | Jan 2023 | 12M+ copyrighted images in training set | Ongoing |
| Authors Guild (class action) | OpenAI | Sep 2023 | Fiction used to train ChatGPT | Certified class |
| Britannica / Merriam-Webster | Perplexity AI | 2025 | Reference articles as training + verbatim answers | Ongoing |
| Britannica / Merriam-Webster | OpenAI | Mar 13, 2026 | ~100K articles + trademark misattribution | Filed, active |
| Record labels (UMG, Sony, WMG) | Suno + Udio | 2024 | Copyrighted music in audio model training | Settled (undisclosed) |

What about Anthropic? Anthropic, the company behind Claude (and the AI powering Happycapy), uses Constitutional AI training methods, which govern model behavior rather than data sourcing, and says its data-sourcing policies prioritize consent and licensing where possible. That has not kept it out of court, however: music publishers including Universal Music Group sued Anthropic in October 2023 over song lyrics, and a group of authors filed a class action against it in 2024.

Frequently Asked Questions

What is the Britannica and Merriam-Webster lawsuit against OpenAI about?

Encyclopaedia Britannica and its subsidiary Merriam-Webster filed a lawsuit against OpenAI on March 13, 2026, in Manhattan federal court. They allege that OpenAI scraped nearly 100,000 of their copyrighted reference articles and dictionary entries without permission to train ChatGPT, and that the model sometimes reproduces this content verbatim. The lawsuit also includes trademark infringement claims, arguing that AI hallucinations falsely attributed to Britannica damage the publisher's reputation for accuracy.

Is ChatGPT safe to use given these copyright lawsuits?

For individual users, these lawsuits are aimed at OpenAI, not at the people who use ChatGPT. However, they raise legitimate questions about whether AI-generated content derived from disputed training data could face commercial use restrictions in the future, and whether ChatGPT responses accurately represent their sources. Users relying on AI for research, journalism, or published content should be especially aware of the verbatim reproduction concern and check output against the original source before republishing it.

What is OpenAI's defense in the Britannica lawsuit?

OpenAI argues that training on publicly available content qualifies as transformative fair use under U.S. copyright law — the same defense it is using in the New York Times case and other pending copyright suits. The company contends that AI systems learn statistical patterns rather than “copying” works in a traditional sense, and that requiring training data licenses would make large-scale AI development prohibitively expensive.

Which other publishers have sued AI companies for training data?

The list is growing. Getty Images sued Stability AI in January 2023. The Authors Guild filed a class action against OpenAI in September 2023. The New York Times sued OpenAI and Microsoft in December 2023. Major record labels sued Suno and Udio in 2024 over audio model training, and those cases have since settled. Britannica sued Perplexity AI in 2025. Each case is building a legal record that courts will eventually draw on when setting binding precedent on fair use in AI training.
