Two Authors Accuse Apple of Illegally Training AI Models with Pirated Books

Apple is now facing legal action from authors who allege that the tech giant used their copyrighted books, without consent or compensation, to train its artificial intelligence systems.

Kylo B

9/7/2025 · 2 min read


The Lawsuit Unveiled

Authors Grady Hendrix and Jennifer Roberson filed a proposed class-action lawsuit in the U.S. District Court for Northern California, claiming that Apple illegally used their works to train its OpenELM large language model. The lawsuit asserts that Apple sourced the content from a known collection of pirated e-books, including the Books3 dataset, which originated from “shadow library” sites like Bibliotik and was linked to the RedPajama data collection. The authors contend that Apple neither credited nor compensated them despite using their works in what could become a highly profitable venture. (Reuters, iThinkDifferent, AppleInsider)

What the Plaintiffs Want

The authors are seeking:

  • Statutory and compensatory damages

  • Restitution of the unfair profits Apple derived

  • Attorneys’ fees

  • Possible destruction of AI models (such as Apple Intelligence or OpenELM) that were trained on the disputed content

Additionally, they request a jury trial and class-action certification to include other affected authors. (iThinkDifferent, AppleInsider)

Apple's Claim of Ethical Training vs. Data Origins

Apple has emphasized its commitment to ethical AI development, pointing to licensing agreements with publishers and to its AppleBot crawler, which respects robots.txt directives. Yet the lawsuit argues that despite these efforts, Apple still relied on data indirectly sourced from pirated materials, a contradiction that lies at the heart of the legal challenge. (AppleInsider, THE DECODER, iThinkDifferent)

Part of a Larger Legal Wave

Apple's case joins a growing number of lawsuits targeting AI companies for training models with copyrighted texts. These cases illustrate the rising tension between AI innovation and the protection of creative works.

The Fair Use Debate

The core legal battleground centers on whether AI training qualifies as fair use, a defense some companies like Apple aim to leverage. However, courts are increasingly skeptical, particularly when pirated or unauthorized content is involved.

In Anthropic's case, while some uses were deemed transformative, the court found that knowingly retaining pirated works was not protected under fair use. (Financial Times, Wikipedia, The Washington Post, Vanity Fair)

If Apple's use of Books3, and its downstream incorporation into OpenELM, is confirmed, it could similarly undermine the company's fair-use defense.

Why It Matters

  • Authors & Publishers: A potential turning point for compensation and control over AI-used content.

  • AI Industry: Pressure to adopt transparent, licensed datasets and to rethink training models.

  • Consumers & Regulators: Sets precedents in intellectual property norms that will shape future AI ethics.

The lawsuit filed by Hendrix and Roberson against Apple underscores a growing concern: that some AI models are built upon creatively generated works without permission or credit. As the suit progresses, its outcome could define new boundaries around AI training, fair use, and the rights of creators, particularly when pirated content is involved. The implications may reverberate far beyond Apple, influencing how AI technologies evolve and respect intellectual property.