Judge Says AI Training is Fair Use, But Questions Data Storage Practices

Ryan J. Farrick — June 24, 2025

Computer keyboard with blue-lit AI key; image by BoliviaInteligente, via Unsplash.com.

In his decision, U.S. District Judge William Alsup said that, even if Anthropic abided by the law in the training of its LLMs, it most likely ran afoul of copyright protections when it stored the plaintiffs’ works in a “central library.”

A San Francisco-based federal judge has determined that Anthropic’s use of copyright-protected books in the training of its artificial intelligence system is legal under U.S. law.

In his Monday ruling, U.S. District Judge William Alsup said that Anthropic made “fair use” of books published by the plaintiffs, who include authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson.

Anthropic’s signature program, Claude, is a type of large language model (LLM).

Like other LLMs, including competitor OpenAI’s ChatGPT, Claude is trained on large amounts of text pulled from various sources. During training, the model is repeatedly refined until it can generate human-like responses.

In a court hearing, attorneys for Anthropic said that using copyright-protected material in this manner is permissible because U.S. law “not only allows, but encourages” the training of artificial intelligence models. Anthropic also claimed that the sole purpose of copying the plaintiffs’ work was to study their “writing, extract uncopyrightable information from it, and use what it learned to create revolutionary technology.”

Alsup also seemed amenable to Anthropic’s argument that its training processes are “exceedingly transformative.”

Gavel on copy of lawsuit; image by Wirestock, via Freepik.com.

“Like any reader aspiring to be a writer, Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them—but to turn a hard corner and create something different,” he wrote.

However, Alsup’s ruling was not an unqualified win for Anthropic.

In his decision, Alsup said that, even if Anthropic abided by the law in the training of its LLMs, it most likely ran afoul of copyright protections when it stored the plaintiffs’ works in a “central library.”

Anthropic, like many of its competitors, is facing similar claims filed by other authors, media outlets, and for-profit businesses. Social media platform Reddit, for instance, is currently litigating its own claims against Anthropic, saying that it illegally “scraped” the comments of millions of Reddit users to train Claude.

“AI companies should not be allowed to scrape information and content from people without clear limitations on how they can use that data,” Reddit chief legal officer Ben Lee said in a statement.

Lee noted that Reddit is not opposed to artificial intelligence and has, in the past, entered licensing agreements with companies like Google and OpenAI. Those agreements, Lee said, “enable us to enforce meaningful protections for our users, including the right to delete your content, user privacy protections, and preventing users from being spammed using this content.”

Sources

Anthropic wins key US ruling on AI training in authors’ copyright lawsuit

Reddit sues AI company Anthropic for allegedly ‘scraping’ user comments to train chatbot Claude
