A recently issued court order could give artificial intelligence companies the legal right to train large language models, or LLMs, on copyright-protected works, provided that such materials are obtained lawfully.
According to National Public Radio, the ruling is significant, in part, because it marks the first time that a court has made a definitive ruling on how fair use principles apply to artificial intelligence systems. Filed on behalf of a group of authors, including Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, the lawsuit alleged that Anthropic wrongfully appropriated the contents of digitized books in the training of its signature product, Claude.
The lawsuit indicates that, in training Claude, Anthropic used at least two copyright-protected works by each plaintiff. In some instances, Anthropic went so far as to purchase hardcover books, which it scanned and fed to its LLM.
“Rather than obtaining permission and paying a fair price for the creations it exploits, Anthropic pirated them,” the lawsuit alleged.
But, in a Monday order, Senior U.S. District Judge William Alsup found that Anthropic’s use of the authors’ work falls under fair use doctrine.
In general, fair use doctrine lets third parties utilize copyright-protected works without the holder’s consent for purposes such as teaching, scholarship, and satire. Artificial intelligence companies, from Anthropic to competitors Meta and OpenAI, maintain that the copyright-protected works they use to train their respective LLMs fall under fair use exceptions.

“The training use was a fair use,” Alsup wrote in his order. “The use of books at issue to train Claude and its precursors was exceedingly transformative.”
Alsup also addressed concerns related to Anthropic’s purchase of hardcover books, saying that even this could be considered fair use “because all Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library—without adding new copies, creating new works, or redistributing existing copies.”
However, Alsup noted that Anthropic did not pay for many of the books that it used to train Claude. Anthropic “downloaded for free millions of copyrighted books in digital form from pirate sites on the internet” as part of its effort to “amass a central library of ‘all the books in the world’ to retain ‘forever.’”
Alsup explicitly rejected Anthropic’s argument that “the pirated library copies must be treated as training copies,” and will allow the authors’ piracy-related complaints to proceed to trial if a settlement cannot be reached.
“We will have a trial on the pirated copies used to create Anthropic’s central library and the resulting damages, actual or statutory (including for willfulness),” Alsup said.
Anthropic has praised Alsup’s decision, but said it does not believe any element of the case should proceed to trial.
In a statement to National Public Radio, Anthropic said that, “Consistent with copyright’s purpose in enabling creativity and fostering scientific progress, Anthropic’s large language models are trained upon works not to race ahead and replicate or supplant them, but to turn a hard corner and create something different.”
On the subject of alleged piracy, Anthropic stressed that it only used the authors’ copyright-protected works for a singular and allegedly lawful purpose.
“We believe it’s clear that we acquired books for one purpose only—building large language models—and the court clearly held that use was fair,” Anthropic said.
The authors’ remaining claims of piracy are scheduled to move to trial in December.
Sources
Anthropic wins key AI copyright case, but remains on the hook for using pirated books