Lawsuit: OpenAI, Microsoft Misused “Tens of Thousands” of Copyrighted Nonfiction Works

November 22, 2023

Attorneys for the lead plaintiff claim that Microsoft should be found partially liable for OpenAI’s alleged copyright infringement, given the extent and scope of its investments in OpenAI.

A recently filed lawsuit accuses OpenAI and Microsoft of misusing the works of nonfiction authors to train artificial intelligence models.

According to Reuters, the lawsuit alleges that OpenAI copied, analyzed, and repurposed the texts of tens of thousands of books. These texts were used to train language models, like OpenAI’s popular ChatGPT, to respond to human queries and other inputs.

However, Hollywood Reporter editor Julian Sancton—lead plaintiff in the proposed class action—says that neither OpenAI nor Microsoft ever obtained permission from authors.

Sancton, adds the Reporter, said that Microsoft has been “deeply involved in the training, development, and commercialization” of OpenAI’s GPT-based products, and may therefore be liable for copyright infringement.

Microsoft has, for instance, provided OpenAI with specialized computing systems, which OpenAI uses to train its language models.

“Microsoft’s Azure provided the cloud computing systems that powered the training process, and continues to power OpenAI’s operations to this date,” the lawsuit alleges. “Without these bespoke computing systems, OpenAI would not have been able to execute and profit from the mass copyright infringement alleged herein.”

Attorneys for Sancton and the class noted that Microsoft C.E.O. Satya Nadella recently told CNBC that “beneath what OpenAI is putting out as large language models, remember, the heavy lifting was done by the Azure team to build the compute infrastructure.”

The lawsuit suggests that Nadella’s comments referred to Microsoft’s involvement in developing OpenAI’s computing system.

In total, Sancton estimates that Microsoft has invested roughly $13 billion in OpenAI.

The Hollywood Reporter notes that, unlike similar claims against OpenAI, Sancton’s lawsuit states that the defendant corporations “directly” made tens of thousands of unlicensed copies of clearly copyrighted works.

“While OpenAI was responsible for designing the calibration and fine-tuning of the GPT models—and thus, the largescale copying of this copyrighted material involved in generating a model programmed to accurately mimic Plaintiff’s and others’ styles—Microsoft built and operated the computer system that enabled this unlicensed copying in the first place,” said Sancton’s attorneys, who argue that the scope of Microsoft’s investment in, and collaboration with, OpenAI should have made it apparent that copyrighted works were being abused.

Sancton has since said that he, along with other authors, finds it troubling that large companies like Microsoft and OpenAI can use their words and ideas without permission.

“It is concerning for anyone who writes for a living to see our work be used without permission or compensation to build large language models that capitalize on our expression for profit,” Sancton said.


New Lawsuit Ropes Microsoft Into OpenAI’s Legal Battle With Authors Over Training Data

Nonfiction authors sue OpenAI, Microsoft for copyright infringement

OpenAI, Microsoft hit with new author copyright lawsuit over AI training
