NEW YORK — The New York Times has sued OpenAI and Microsoft for copyright infringement, alleging that the companies’ artificial intelligence technology illegally copied millions of Times articles to train ChatGPT and other services to provide people with information — technology that now competes with the Times.
The complaint is the latest in a string of lawsuits that seek to limit the alleged scraping of wide swaths of content from across the internet, without compensation, to train so-called large language artificial intelligence models. Actors, writers, journalists and other creative types who post their works on the internet fear that AI will learn from their material and power competing chatbots and other sources of information without proper compensation.
But the Times’ suit is the first among major news publishers to take on OpenAI and Microsoft, the most recognizable AI brands.
In a complaint filed Wednesday, the Times said that Microsoft and OpenAI’s “unlawful use of The Times’s work to create artificial intelligence products that compete with it threatens The Times’s ability to provide that service.” The paper noted that OpenAI and Microsoft used other sources in their “widescale copying,” but “they gave Times content particular emphasis,” seeking “to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment.”
Microsoft and OpenAI did not immediately respond to a request for comment on the lawsuit.
The Times, in its complaint, said that it objected when it discovered months ago that its work had been used to train the companies’ large language models. The Times said it began negotiating with OpenAI and Microsoft in April to receive fair compensation and set the terms of an agreement.
But the Times alleges it has been unable to reach a resolution with the companies. Microsoft and OpenAI claim that their use of the Times’ works qualifies as “fair use,” which allows copyrighted material to be used for a “transformative purpose,” the complaint states.
“There is nothing ‘transformative’ about using The Times’s content without payment to create products that substitute for The Times and steal audiences away from it,” the Times said in its complaint. “Because the outputs of Defendants’ GenAI models compete with and closely mimic the inputs used to train them, copying Times works for that purpose is not fair use.”
The Times is among a number of leading newsrooms, including CNN, that earlier this year added code to their websites to block OpenAI’s web crawler, GPTBot, from scanning their platforms for content.
The Times claims that because the AI tools have been trained on its content, they can “generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style, as demonstrated by scores of examples … These tools also wrongly attribute false information to The Times,” the complaint states.
The news outlet also alleges that Microsoft’s Bing search engine, which was upgraded earlier this year with OpenAI’s technology, “copies and categorizes” Times content to produce longer and more detailed responses than traditional search engines.
“By providing Times content without The Times’s permission or authorization, Defendants’ tools undermine and damage The Times’s relationship with its readers and deprive The Times of subscription, licensing, advertising, and affiliate revenue,” the complaint states.
New York Times Executive Vice President and General Counsel Diane Brayton told the outlet’s staffers in a memo Wednesday morning, “We recognize the potential of [generative AI] for the public and for journalism.”
“But at the same time, we believe that the success of GenAI and the companies developing it need not come at the expense of journalistic institutions,” according to the memo, which was obtained by CNN. “The use of our work to create GenAI tools must come with permission and an agreement that reflects the fair value of that work, as the law provides.”
With its lawsuit, the Times is seeking unspecified monetary damages, as well as a permanent injunction that would prevent Microsoft and OpenAI from continuing the alleged infringement. The Times is also seeking the “destruction” of GPT and any other AI models or training sets that incorporate its journalism.
The-CNN-Wire™ & © 2023 Cable News Network, Inc., a Warner Bros. Discovery Company. All rights reserved.