Generative AI. Will copyright infringement lawsuits bury its future?

Many indications suggest that 2024 will be a crucial year for the further development of generative AI models trained on publicly available materials. Courts will now decide whether, and to what extent, this technology infringes copyright.


Summary

  • Generative artificial intelligence (genAI) based on large language models (LLMs) has triggered numerous lawsuits over copyright and the use of human creativity to train AI tools.
  • Getty Images sued Stability AI for intellectual property infringement, and music labels led by Universal Music sued Anthropic for using song lyrics without consent to train its AI model.
  • OpenAI, backed by Microsoft, is facing several lawsuits, including a class action filed by the Authors Guild on behalf of American writers and a suit from The New York Times alleging the illegal use of its articles to build the genAI products ChatGPT and Copilot.
  • The New York Times may demand compensation of up to $450 billion and the removal of all language models built on the data in question.
  • OpenAI has already signed paid content-licensing agreements with the Associated Press and the publisher Axel Springer, and argues that training large language models like ChatGPT is impossible without access to copyrighted content.
  • The New York Times lawsuit is seen as an existential threat to OpenAI and ChatGPT, and could reshape industry practices and the business models of companies building large language models.
  • The New York Times is the first media company to sue the creators of a large language model for copyright infringement, potentially setting a precedent for future cases.

Wave of lawsuits against companies creating AI tools

Last year's boom in generative artificial intelligence (genAI) based on large language models (LLMs) has raised fundamental questions about the practical use of these technologies. Chief among them are the limits of copyright and the extent to which companies like OpenAI may use human creativity to train their tools. The importance of these issues for the entire market is evident in the succession of lawsuits filed over the past twelve months.

To mention only the most prominent: in January 2023, the photo agency Getty Images sued Stability AI, creator of the Stable Diffusion tool, in the UK for intellectual property infringement. The court allowed the case to proceed, and it will come to trial in the coming months. Then, in October, music labels led by Universal Music sued another giant, Anthropic, in the USA over "countless cases" of song lyrics being used without the authors' consent to train its AI model Claude.

Several lawsuits also await the Microsoft-backed OpenAI. Leading contemporary American writers, including John Grisham, Jonathan Franzen, George Saunders, Jodi Picoult, and George R.R. Martin, accuse the company of "systematically stealing their work on a massive scale". A class action filed on their behalf by the Authors Guild, the oldest and largest professional organization of writers in the USA, was submitted to the federal court in Manhattan in early September.


The New York Times versus OpenAI. A lawsuit that will decide the future of artificial intelligence?

Probably the most important lawsuit against OpenAI and Microsoft was initiated by the American newspaper The New York Times. In the suit, filed on December 27, 2023, the newspaper's representatives argue that the companies illegally used its articles to create their generative AI products, ChatGPT and Copilot.

The complaint states, among other things, that OpenAI obtained "millions of copyrighted press articles, in-depth research, opinions, reviews, guides and other materials, trying to free-ride on the newspaper's vast resources".

The New York Times also argues that OpenAI profits from the illegally used content and contributes to the newspaper's "multi-billion-dollar losses". By preliminary estimates, the newspaper may demand up to $450 billion in compensation and seek the removal of all language models built on the data in question.

That spells serious trouble for OpenAI, especially since American legal experts widely believe the NYT's lawsuit rests on solid arguments.


The New York Times versus OpenAI. OpenAI's line of defense

In response to The New York Times's allegations, OpenAI representatives emphasize that the company "wants to support a healthy information ecosystem and co-create conditions beneficial to both parties" and that it "respects the copyright of creators and content owners". As evidence, they point to paid content-licensing agreements concluded at the end of last year with, among others, the Associated Press and the publisher Axel Springer.

"We support journalism, we cooperate with news organizations, and we believe The New York Times lawsuit is unfounded" - reads a post on the company's blog.

Court documents obtained by the British newspaper The Telegraph also show that, according to OpenAI, training large language models such as ChatGPT is impossible without access to copyrighted content.

"Because copyright today covers virtually every type of human expression - including blog posts, photographs, forum posts, software code snippets, and government documents - it would be impossible to train today's leading artificial intelligence models without using copyrighted materials" - reads OpenAI's submission obtained by The Telegraph.

In this context, the lawsuit from NYT appears to be an existential threat to OpenAI and ChatGPT.

"The outcome of this extremely important case could completely change future industry practices. Free access to copyrighted materials appears to be the cornerstone of the business model of companies building LLMs, while earning from the use of those materials is the lifeblood of media companies" - writes Daniel Vithoulkas, a lawyer at Monash University in Melbourne, in an independent analysis.

At the same time, it is worth noting that The New York Times is the first entity from the media industry to sue the creators of a large language model for copyright infringement. It probably won't be the last, as other major outlets have voiced serious concern about how generative AI uses their content. The stakes are thus twofold: on one side, the fate of news media; on the other, the future of a technology that could revolutionize the world in the coming years.
