18.01.2024 | KPMG Law Insights

AI and copyright – what is permitted when using LLMs?

By Dr. Anna-Kristine Wipper, Dr. Thomas Beyer

A few months ago, new players entered the legal scene and have since sparked numerous legal discussions: large language models (LLMs), best known through products such as ChatGPT, Azure OpenAI and PaLM 2. Generative AI raises many questions, particularly in terms of copyright law: When do texts generated with LLMs infringe copyrights? Can copyrights arise in such texts? And does copyright law permit the reproduction and storage of data for the training of LLMs?

LLMs generate texts. They have been trained on large amounts of data and create texts by predicting, on the basis of that training data, the next elements of the text being generated. The LLM calculates the probability of word sequences (or, more precisely, sequences of tokens) and extends them step by step into a text. The answers, i.e. the output of the LLM, are based on the most probable word sequence, which is calculated from the words of the input (the prompt).
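The prediction step described above can be illustrated with a deliberately tiny model. The following is a minimal sketch, assuming a toy bigram model that only looks at the previous word; real LLMs use neural networks over much longer contexts, and the training sentence and all function names here are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count how often each token follows another in the training text."""
    counts = defaultdict(Counter)
    tokens = corpus.lower().split()
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts

def next_token_probabilities(model, token):
    """Turn the raw follow-up counts for one token into probabilities."""
    followers = model.get(token, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in followers.items()}

def generate(model, prompt, max_tokens=5):
    """Greedily extend a non-empty prompt with the most probable next token."""
    output = prompt.lower().split()
    for _ in range(max_tokens):
        probs = next_token_probabilities(model, output[-1])
        if not probs:
            break  # the last token never appeared with a successor in training
        output.append(max(probs, key=probs.get))
    return " ".join(output)

# Invented training text, used only to make the example runnable.
TRAINING_TEXT = "the court decides the case and the court explains the ruling"
model = train_bigram_model(TRAINING_TEXT)
```

In this toy corpus, "the" is followed by "court" in two of four cases, so `generate(model, "the", max_tokens=1)` continues the prompt with "court". The sketch also shows why output can coincide with training text: the model can only recombine sequences it has seen.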

Can the output infringe copyrights?

Good news: LLMs are not designed to plagiarize. In contrast to Internet search engines, they do not search for existing texts and display them, but generate new texts. However, depending on the instructions that users give the LLM, the output may still infringe copyrights. This is because there is a risk that the AI will generate output that is identical to a copyrighted work on which it has been trained.
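The risk of identical output can at least be screened for technically. The following is a minimal sketch, assuming that a protected reference text is available for comparison; it measures which share of the output's word sequences also occurs verbatim in that text. The function names and the choice of five-word sequences are assumptions made for illustration. A high score is only a signal for human review, not a legal test of infringement.

```python
def word_ngrams(text, n):
    """Return the set of all sequences of n consecutive words in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def verbatim_overlap(output, protected_text, n=5):
    """Share of the output's n-word sequences that also occur in the protected text."""
    output_grams = word_ngrams(output, n)
    if not output_grams:
        return 0.0  # output is shorter than n words, nothing to compare
    shared = output_grams & word_ngrams(protected_text, n)
    return len(shared) / len(output_grams)
```

A value close to 1.0 means the output largely repeats the reference text word for word; a value of 0.0 means no sequence of `n` words was copied verbatim. Translations or paraphrases would not be detected by this check.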

In the following cases, the output of generative AI may infringe copyrights:

if the AI is asked to reproduce a specific text, for example song lyrics that are not yet in the public domain,
if the AI merely translates copyrighted texts into another language, and
if the AI reproduces parts of a text that enjoy their own copyright protection.

Since technical, informative texts are usually not protected by copyright, copyright infringement is less likely when processing such texts. Theoretically, however, it is still possible: if the author of a technical text has made creative choices in writing it, copyright may also subsist in a factual text. If the AI adopts these creative elements, this would be an infringement.

Can copyrights arise from AI-generated texts?

The next question that arises in connection with GenAI and copyright is that of the copyright protection of the generated output: Can AI be used to create a work that enjoys copyright protection? And if so, who is entitled to the copyright? Here, too, there is no universal answer.

Under German law, only personal intellectual creations are eligible for protection. The creation must result from a person’s train of thought and be the result of a purposeful intellectual creative process. Accidental results, such as unintentional splashes of paint or a photo taken by a monkey, cannot claim copyright protection. Under German law, only a person with human intelligence and not an AI can be considered an author, and only a person can create copyright-protected works. It is crucial that authors are free in their creative decisions.

In the way LLMs are typically used today, the human users of the AI usually do not make sufficiently creative decisions. The written prompt may itself be a copyrighted work, but this does not lead to protection of the output generated by the AI. As a rule, users have no significant influence on the machine execution, i.e. the actual production of the text.

However, there may be cases in which a different assessment is justified, namely when users operate the LLM as a tool that merely implements their personal creative intent. A vivid comparison is the use of a paintbrush. If the brush merely rolls over the paper, for example because it is dropped, no copyright-protected work is created, even if paint remains on the paper. However, if a painter deliberately swings the brush in a certain way, a protected painting can be created. If AI is used in a comparable way, a copyright-protected work can indeed be created.

This immediately raises the question of who is the author of this work and who owns the rights to it. Various solutions are possible here. It could be the user of the AI alone, or it could be a joint work between the user and the AI programmer. This question will certainly keep copyright experts in the various legal systems busy for some time to come.

May data be duplicated and stored for AI training purposes?

Another key question: Does copyright law permit the reproduction and storage of data for the training of AI systems? And if so, how long can this data be stored?

Since 2021, the reproduction of lawfully accessible works for the purpose of text and data mining has been permitted under Section 44b para. 2 UrhG. Text and data mining means that digital or digitized works are analyzed automatically in order to extract information, in particular patterns, trends and correlations. For this purpose, the data is stored, i.e. duplicated. However, the data must be deleted when it is no longer required for text and data mining. But does this also apply to training an AI? There is still no case law on this issue. The explanatory memorandum to the law does indicate that Section 44b UrhG generally permits the reproduction and storage of lawfully accessible data for AI training. Even if the legislator probably did not have large language models in mind at the time, we believe that they are covered. This is because the provision also takes appropriate account of the interests of authors, as they can reserve the right to such use and thereby prohibit it.
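The sequence described above (copy, analyze, delete, while respecting a reservation of use) can be pictured as a small pipeline. The following is a minimal sketch under stated assumptions: the `Work` class, its `use_reserved` flag and the word-frequency analysis are hypothetical stand-ins. How a reservation is expressed in practice, and at what point data is "no longer required", are exactly the open questions discussed in this article.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Work:
    title: str
    text: str
    lawfully_accessible: bool
    use_reserved: bool  # hypothetical flag: rights holder has reserved text and data mining

def mine_corpus(works):
    """Copy eligible works, extract word frequencies, then delete the copies."""
    stored_copies = {}  # the temporary reproductions made for the analysis
    for work in works:
        # Only lawfully accessible works without a reservation are copied.
        if work.lawfully_accessible and not work.use_reserved:
            stored_copies[work.title] = work.text

    patterns = Counter()  # stand-in for "patterns, trends and correlations"
    for text in stored_copies.values():
        patterns.update(text.lower().split())

    used_titles = sorted(stored_copies)
    stored_copies.clear()  # assumption: copies are not needed after the analysis
    return patterns, used_titles
```

Only the extracted statistics leave the function; the copies themselves are discarded at the end. Whether a trained AI model may instead justify keeping the data for as long as the model is in operation is not something this sketch can answer.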

However, another question arises: How long may such training data be stored? Is there a time limit after which the data must be deleted, or does the justifying purpose continue as long as the AI is in operation? There is still no definitive answer to this question. It remains to be seen how legislation and case law will develop in these areas in the coming years.

Conclusion

The use of LLMs raises legal questions regarding the possible infringement of copyrights, the creation of new copyrights and the permissibility of reproducing and storing data for the training of LLMs.

In our opinion, the training of AI using lawfully accessible data is permitted, as the interests of copyright holders are adequately taken into account. However, it is unclear how long this data may be stored.

It remains to be seen how the courts will position themselves on these issues and whether the legislator will take further action here.

Contact

Dr. Anna-Kristine Wipper

Partner
Head of Technology Law

Heidestraße 58
10557 Berlin

Tel.: +49 30 530199731
awipper@kpmg-law.com

Dr. Thomas Beyer

Senior Manager

Heidestraße 58
10557 Berlin

Tel.: +49 30 530199822
thomasbeyer@kpmg-law.com

© 2024 KPMG Law Rechtsanwaltsgesellschaft mbH, associated with KPMG AG Wirtschaftsprüfungsgesellschaft, a public limited company under German law and a member of the global KPMG organisation of independent member firms affiliated with KPMG International Limited, a Private English Company Limited by Guarantee. All rights reserved. For more details on the structure of KPMG’s global organisation, please visit https://home.kpmg/governance.

KPMG International does not provide services to clients. No member firm is authorised to bind or contract KPMG International or any other member firm to any third party, just as KPMG International is not authorised to bind or contract any other member firm.