The Globalization of Copyright Exceptions for AI Training

Citation: Matthew Sag and Peter K. Yu, The Globalization of Copyright Exceptions for AI Training, 74 Emory Law Journal 1163 (2025)

Summary

The Globalization of Copyright Exceptions for AI Training argues that, despite different legal traditions and local conditions, countries worldwide are converging toward an international equilibrium that allows text and data mining, computational data analysis, and AI training without express authorization in some circumstances. Three forces drive this convergence: the centrality of the idea-expression distinction, global AI competition, and a race to the middle in copyright law reforms.

The article demonstrates an emerging global consensus on copyright exceptions for AI training across jurisdictions with varying legal systems. Through a comprehensive survey of countries including the United States, Israel, Japan, the United Kingdom, the European Union, Singapore, China, and the UAE, we show that while approaches differ in form, ranging from fair use regimes to express exceptions for text and data mining, they converge in substance around allowing nonexpressive uses of copyrighted works for AI training.

We identify three key forces driving this convergence: the universal importance of the idea-expression distinction in copyright law, competitive pressures in the global AI race, and countries’ tendency to adopt middle-path approaches rather than extreme positions. However, we also note potential disruptions to this equilibrium, including ongoing litigation in the United States, licensing deals between AI developers and content providers, and regulatory developments like the EU AI Act.

Why read this article?

The Globalization of Copyright Exceptions for AI Training provides comprehensive coverage of the global landscape of copyright exceptions for AI training as of 2025, offering detailed analysis of how different legal systems have adapted to address machine learning and generative AI.

The article includes systematic comparison tables showing the affordances different jurisdictions provide for AI training, examining factors like commercial versus noncommercial use, lawful access requirements, and opt-out mechanisms. The paper also provides valuable historical context by tracing the evolution of “nonexpressive use” doctrine from earlier technologies like reverse engineering and search engines to current AI applications, and offers practical guidance for policymakers considering copyright reform in this area.

Further Reading

Pamela Samuelson, Christopher Jon Sprigman & Matthew Sag, Comments in Response to the Copyright Office’s Notice of Inquiry on Artificial Intelligence and Copyright (2023) – This comprehensive submission to the U.S. Copyright Office provides detailed analysis of how fair use doctrine should apply to AI training, arguing that most AI training constitutes nonexpressive use and should be protected.

Oren Bracha, The Work of Copyright in the Age of Machine Production, 38 Harvard Journal of Law & Technology 171 (2024) – This article argues that nonexpressive training copies don’t infringe copyright from the outset due to basic principles determining what subject matter lies within copyright’s domain, offering a more fundamental challenge to infringement claims than fair use defenses.

João Pedro Quintais, What is a “Research Organisation” and Why it Matters: From Text and Data Mining to AI Research, 74 GRUR International 397 (2025) – This piece analyzes the concept of “research organization” under the EU’s Digital Single Market Directive and its implications for text and data mining exceptions in AI research contexts.

Alexander Peukert, Copyright in the Artificial Intelligence Act—A Primer, 73 GRUR International 497 (2024) – This article examines the copyright provisions in the EU AI Act, analyzing how they interact with existing copyright exceptions and their potential extraterritorial effects on AI development.