Tag Archives: copyright

NAFTA must include fair use commitments

I joined with over seventy international copyright law experts today in calling for NAFTA and other trade negotiators to support a set of balanced copyright principles.

Policies like fair use, online safe harbors, and other exceptions and limitations to copyright permit and encourage access to knowledge, flourishing creativity, and innovation.

The following copyright principles are essential to ensure consumers’ digital rights. Copyright law should:

  • Protect and promote copyright balance, including fair use
  • Provide technology-enabling exceptions, such as for search engines and text- and data-mining
  • Include safe harbor provisions to protect online platforms from users’ infringement
  • Ensure legitimate exceptions for anti-circumvention, such as documentary filmmaking, cybersecurity research, and allowing assistive reading technologies for the blind
  • Adhere to existing multilateral commitments on copyright term
  • Guarantee proportionality and due process in copyright enforcement

Measuring the value of copyright and the value of copyright exceptions is methodologically challenging, but if we use the same criteria that WIPO adopts to estimate the value of copyright, then in the U.S., fair use industries represent 16% of annual GDP and employ 18 million American workers.

The Washington Principles on Copyright Balance in Trade Agreements and the new research on Measuring the Impact of Copyright Balance are located at http://infojustice.org/flexible-use

Text Mining, Non-Expressive Use and the Technological Advantage of Fair Use

On March 29, 2017, I attended a fantastic conference on “Globalizing Fair Use: Exploring the Diffusion of General, Open and Flexible Exceptions in Copyright Law” hosted by American University Washington College of Law’s Program on Information Justice and Intellectual Property. As part of that event we held a webcast Q&A session moderated by Sasha Moss of the R Street Institute. The following is a rough transcript of my comments in response to Sasha’s questions about the legality of the non-expressive use of copyrighted works.

Copyright Questions For the Digital Age

There is no country in the world where simply reading a book and giving someone information about the book, such as its subject or themes, whether it uses particular words or particular combinations of words, the number of words, the number of pages, the ratio of female to male pronouns, etc., would amount to copyright infringement.

Why? Because information about the book is not the book. It is metadata. The question for the digital age is, “Can we use computers to produce that kind of data?” This question is important because although I can read a few books and produce some useful metadata, I can’t read a million books. But a computer can.
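To make the idea of machine-produced metadata concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function name `book_metadata`, the pronoun lists, and the sample sentence are all hypothetical, and the tokenizer is deliberately crude. The point is that the output is a handful of counts and a ratio, not the text itself.

```python
import re
from collections import Counter

# Assumption: short, illustrative pronoun lists; a real study would choose these carefully.
FEMALE = {"she", "her", "hers", "herself"}
MALE = {"he", "him", "his", "himself"}

def book_metadata(text):
    """Return facts about a text without reproducing its expression."""
    # Crude tokenizer: lowercase runs of letters and straight apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    female = sum(counts[w] for w in FEMALE)
    male = sum(counts[w] for w in MALE)
    return {
        "word_count": len(words),
        "distinct_words": len(counts),
        "female_pronouns": female,
        "male_pronouns": male,
        # None when there are no male pronouns, to avoid dividing by zero.
        "female_to_male_ratio": female / male if male else None,
    }

# Hypothetical sample text standing in for a digitized book.
sample = "She gave him her book. He read it, and she smiled."
metadata = book_metadata(sample)
print(metadata)
```

Running the same function over a million digitized books would still yield only numbers of this kind, which is the sense in which the use is non-expressive.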

We have the technology

We have the technology to digitize large collections of books in order to produce data that enables computer scientists, linguists, historians, English professors, and the like, to answer important research questions. The data and the questions it can be used to answer do nothing to communicate the original expression of all those millions of books. However, technically speaking, this kind of digitalization is still copying.

But is this the kind of copying that copyright law should be concerned about? If a tree falls in an empty forest, does it truly make a sound? If something is copied but only read by a computer and the computer only communicates metadata about the work, is that the kind of copying that should amount to copyright infringement?

Text mining is vital for machine learning, automatic translation, and developing the language models

It seems to me that once you phrase the question that way the answer is clear. We all use this amazing technology on a daily basis when we rely on Internet search engines, but text mining is about much more than this. By data mining vast quantities of scientific papers, researchers have been able to identify new treatments for diseases. Text mining has also allowed humanities scholars to identify patterns in vast libraries of literature. Text mining is vital for machine learning, automatic translation, and developing the language models that power dictation software.

Fair use and technological advantage

The United States is a world leader in various applications of text mining, starting with Internet search, but going far beyond that. In the United States, once people realized what was possible they more or less started doing it. If Larry Page and Sergey Brin had had the idea for the Google Internet search engine in Canada, Australia, England, or Germany in the 1990s it would have been crystal-clear that because their search engine relied on making copies of other people’s HTML webpages and there was no realistic way to obtain permission from all those people, building a search engine would be illegal. In countries with a closed list of copyright exceptions and limitations, or with fair dealing provisions that are tied to specific narrowly defined purposes, a lawyer would have looked at the list and said, “I don’t see Internet search or data mining on that list, so you can’t do it.”

The fair use doctrine reinforces copyright rather than negating it

In the United States, we have the fair use doctrine, which means that the list is not closed. In the United States, the fair use doctrine means you at least get a chance to explain why your particular use of a copyrighted work is for a purpose that promotes the goals of copyright, is reasonable in light of that purpose, and is unlikely to harm the interests of copyright owners. The fair use doctrine reinforces copyright rather than negating it; fair use doesn’t mean that you get to do whatever you want. Fair use is a system for determining how copyright should apply in new situations. That is especially important when the law was written decades ago and society and technology are changing fast.

Without something like fair use, other countries can only follow the United States

Without something like fair use, other countries can only follow the United States. Non-expressive uses of copyrighted works such as text mining, building an Internet search engine, or running plagiarism detection software have all been held to be fair use in the United States and are slowly becoming more accepted around the world. Of course, now that it is readily apparent that these activities are immensely beneficial and entirely non-prejudicial to the interests of copyright owners we could probably write some specific amendments to the copyright act to make them legal. The problem is, we didn’t know this two decades ago when we actually needed those rules. I don’t know what the next thing that we don’t know is, but I do know that experience has shown that the flexibility of the fair use doctrine (which has been part of copyright law virtually since the English Statute of Anne in 1710, by the way) has worked better than a system of closed lists.

The fair use doctrine is a real source of competitive advantage for technologists and academic researchers in the United States. Right now, there are technologies being developed and research being done in the United States that either can’t be done in other countries, or can only be done by particular people subject to various arbitrary restrictions. Whether it’s Internet search, digital humanities research, machine learning or cloud computing, other countries have followed the United States in adopting technologies that make non-expressive use of copyrighted works, because some of the copyright risks begin to look less daunting once the practice has become accepted. The Europeans, for example, are pretty sure building a search engine must be legal, but they can’t quite agree why. But the thing to understand is that you can follow this way but you can never lead. It’s much harder to do the new thing if by the letter of the law it is illegal and you have no forum to argue that it should be allowed.

The future doesn’t have a lobby group

Of course, that’s not quite true, you have one forum … you can spend a vast amount of money on lobbyists and go to the government, go to Congress and try to get some favorable rules written. But even if that is successful from time to time, those rules have a particular character. A company that spends millions of dollars on a lobbying campaign to change the law is always going to try and make sure that those new rules only benefit its business. Special interests will get some laws changed, but usually in ways that disadvantage their competitors or exclude alternative technologies that might one day compete with them. The fundamental problem with relying on static lists of copyright exceptions and lobbying to get those lists revised as needed is that the future doesn’t have a lobby group.

If you would like to read more about these topics:

Loyola is hosting the Society for Economic Research on Copyright Issues Annual Meeting Today

The SERCI Annual Congress 2016 is being held at Loyola University Chicago School of Law, Chicago, 7-8th July and is co-hosted by University of Illinois College of Law.

The Society for Economic Research on Copyright Issues or SERCI was established in 2001 to provide a solid academic platform for the application of economic theory to copyright policy.

The complete program is posted online at http://www.serci.org/congress.htm.

My slides for my presentation on empirical studies of copyright litigation are available here.

CREATIVE DIGITAL ARTS — Event Announcement

CREATIVE DIGITAL ARTS FREE PUBLIC EVENT

“CREATIVE DIGITAL ARTS: Copyrights and Pathways to Success” is a free public event that has been organized by faculty at Loyola Law School Chicago, Northwestern Law School and Columbia College and a group of Chicago lawyers and business leaders. The event is on Tuesday, April 26, 2016 from 5-8pm at Columbia College.
The CREATIVE DIGITAL ARTS event will feature photographers, digital strategists, intellectual property attorneys and law professors. Our speakers will address the essentials of copyright law for today’s artists and lessons from working artists and professionals in creative industries. The event will conclude with a pizza reception and a chance for one-on-one Q&A with the speakers.
A link to the online registration and further details are included in the attached flyer (Creative Digital Arts (World IP Day)). Email WorldIPDayChicago@gmail.com with questions.

(4/4) 2015 Data on the geographic distribution of US IP litigation

2015 Update

This is the fourth and final post in a series discussing the 2015 Update to the data in my forthcoming article, IP Litigation in United States District Courts: 1994 to 2014 (Iowa Law Review, forthcoming 2016). You can read the 2015 Update in serial form in the posts that follow, or you can download the entire update as a pdf file from ssrn.com. Suggested citation: Matthew Sag, IP Litigation in United States District Courts: 2015 Update (January 5, 2016), available at SSRN: http://ssrn.com/abstract=2711326.

The previous post discussed recent trends in patent litigation and the true nature of the patent litigation explosion. This post concludes with an update to the data concerning the geographic distribution of copyright, patent and trademark litigation in US district courts.

4. The geographic distribution of copyright, patent and trademark litigation

In IP Litigation in United States District Courts: 1994 to 2014, I discussed at length the geographic distribution of copyright, patent and trademark litigation. I have updated the key figures and tables from that discussion below. Figure 6 (below) illustrates how the copyright, patent and trademark litigation rankings of selected districts have varied from 1994 to 2015.

Figure 6 Copyright, Patent and Trademark Litigation Rankings by District 1994—2014


Source: Administrative Office of the U.S. Courts, PACER records, 1994—2015.

The associated tables are not particularly easy to read in this format, but I have included them for completeness. They are available in text form in the pdf version of this Update; see Matthew Sag, IP Litigation in United States District Courts: 2015 Update (January 5, 2016) (http://ssrn.com/abstract=2711326).

[Three table images associated with Figure 6]

End of post

(2/4) 2015 Data on Copyright Litigation in the US

2015 Update

This is the second in a series of posts discussing the 2015 Update to the data in my forthcoming article, IP Litigation in United States District Courts: 1994 to 2014 (Iowa Law Review, forthcoming 2016). You can read the 2015 Update in serial form in the posts that follow, or you can download the entire update as a pdf file from ssrn.com. Suggested citation: Matthew Sag, IP Litigation in United States District Courts: 2015 Update (January 5, 2016), available at SSRN: http://ssrn.com/abstract=2711326.

The previous post discussed the overall state of copyright, patent and trademark litigation; this post addresses the new data on copyright litigation and the John Doe phenomenon.

2. Copyright Litigation and the John Doe Phenomenon

Figure 2 (above), shows how the dramatic increase in copyright litigation from 2010 to 2015 is almost exclusively attributable to litigation against anonymous Internet file sharers. The figure shows the number of copyright cases including (dashed red line) and excluding (solid red line) cases filed against John Doe defendants. As discussed in Copyright Trolling, An Empirical Study, and IP Litigation in United States District Courts: 1994 to 2014, the rise of Internet filesharing has transformed copyright litigation in the United States. Federal district courts are currently inundated with copyright owner lawsuits against “John Doe” or “unknown” or otherwise unidentified defendants. Figure 3 (below) tracks the occurrence of these John Doe lawsuits from 1994 through 2015. These John Doe lawsuits are almost exclusively related to allegations of illegal filesharing, which explains why they were virtually non-existent prior to 2004.

Figure 3: Copyright Cases Filed in U.S. District Courts (1994—2015)


Source: Administrative Office of the U.S. Courts, PACER records, 1994—2015.

John Doe litigation has not been a static phenomenon. The current era of BitTorrent monetization began in 2010 with a handful of cases filed against large numbers of IP addresses, sometimes more than 5000. Filing suits in this way enabled plaintiffs to economize on filing fees, but courts have become significantly more skeptical of the legality and desirability of mass joinder in BitTorrent cases. Based on the data from 2015, it seems that the era of mass joinder is almost completely over. As seen in Table 2 (below), in 2010 the average number of John Doe defendants per suit was over 560; by 2014 it was just over 3 and in 2015 it was barely over 2. As Table 2 also shows, although pornography still dominates John Doe litigation, the phenomenon is becoming increasingly mainstream.

Table 2 John Doe Copyright Cases 2010—2015


John Doe and pornography cases identified by the author. In 2015 John Doe litigation made up almost 58% of the federal copyright docket (2930 cases out of 5076). One of the most interesting aspects of this phenomenon is that it is driven by such a small number of plaintiffs and, based on my random sampling of the underlying complaints, an even smaller number of lawyers.

Table 3 shows the five most prolific John Doe litigants for each year from 2011 to 2015. In 2015 Malibu Media was still the most significant individual copyright plaintiff in the US; in fact, it filed more suits than ever last year. However, Malibu Media accounted for a smaller percentage of copyright lawsuits in 2015 because other plaintiffs also ramped up their efforts. Malibu Media accounted for 41.5% of all copyright suits in the US in 2014, and just over 39% in 2015. Malibu Media is represented by Michael Keith Lipscomb of Lipscomb, Eisenberg & Baker, PL. Lipscomb also represents two of the other plaintiffs on the top five list for last year (Manny Film and Plastic The Movie Limited) as well as two of the top five from 2014 (Good Man Productions, Inc. and Poplar Oaks, Inc.). The filing fee for opening a civil action in US district courts is now $400, so that means that plaintiffs associated with Mr Lipscomb have paid at least $936,800 in filing fees over the last year. Given the scale of this enterprise it seems reasonable to infer that Lipscomb and his clients have found a way to effectively monetize online infringement.
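The arithmetic behind the docket share and the filing fee total can be checked in a few lines. This is a sketch using only the numbers quoted in the text; the suit count it derives is implied by the $400 fee and the $936,800 total rather than stated in the post, and the variable names are illustrative.

```python
# Figures quoted in the text.
john_doe_cases = 2930        # John Doe copyright cases filed in 2015
all_copyright_cases = 5076   # all federal copyright cases filed in 2015
filing_fee = 400             # dollars per civil action
fees_paid = 936_800          # dollars attributed to Lipscomb-associated plaintiffs

# Share of the copyright docket made up of John Doe cases.
john_doe_share = john_doe_cases / all_copyright_cases
print(f"John Doe share of copyright docket: {john_doe_share:.1%}")

# Number of suits implied by the fee total (derived here, not stated in the post).
implied_suits = fees_paid // filing_fee
print(f"Suits implied by the fee total: {implied_suits}")
```

The share comes out just under 58%, matching the "almost 58%" in the text, and the fee total divides evenly by the $400 fee.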

Table 3 Top Five Copyright John Doe Plaintiffs 2011—2015


End of post

(1/4) 2015 Data on IP Litigation in US District Courts

Introduction to the 2015 Update

This is the first in a series of posts discussing the 2015 Update to the data in my forthcoming article, IP Litigation in United States District Courts: 1994 to 2014 (Iowa Law Review, forthcoming 2016).

My original piece, IP Litigation in United States District Courts, undertakes a broad-based empirical review of Intellectual Property (IP) litigation in United States federal district courts from 1994 to 2014. This brief update extends that data to include the year 2015. For a detailed discussion of the data sources and methods, please refer to the original article.

This update contains new data on

  1.  the overall state of copyright, patent and trademark litigation,
  2. copyright litigation and the John Doe phenomenon,
  3. the continuation of the patent litigation explosion and
  4. the geographic distribution of copyright, patent and trademark litigation.

You can read the 2015 Update in serial form in the posts that follow, or you can download the entire update as a pdf file from ssrn.com. Suggested citation: Matthew Sag, IP Litigation in United States District Courts: 2015 Update (January 5, 2016), available at SSRN: http://ssrn.com/abstract=2711326.

1. Copyright and Patent Litigation Increasing, Trademark Litigation Declining

The two most important trends in IP litigation over the last five years have been the extraordinary increase in patent and copyright litigation and the corresponding relative decline of federal trademark litigation. These trends continued in 2015.

Figure 1: Copyright, Patent and Trademark Filings 1994—2015 (Percent)


Twelve month moving average of percent of Federal IP litigation. Source: Administrative Office of the U.S. Courts, PACER records, 1994—2015.

Figure 1 (above) shows the relative proportions of copyright, patent and trademark cases filed, based on a 12 month moving average between 1994 and 2015. As the figure makes plain, the relative shares of copyright, patent and trademark have fluctuated quite significantly over the period. The proportion of patent cases in the federal IP docket did not change significantly from 2014 to 2015 (39% in 2014, 41% in 2015); however, copyright increased significantly from 32% to 37% and this was matched by a corresponding decline in trademark litigation from 29% to 22%. Complete year-by-year data from 1994 to 2015 are presented in Table 1 (below).
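For readers unfamiliar with the smoothing used in these figures, the following is a minimal sketch of a 12 month moving average. It assumes a trailing window (each point averages the current month and the eleven before it); the post does not say whether the window is trailing or centered, and the monthly counts below are made up purely for illustration.

```python
def trailing_moving_average(values, window=12):
    """Average each run of `window` consecutive values, oldest first."""
    averages = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        averages.append(sum(chunk) / window)
    return averages

# Hypothetical monthly filing counts for 14 months (not real data).
monthly_filings = list(range(1, 15))
smoothed = trailing_moving_average(monthly_filings)
print(smoothed)
```

With 14 months of input and a 12 month window there are three smoothed points; month-to-month spikes are averaged away, which is why the plotted lines look less jagged than raw monthly counts would.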

Table 1: Copyright, Patent and Trademark Filings 1994—2015


The increase in copyright and patent litigation can be seen even more clearly in Figure 2 (below) which shows the raw number of cases filed for each type of IP (displayed as a 12 month moving average).

Figure 2 Copyright, Patent and Trademark Filings 1994—2015 (Cases)


Twelve month moving average of cases filed. Source: Administrative Office of the U.S. Courts, PACER records, 1994—2015.

End of post

Second Circuit clears the last hurdle for Google Book Search

The Second Circuit ruled today that, in its present form, the library digitization that Google began over ten years ago does not infringe US copyright law. Although this decision was entirely predictable given the court’s ruling in the related HathiTrust litigation, it is nonetheless momentous. Judge Leval’s cogent explanation of the law and the facts is an exemplary piece of legal writing. The decision is available here (AG v Google October 16, 2015) and merits careful reading.

This is a great win for Google, but more importantly, it confirms a balanced approach to copyright law that will ultimately benefit authors, researchers, the reading public and the developers of new forms of information technology.

I have written several law review articles on the issues raised in this case: Orphan Works as Grist for the Data Mill, 27 Berkeley Technology Law Journal 2012, The Google Book Settlement and the Fair Use Counter-factual, and Copyright and Copy-Reliant Technology, 103 Northwestern University Law Review 1607–1682 (2009). However, I believe that it was only when I teamed up with Matthew Jockers (a professor of English literature) and Jason Schultz (a law professor with deep experience in public interest litigation in addition to expertise in copyright) to write the amicus Brief Amicus Curiae of Digital Humanities and Law Scholars that my work truly became influential. The court did not cite any amicus briefs in the case, but they were cited in the district court case and the related HathiTrust cases. Reading Judge Leval’s decision, I think it is clear that the excellent briefing by Google’s lawyers and the many public interest groups who contributed was helpful and influential.

This case is a great victory for the public interest; it is also a great illustration of how a deep commitment to scholarship complements law school clinical programs and helps us serve the public interest.

Some cool graphs from my paper on IP litigation in US district courts

I have just revised my article, IP Litigation in US District Courts: 1994 to 2014, which will be published in Volume 101 of the Iowa Law Review next year. (You can download the article from SSRN now.) This post does not attempt to summarize the full article; it focuses instead on explaining some of the more interesting graphs and data visualizations in the article.

Copyright, Patent and Trademark Filings as a percentage of all IP 1994-2014

This data is presented as a 12 month moving average.

Copyright, Patent and Trademark Filings 1994—2014 (Percent)

 

Copyright, Patent and Trademark Filings (number of cases) 1994—2014

Again, this data is presented as a 12 month moving average. The difference between the dashed red line and the solid red line clearly shows the impact of lawsuits against anonymous internet file sharers.

Copyright, Patent and Trademark Filings 1994—2014 (Cases)

 

Copyright Cases 1994—2014, RIAA End-User Litigation, BitTorrent Monetization and Copyright Trolling

The impact of the current wave of copyright trolling is pretty clear.

Copyright Cases Filed in U.S. District Courts (1994—2014)

 

9 out of 10 ‘copyright trolling’ cases are about pornography

As you can see from the table, the number of John Does per suit has declined because courts have become far more skeptical of mass joinder, but that has just led to more suits being filed.

[Table image]

 

One pornography company accounts for 80% of Copyright John Doe lawsuits filed in 2014 #CopyrightTrolling

In fact, the pornography producer Malibu Media is such a prolific litigant that in 2014 it was the plaintiff in over 41.5% of all copyright suits nationwide. John Doe litigation is not a general response to Internet piracy; it is a niche entrepreneurial activity in and of itself.

[Edited at 4:17pm. The missing * for AF Holdings has been added]

 

[Table image]

1/2 The patent litigation explosion is not exactly as it appears, compare suits filed to #defendants.

At first glance it looks like the annual volume of patent litigation in the United States doubled in the 16 years from 1994 until 2010. In the three years from 2010 to 2013 it doubled again.

US Patent Litigation Filings, 1994–2014

 

2/2 The patent litigation explosion is not exactly as it appears, compare suits filed to #defendants.

The real trend in patent litigation over the past two decades can be seen in the number of defendants filed against. The bar chart at the bottom of the next figure shows the same filing data as in the figure above. The scatter plot in the figure below shows the estimated number of defendants. Although it appears that the number of patent cases filed exploded after 2010, looking at the estimated number of defendants, it becomes clear that the period from 2010 to 2013 was more or less a continuation of the existing trend.

Patent Cases Filed and Estimated Number of Defendants, 1994—2014

There is something wrong with the ED of Texas. Average Number of Patent Defendants per Filing 1994—2014

This figure shows the estimated number of defendants per suit for the nine most popular federal districts from 1994 to 2014 and also for an aggregation of all other districts. The vertical dashed line is set to 2011 to mark the passage of the America Invents Act. It is starkly apparent that the trend toward more defendants is greatest in the Eastern District of Texas. The estimated number of defendants per suit in the Eastern District of Texas climbs steeply from 1.66 in 1994 to 12.37 in 2010 and then drops precipitously down to 1.99 in 2014.

Average Number of Patent Defendants per Filing 1994—2014

 

What does all this mean? To me, it suggests that there was not exactly a “Troll Fueled Patent Litigation Explosion” between 2010 and 2012. Once you take into account the procedural changes brought into effect in 2011 by the AIA and focus on the number of defendants rather than the number of suits, it seems that there was a significant troll fueled increase in the rate of patent litigation; it is just that this increase started earlier and proceeded more smoothly than the simple case filing data suggests. I refer to this revised narrative as the Troll Fueled Patent Litigation Inflation.
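A small worked example shows why counting suits rather than defendants can exaggerate the change. The sketch below holds the number of defendants fixed at a hypothetical 1,000 (an assumption chosen only for illustration) and applies the Eastern District of Texas defendants-per-suit estimates quoted above for 2010 and 2014.

```python
# Hypothetical: the same pool of defendants is sued in both years.
defendants = 1000

# Estimated defendants per suit in the Eastern District of Texas (from the text).
per_suit_2010 = 12.37
per_suit_2014 = 1.99

# Suits needed to reach the same defendants under each joinder pattern.
suits_2010 = defendants / per_suit_2010
suits_2014 = defendants / per_suit_2014

# How much the filing count grows with no change in the number of defendants.
filing_inflation = suits_2014 / suits_2010
print(f"2010: about {suits_2010:.0f} suits; 2014: about {suits_2014:.0f} suits")
print(f"Filings multiply by about {filing_inflation:.1f}x")
```

Under these assumptions the number of filings grows roughly sixfold even though no additional defendants are sued, which is the sense in which a change in joinder practice can look like a litigation explosion in the raw filing data.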

District Rankings, Copyright Compared to Trademark (2010-2014)

This figure focuses your attention on the outliers, but the general story is that copyright and trademark litigation are highly correlated at a district court level.

District Rankings, Copyright Compared to Trademark (2010-2014)

Regional Variation in Patent Litigation – Evidence of Forum Selling

The popularity of the Eastern District of Texas as a forum for patent litigation is a well-known phenomenon. However, the data and analysis presented in this study provide a new way of looking at the astonishing ascendancy of this district and the problem of forum shopping in patent law more generally. The extent of forum shopping in patent law can be seen by comparing the geographic distribution of patent litigation to that of copyright and trademark. This figure illustrates district rank in terms of patent versus a combined copyright and trademark ranking for cases filed between 2010 and 2014.

District Rank in terms of Patent versus Copyright and Trademark Combined (2010-2014)

District Court Ranks for Patent Litigation 1994-2014

This is crazy!

My paper explains how we got here and summarizes the excellent work of Jonas Anderson in a new paper titled ‘Court Competition for Patent Cases’, and Daniel Klerman and Greg Reilly in ‘Forum Selling’, each of which goes into even more detail.

District Court Ranks for Patent Litigation 1994-2014

 

The first thing to note about this figure is that, but for the Eastern District of Texas and Delaware, the geographic distribution of patent litigation over the past two decades would look remarkably stable. For most of the last 21 years, the Central District of California was the most important venue for patent litigation, followed by the Northern District of California. The Northern District of Illinois has also ranked consistently somewhere between second and sixth over the same period. This relative stability contrasts markedly with the steady gains made by Delaware and the remarkable ascendancy of the Eastern District of Texas between 1994 and 2014. Notice that, were it not for the Eastern District of Texas, the scale on Figure 11 would range from 10 to 1, rather than 50 to 1. Framed accordingly, the steady ascent of Delaware from 9th in 1994 to 2nd from 2011 to the present day would be more noteworthy. However, the rise of the Eastern District of Texas from literal obscurity (it only saw 8 patent cases in 1994) to preeminence over the same period dwarfs all other changes.

Some thoughts on Fair use, Transformative Use and Non-Expressive Use

Fair use, Transformative Use and Non-Expressive Use

Or,

Campbell v. Acuff-Rose and the Future of Digital Technologies, notes on a short presentation at the Fair Use In The Digital Age: The Ongoing Influence of Campbell v. Acuff-Rose’s “Transformative Use Test” Conference, April 17 & 18, 2015, University of Washington School of Law.

Copyright and disintermediation technologies

Copyright policy was hit by an analog wave of disintermediation technology in the post-war era and a digital wave of disintermediation technologies beginning in the 1990s. These successive waves of technology have forced us to reevaluate the foundational assumption of copyright law; that assumption being that any reproduction of the work should be seen as an exchange of value passing from the author (or copyright owner) to the consumer.

Technologies such as the photocopier and the videocassette recorder and then later the personal computer significantly destabilized copyright policy because these inventions, for the first time, placed commercially significant copying technology directly in the hands of large numbers of consumers. This challenge has only been accelerated by digitalization and the Internet. Digitalization allows for perfect reproduction such that the millionth copy of an MP3 file sounds just as good as the first copy.

The implications of the copying that these devices enabled were not clear-cut. In some cases, the new copying technology simply enabled greater flexibility in consumption; in others it generated new copies to be released into the stream of commerce as competitors with the author’s original authorized versions. The Internet has connected billions of people together leading to an outpouring of creativity and user-generativity, but from the perspective of the entertainment industry it has also brought people together to undertake piracy on a massive scale.

The significance of Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569 (1994)

The Supreme Court in Sony v. Universal[1] had already shown that it was willing to apply fair use in a flexible manner in situations where the use was personal and immaterial to the copyright owner. The significance of the Court’s decision in Campbell[2] was that, by reorienting the fair use doctrine around the concept of transformative use, the Court prepared the way for a flexible consideration of technical acts of reproduction that do not have the usual copyright significance.

Internet search engines, plagiarism detection software, text mining software and other copy-reliant technologies do not read, understand, or enjoy copyrighted works, nor do they deliver these works directly to the public.  They do, however, necessarily copy them in order to process them as grist for the mill, raw materials that feed various algorithms and indices. Campbell arrived just in time to provide a legal framework far more hospitable to copy-reliant technology than had previously existed. Even in its broadest sense, transformative use is not the be all and end all of fair use. At the risk of over-simplification, Sony v. Universal safeguarded the future of the mp3 player, whereas Campbell secured the future of the Internet and reading machines.

Copy-reliant technology and non-expressive use

Some of the most important recent technological fair use cases can be summarized as follows: Copying that occurs as an intermediate technical step in the production of a non-infringing end product is a ‘non-expressive’ use and thus ordinarily constitutes fair use.[3] The main examples of non-expressive use I have in mind are the construction of search engine indices,[4] the operation of plagiarism detection software[5] and, most recently, library digitization to make paper books text-searchable.[6]

To have a coherent concept of fair use, or any particular category of fair use, one needs a coherent concept of copyright. As expressed in the U.S. Constitution, copyright’s motivating purpose is “to promote the Progress of Science and useful Arts.”[7] Ever since the Statute of Anne in 1710, the purpose of copyright law has been to encourage the creativity of authors and to promote the creation and dissemination of works of authorship. Copyright is not a guarantee of total control; in general, the copyright owner’s rights are limited and defined in reference to the communication of the expressive aspects of the work to the public. This is evident in the idea-expression distinction, the way courts determine whether two works are substantially similar, and the focus of fair use cases on expressive substitution. Thus, subsequent authors may not compete with the copyright owner by offering her original expression to the public as a substitute for the copyright owner’s work, but they are free to compete with their own expression of the same facts, concepts and ideas. They are also free to expose, criticize and even vilify the original work. Genuine parodies, critiques and illustrative uses are fair use so long as the copying they partake in is reasonable in light of those purposes.

If public communication and expressive substitution are rightly understood as copyright’s basic organizing principles, then it follows that non-expressive uses (i.e., uses that involve copying, but don’t communicate the expressive aspects of the work to be read or otherwise enjoyed) must be fair use. In fact, they are arguably the purest essence of fair use. Grokking the concept of non-expressive use simply involves taking the well-understood distinction between expressive and non-expressive works and making the same distinction in relation to potential acts of infringement.

The legal status of actual copying for nonexpressive uses was not a burning issue before digital technology. Outside the context of reading machines like search engines, plagiarism software and the like, courts have quite reasonably presumed that every copy of an expressive work is for an expressive purpose. But this assumption no longer holds. At a minimum, preserving the functional force of the idea-expression distinction in the digital context requires that copying for purely non-expressive purposes, such as the automated extraction of data, should not be infringing.

Some limits to the non-expressive use framework

Non-expressive use is a sufficient but not necessary condition of fair use. For example, parody is an expressive use, but it is fair use because it does not tend to threaten expressive substitution. Even within the realm of recent technology cases, non-expressive use is not the right framework for addressing important man-machine interaction questions such as disability access, also a key issue in the HathiTrust litigation, but it does tie together a number of disparate threads.

The cases which hold that software reverse engineering is fair use are grounded firmly in the idea-expression distinction,[8] but they are not exactly non-expressive use cases for the reasons that follow.[9] The non-expressive use framework is also not the right tool in cases where software is copied in order to access its functionality: after all, software is primarily functional and its primary (perhaps exclusive) value comes from the function it performs. Software piracy can’t be justified as a non-expressive use, because to do so would defeat the statutory scheme wherein Congress chose to graft computer software protection onto copyright. However, the reverse engineering cases still follow the logic of non-expressive use. In those cases, copying to access certain APIs and other unprotectable elements enabled the copyists to either independently recreate that functionality (akin to conveying the same ideas with different expression) or to develop programs or machines that would complement the original software.

Non-expressive use versus transformative use?

The main issue left to resolve in terms of copy-reliant technology and non-expressive use seems to be one of nomenclature. Is non-expressive use simply a subset of transformative use? Or is it a separate species of fair use with similar implications to that of transformative use?

Non-expressive use, as I have defined and elucidated it in a series of law review articles and amicus briefs, is a clear, coherent concept that ties a broad set of fair use cases directly to one of copyright’s core principles, the idea-expression distinction. Transformative use, as explained by Pierre Leval and adopted by the Supreme Court, is rooted in the constitutional imperative for copyright protection: the creation of new works and the promotion of progress in culture, learning, science and knowledge. But for all that, if transformative use is invoked as an umbrella term, it is often hard to see what holds the category together.

The Campbell Court did not posit transformative use as a unified, exhaustive theory, but it did say that “[a]lthough such transformative use is not absolutely necessary for a finding of fair use, the goal of copyright, to promote science and the arts, is generally furthered by the creation of transformative works. Such works thus lie at the heart of the fair use doctrine’s guarantee of breathing space within the confines of copyright, …”[10] No doubt, when the Supreme Court spoke of transformative use, it had various communicative and expressive uses, such as parody, the right of reply, public comment and criticism in mind. But since Campbell, lower courts have applied the same purposive interpretation of copyright to a broader set of challenges. Campbell was decided in a different technological context and it is true that many of today’s technological fair use issues were entirely unimaginable before the birth of the World Wide Web and our modern era of big data, cloud computing, social media, mobile connectivity and the “Internet of Things”.

Non-expressive use is a useful concept because it provides a way for courts to recognize the legitimacy of copying that is inconsequential in terms of expressive substitution, but does not necessarily lead to the creation of the type of new expression that the Supreme Court had in mind in Campbell. The use of reading machines in digital humanities research is easy to justify, both in terms of the lack of expressive substitution and in the obvious production of meaning, new insights and potentially new and utterly transformative works of authorship. But what of less generative non-expressive uses? For example, in the future a robot might ‘read’ a copyrighted poster on a subway wall advertising a rock concert in Central Park. The robot might then ‘decide’ to change its travel plans in light of the predictable disruption. The acts of ‘reading’ and ‘deciding’ are both simply computational. Even if reading involves making a copy of the work inside the brain of a machine, it seems nonsensical to conclude that the robot was used to infringe copyright. In the age of the printing press, copying a work had clear and obvious implications. Copying was invariably for expressive ends and it was almost always the point of exchange of value between author and reader. The copyright implications of copying are much more contingent in the digital age.

There is much clarity to be gained by talking directly in terms of non-expressive use rather than relying on transformative use as a broad umbrella for a range of expressive and non-expressive fair uses. Such clear thinking would hopefully ease the anxieties of an entertainment industry that still fears that fair use is simply a stalking horse for dismantling copyright. Nonetheless, it would not be surprising if courts were more comfortable sticking with the language of transformativeness that Judge Pierre Leval gave us in “Toward a Fair Use Standard”[11] and the Supreme Court adopted in Campbell.

This is a sketch of some ideas; no doubt revisions will follow after this exciting conference.

Related Publications:

Matthew Sag, Copyright and Copy-Reliant Technology, 103 Northwestern University Law Review 1607–1682 (2009)

Matthew Sag, Orphan Works as Grist for the Data Mill, 27 Berkeley Technology Law Journal 1503–1550 (2012)

Matthew Jockers, Matthew Sag & Jason Schultz, Digital Archives: Don’t Let Copyright Block Data Mining, 490 Nature 29-30 (October 4, 2012)

Somewhat Related Publications:

Peter DiCola & Matthew Sag, An Information-Gathering Approach to Copyright Policy, 34 Cardozo Law Review 173–247 (2012)

Matthew Sag, Predicting Fair Use, 73 Ohio State Law Journal 47–91 (2012)

Matthew Sag, The Pre-History of Fair Use, 76 Brooklyn Law Review 1371–1412 (2011)

 

[1] Sony Corp. of America v. Universal City Studios, Inc., 464 U.S. 417 (1984).

[2] Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569 (1994).

[3] See generally Matthew Sag, Copyright and Copy-Reliant Technology, 103 Northwestern University Law Review 1607–1682 (2009).

[4] There is no case addressing the legality of the process of making a text-based search index (as opposed to caching or display of search results), but the proposition naturally flows from Kelly v. Arriba Soft Corp., 336 F.3d 811 (9th Cir. 2003) and Perfect 10, Inc. v. Amazon.com, Inc., 508 F.3d 1146 (9th Cir. 2007) and is a necessary implication of Authors Guild, Inc. v. HathiTrust, 755 F.3d 87 (2d Cir. 2014) and Authors Guild, Inc. v. Google Inc., 954 F. Supp. 2d 282 (S.D.N.Y. 2013).

[5] A.V. ex rel. Vanderhye v. iParadigms, LLC, 562 F.3d 630 (4th Cir. 2009).

[6] Authors Guild, Inc. v. HathiTrust, 755 F.3d 87 (2d Cir. 2014); Authors Guild, Inc. v. Google Inc., 954 F. Supp. 2d 282 (S.D.N.Y. 2013). See also Matthew Sag, Orphan Works as Grist for the Data Mill, 27 Berkeley Technology Law Journal 1503–1550 (2012); Matthew Jockers, Matthew Sag & Jason Schultz, Digital Archives: Don’t Let Copyright Block Data Mining, 490 Nature 29–30 (October 4, 2012).

[7] U.S. Const. art. I, § 8, cl. 8.

[8] Sega Enter. Ltd. v. Accolade, Inc., 977 F.2d 1510 (9th Cir. 1992); Sony Computer Entm’t, Inc. v. Connectix Corp., 203 F.3d 596, 606 (9th Cir. 2000).

[9] These reasons are more fully elaborated in Matthew Sag, Copyright and Copy-Reliant Technology, 103 Northwestern University Law Review 1607–1682 (2009).

[10] Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569, 579 (1994) (citation omitted).

[11] 103 Harv. L. Rev. 1105 (1990).