The order is here. It is likely that the rest of the case will be put on hold while this question is addressed, and yet the HathiTrust litigation is rolling on. The Authors Guild, which represents only 8500 authors, is trying to cash a billion-dollar check (they hope) on behalf of a class of millions. Class action lawyers and copyright lawyers will be watching this case very closely.
A new study by Patricia Aufderheide, Peter Jaszi, Katie Bieze and Jan Lauren Boyles explores the problems that journalists face in engaging with copyright and employing the doctrine of fair use. The study was based on open-ended interviews with 80 journalists across a range of media platforms.
The good news is that in situations that journalists have been dealing with for years, their collective intuitions about good practice map pretty closely onto the fair use case law. This is striking because the journalists appear to know very little about copyright law or the ins and outs of fair use. Sometimes the journalists quoted in the study were comically wrong:
“When somebody dies, … it’s in the public domain” … “If you can find it on the Web, then anybody can use it, and anybody can take it.”
So how do journalists get it right? The journalists' mission to report the news, space constraints, and norms of attribution and originality all lead journalists to use their source material transformatively and to limit that use to what is necessary. As the authors note:
“They routinely asked themselves if they were merely appropriating information in order to avoid work, or whether they were repurposing that information in a way matched to their mission to inform the public.”
More troubling is the finding that in new media situations journalists did not understand their fair use rights. In this context,
“interviewees were often unable to make a timely decision or justify it to a gatekeeper. They operated from risk analysis, without knowledge of actual risk or of their actual rights.”
The study finds that, when in doubt, journalists routinely self-censor, causing delays, increasing costs, and even failing in their journalistic mission.
There is an excellent summary of the district court’s decision from May 2012 over at James Grimmelmann’s blog.
The abbreviated story is that in April 2008 three publishers (Cambridge University Press, Oxford University Press, and SAGE Publishers) brought suit against Georgia State in relation to the school’s electronic reserve policy. The suit was backed by the Association of American Publishers and the Copyright Clearance Center (CCC), a licensing company.
This was a complicated case, not least because the publishers had a surprisingly hard time showing they owned the relevant copyrights. Judge Orinda Evans of the U.S. District Court in Atlanta handed down a 350-page decision in May. Of all the books the plaintiffs said were infringed, many did not even make it to a fair use analysis because the only copying that had taken place was to confirm the files were on reserve.
In terms of fair use, the highlights of the decision were:
(1) The educational purpose of the use favored the defendant.
(2) The informational nature of the works favored the defendant.
(3) The third fair use factor, the amount and substantiality of the portion copied, turned out to be quite interesting.
- The court rejected the Classroom Guidelines as incompatible “with the language and intent of § 107” and suggested its own quantitative test: 10% of the total page count for books of nine chapters or fewer and one chapter for longer books. Going above this limit is not fatal to the defendant, but staying below it is highly favorable.
(4) In terms of the fourth factor, the effect of the defendant’s use on the market for or value of the plaintiff’s work, the court found that this favored the plaintiffs where digital licensing was available through the CCC.
- But in the majority of cases the fourth factor still favored Georgia State because there was “no evidence in the record to show that digital excerpts from this book were available for licensing” as of the date of infringement. Note that photocopying licenses were not seen to be a close substitute for digital reserve licenses. Another important piece of context here is that students would not have bought the assigned books as a substitute for the excerpts posted on the e-reserve system. Thus, no harm, no foul. Consistent with a “market failure” analysis, in the context of orphan works this suggests a broad scope for fair use.
Only five of the 74 course reserve listings fell on the wrong side of this fair use analysis.
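The court's quantitative test for the third factor is simple enough to write down as a rule of thumb. The sketch below is only an illustration of the thresholds as summarized above, not the court's own formulation: the function name and its parameters are hypothetical, and it ignores the other three factors entirely. Recall too that exceeding the limit is not fatal, so a False result means only that factor three is less favorable.

```python
def within_factor_three_limit(total_pages, chapter_count, pages_copied, chapters_copied):
    """Rough check against the quantitative test described in the post.

    Hypothetical helper: the names and the exact modelling are assumptions.
    - Books of nine chapters or fewer: up to 10% of the total page count.
    - Longer books: up to one chapter.
    """
    if chapter_count <= 9:
        # Integer comparison avoids floating-point rounding at the 10% boundary.
        return pages_copied * 10 <= total_pages
    return chapters_copied <= 1


# Example: a 300-page book with 8 chapters, 30 pages posted on e-reserve.
print(within_factor_three_limit(300, 8, 30, 2))

# Example: a 400-page book with 12 chapters, two chapters posted.
print(within_factor_three_limit(400, 12, 60, 2))
```

Staying inside the limit does not settle the fair use question on its own; in the decision, the fourth factor still did much of the work.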
The Update – No Injunction
As reported in the Chronicle of Higher Education, on Friday afternoon, Judge Evans issued an order denying the plaintiffs’ request for injunctive and declaratory relief. The only remedy the publishers got was an order that the defendants fine-tune their copyright policy to make it “not inconsistent” with the Judge’s ruling.
The Judge determined that Georgia State University was, on balance, the prevailing party (they won 69:5 after all), and was thus entitled to “reasonable attorney’s fees”.
This is a massive vindication for Georgia State University. The institution was depicted as a copyright outlaw by the plaintiffs, but the court was “convinced that defendants did try to comply with the copyright laws,” and mostly succeeded. It is also further real-world evidence that, with the right legal advice, fair use can be somewhat predictable. See my article on Predicting Fair Use for an empirical study to this effect.
In “The Orphans, the Market, and the Copyright Dogma” Ariel Katz notes that extended collective licensing (ECL) proposals will do nothing to solve the underlying orphan works problem. Like “Indulgences”, ECL solutions merely absolve the “sin” of using works without permission, but actually do nothing to pay the absent owners.
In “How Fair Use Can Help Solve the Orphan Works Problem” Jennifer Urban does a great job of explaining how the rest of us have under-analyzed the second fair use factor in relation to library digitization. She points out that the Senate Report on the 1976 Copyright Act says directly that market availability is part of the nature of the work.
In my own paper “Orphan Works as Grist for the Data Mill” I explain why copyright does not stand in the way of nonexpressive uses. My argument is that, just as the distinction between expressive and nonexpressive works is well recognized, the same distinction should generally be made in relation to potential acts of infringement.
Copying for purely nonexpressive purposes, such as the automated extraction of data, should not be regarded as infringing. Automated reproduction for nonexpressive uses (such as search engines, plagiarism detection, and macro-literary analysis) does not communicate the author’s original expression to the public, there is no expressive substitution, and thus there is no infringement. For more on Copyright and Copy-Reliant Technology, read my 2009 article of the same name.
Now that the Google Book Settlement is well and truly dead, attention is turning back to the underlying legal controversy. There are many issues in Authors Guild v. Google and the parallel case of Authors Guild v. HathiTrust, but the main one is simple. Does copying books so that computers can analyze them infringe copyright even if no one ever reads that copy?
If the answer is yes, then, through the magic of class action law, the Authors Guild gets to sue Google for a minimum of $750 x several million books. Who would get these billions of dollars is unclear.
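To see how quickly the numbers reach the billions, here is a back-of-the-envelope sketch. The $750 per-book minimum is the figure given above; the book counts are placeholder assumptions chosen only to show the scale, since "several million" is not a precise number, and the function name is hypothetical.

```python
# Minimum statutory damages per work, in dollars (the figure used in the post).
MIN_STATUTORY_DAMAGES_PER_WORK = 750


def minimum_exposure(book_count):
    """Lower bound on statutory damages if every book counts as one infringed work."""
    return MIN_STATUTORY_DAMAGES_PER_WORK * book_count


# Hypothetical class sizes, for illustration only.
for book_count in (1_000_000, 4_000_000):
    print(f"{book_count:,} books -> at least ${minimum_exposure(book_count):,}")
```

Even at the statutory minimum, a class covering a few million books puts the claim in the billions, which is why the threshold question matters so much.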
If the answer is no, then the Authors Guild would have to point to instances where Google has made a nontrivial portion of a book available to the public without permission or justification such as fair use. There might be one or two of these, but I think Google won’t lose sleep about statutory damages for a handful of books.
I recently wrote an amicus brief, along with Matthew Jockers (Assistant Professor of English at the University of Nebraska, Lincoln) and Jason Schultz (Assistant Clinical Professor of Law; Faculty Co-Director, Samuelson Law, Technology & Public Policy Clinic), arguing that such non-expressive use is fair use, i.e., that text-mining is not copyright infringement.
More than 60 professors and researchers in the digital humanities joined our brief because, as we said:
“If libraries, research universities, non-profit organizations, and commercial entities like Google are prohibited from making non-expressive use of copyrighted material, literary scholars, historians, and other humanists are destined to become 19th-centuryists; slaves not to history, but to the public domain. History does not end in 1923. But if copyright law prevents Digital Humanities scholars from using more recent materials, that is the effective end date of the work these scholars can do.”
This is what is at stake.
Counsel for the Authors Guild have asked the court to deny our motion for leave to participate as amici in the case of Authors Guild v. Google.
On Friday August 3, 2012, the Association for Computers and the Humanities and a group of 64 scholars from disciplines including law, computer science, linguistics, history and literature filed an amicus brief on behalf of the Digital Humanities urging the court in Authors Guild v. Google to grant summary judgment in favor of the defendant.
In its 10-page memorandum in opposition the Guild argues that “It is inappropriate for these entities to inject themselves into private litigation.” This seems a bit rich given that the Authors Guild, a group of some 8500 authors, is trying to assert the right to say no to digitization of over 20 million books. That is leverage on a ratio of more than 2000:1. The Guild is trying to set a legal precedent that would render text-mining without individual permission in any context unlawful. Digital Humanities scholars should not be relegated to studying literature prior to 1923.
This case is not a private arbitration. It will establish an important precedent that either confirms or undermines the legitimacy of search engine technology, plagiarism detection software and computerized analysis of text.
The Guild says that our brief simply argues Google’s case and does not have anything to add. Yet at the same time they complain that the digital humanities scholars seek to inform the court about “text-mining and computation analysis”.
The Guild also argues that our legal argument that non-expressive use should be fair use is really just a disguised expert opinion. No doubt, if I were deposed as an expert witness they would complain that my views were just legal argument in disguise.
The digital humanities brief is not, as the Guild contends, asking for an advisory opinion. The brief alerts the court to the important implications of its ruling and highlights what the Authors Guild tries to obfuscate: this case is much bigger than Google. It is about the future of humanities scholarship.
The Authors Guild argues that it is not fair that the digital humanities scholars (and another brief filed by the American Library Association) will each add another 26 pages to their workload. 1056 documents have been filed in this case! It is hard to see the burden of another 26 pages.
Judge Denny Chin (Southern District of New York) is scheduled to hear the parties’ motions for summary judgment on October 9, 2012.
Rebecca Tushnet had some interesting things to say at IPSC today about the multiple meanings of Barbie®. She makes an excellent point that trademark defenses will only be accessible to people without lawyers (i.e., most people) if they are based on balancing tests or a rule of reason analysis (like copyright fair use) rather than the current maze of factors, hurdles and hierarchies. See http://en.wikipedia.org/wiki/Barbie#Parodies_and_lawsuits for a summary of several Barbie-related copyright and trademark controversies.