The Multiple Copy Argument – Some thoughts on #fairuse and Authors Guild v. Hathitrust (pt4)

Introduction and Necessary Disclaimer 

This is one of a series of posts concerning the Authors Guild v. Hathitrust case. Specifically, these posts take the form of commentary on the Authors Guild Appeal Brief (February 25, 2013). The views expressed on this site are purely my own.

Today’s topic …

The Multiple Copy Argument

The Authors Guild Appeal Brief contains an interesting argument that is hard to summarize with perfect fidelity because it appears in so many places throughout the document (illustrations to follow). Essentially, the plaintiffs now appear to argue that even if some copying would be allowed for certain library digitization purposes, the defendants created too many copies and that these copies, or their retention, exceed the parameters of any fair use claim.

Examples from the Authors Guild Appeal Brief

The multiple copy argument first appears in the plaintiffs’ Statement of Issues Presented:

“3. Did the District Court err by failing to recognize that the Libraries’ online storage of multiple copies of the unauthorized digital library goes far beyond what is necessary to accomplish any transformative purpose of the MDP?” (Authors Guild Ap. Br. page 4)

However, it also appears on pages 8, 9-10, 12, 18, 30, 31, 32, 33, 36, 37 and 38.

“Each digital replica would include a set of image files representing every page of the work and a text file of the book’s words generated through an optical character recognition process.” (Authors Guild Ap. Br. page 8)

“the Libraries receive their own digital copies of the works to store and use.” (Authors Guild Ap. Br. page 8)

“In addition to the copies retained by Google, four digital copies of each book are maintained in the HDL, with two such copies stored on servers located in Michigan and Indiana and two additional copies stored on backup tapes.” (Authors Guild Ap. Br. 9-10)

“Moreover, even if certain of the Libraries’ uses are deemed transformative, their online storage of multiple digital duplicates of the books goes far beyond what is necessary to fulfill that purpose.” (Authors Guild Ap. Br. 12)

“[I]n analyzing whether the Mass Digitization Program is fair use under Section 107, the District Court failed to consider whether the Libraries could have made the uses the court found to be transformative – facilitating search and access for the print-disabled – without keeping multiple copies of the Authors’ works online and subjecting them to unauthorized access and widespread distribution.” (Authors Guild Ap. Br. 18)

“Moreover, to the extent that there is any transformative or other legitimate purpose to the Libraries’ actions, the making of multiple copies of the works and then storing the full text and image files online where they are susceptible to theft and widespread distribution goes far beyond what is needed to satisfy such purpose.” (Authors Guild Ap. Br. 30)

“the District Court erred by failing to recognize that the Libraries are able to facilitate text searching and to provide access to the print-disabled without creating and storing so many digital copies online.” (Authors Guild Ap. Br. page 31)

“(ii) Even if Copying Millions of Books to Facilitate Search is Transformative, There is No Justification for Storing Multiple Copies of the Image and Text Files Online” … “The Authors maintain, as they did below, that the Libraries have no right to copy and use millions of books without authorization or payment. If the Libraries want to scan print books in order to create indices or to facilitate text mining or other research tools, they should be required to ask for and obtain permission for their copying. But more importantly for purposes of this appeal, to the extent that any of the Libraries’ goals fit within the rubric of fair use, the Libraries should be permitted to do no more than is necessary to accomplish that particular purpose.” (Authors Guild Ap. Br. 32)

“Moreover, unlike HathiTrust’s perpetual storage of high resolution image files and text files of every book, the Web pages copied by a search engine are incidental to the search function.” (Authors Guild Ap. Br. page 33)

“[O]nce a book’s text is recorded in the index, the image and text files are no longer necessary for the operation of the search engine.” (Authors Guild Ap. Br. page 37)

“[E]ven if it is necessary to digitize an entire work in order to index the contents for facilitating search, the third factor weighs heavily against the Libraries because they are unnecessarily retaining complete image and text files comprising every page of every book.” (Authors Guild Ap. Br. page 36)

Some thoughts on the Multiple Copy Argument

It is entirely plausible that a plaintiff might look at a defendant who has made lots and lots of copies and argue that the very multiplicity of the copying is evidence that the real purpose was not the transformative use claimed, but some other use. For example, if Borders (1971-2011) had scanned its whole inventory and made 60,000 copies of the collection in DVD bundles, we might have begun to suspect they were planning on selling them.

However, in the context of the library digitization being litigated in Authors Guild v. HathiTrust, there is no similar mystery about the extent of copying. The libraries maintain the original scan images because those images are needed to quality-check the OCR (optical character recognition) text versions. Those images are also needed so that the collection can be re-digitized when, inevitably, someone invents a smarter OCR program that is less prone to error. A biologist would not throw out an original specimen after taking their initial notes; a social scientist would not delete her original data after running her initial set of regressions. It would be somewhere between reckless and crazy to throw out the original scans.

The same applies to the OCR-text files. It might be true that once you create a search index you don’t need the original text files to actually implement search. But as anyone with any experience in software development or working with data will tell you, there are always new and better ways to process information. It would be hubris, almost a crime against knowledge, to pretend that search indexing or optical character recognition in 2013 are as good as they will ever be.
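To make the point concrete, here is a minimal sketch in Python. It is not HathiTrust's actual search code; the two tokenizers, the function names and the one-line "book" are all hypothetical. It shows why an index built with yesterday's method cannot be improved without going back to the original text.

```python
import re
from collections import defaultdict

def build_index(texts, tokenize):
    """Map each token to the set of book ids whose text contains it."""
    index = defaultdict(set)
    for book_id, text in texts.items():
        for token in tokenize(text):
            index[token].add(book_id)
    return index

def naive_tokenize(text):
    # Hypothetical first-generation tokenizer: lowercase, split on whitespace only.
    return text.lower().split()

def better_tokenize(text):
    # Hypothetical later tokenizer: also strips punctuation.
    return re.findall(r"[a-z0-9]+", text.lower())

# Hypothetical OCR text standing in for a retained text file.
texts = {"book-1": "Call me Ishmael. The whale, the whale!"}

old_index = build_index(texts, naive_tokenize)
new_index = build_index(texts, better_tokenize)

# The old index only knows "whale," and "whale!", so a search for "whale" misses.
print(old_index.get("whale", set()))
# Rebuilding from the retained text with the better tokenizer finds the book.
print(new_index.get("whale", set()))
```

The old index stores only the tokens the old tokenizer produced. The improved index can only be built because the text file was still there to re-process.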

The Authors Guild Appeal Brief appears (to me) to be deliberately obtuse when it says “… even if this Court were to hold that HathiTrust in its current configuration satisfies these criteria, the Libraries still have not demonstrated their need to retain the digital image files in order to facilitate access to the print-disabled, as the assistive technology uses text files to convert the text from the book into speech.” (Authors Guild Ap. Br. page 38). Does the Authors Guild seriously intend that the print-disabled should be held hostage to the state of the art in OCR and text-to-speech as of 2013?

Any library digitization exercise should generate a handful of copies per book – you have to keep the original image and OCR files safe; you have to duplicate them so people can examine them; you have to store everything in multiple locations in case of flood, fire, terrorist attack or simple human error, and if scientists are regularly testing new equations against the original data you might need to mirror some of that data to increase the speed of the network. There is no reason why the universities should treat these digitized files any more cavalierly than Facebook treats the 267 photos of my dog I have posted to the social network.

Trove and the Australian National Library’s risk management approach to orphan works

On Tuesday I posted The Authors Guild, orphan works and civil rights? (Authors Guild v Hathitrust pt. 3) in which I addressed the arguments made on behalf of the plaintiffs on appeal in Authors Guild v. Hathitrust. The Authors Guild takes the rather extreme position that:

“Any iteration of the OWP [orphan works program] under which copyrighted works are made available for public view and download violates the Copyright Act.” (Authors Guild Ap. Br. page 13; see generally pages 13-14).

Their appeal brief poses the question:

“Is it ever lawful to take an entire copyright-protected book and make it widely available for display and download without permission?” (Authors Guild Ap. Br. page 13).

I believe that the answer to that question is YES. On Tuesday I gave an example of an individual orphan work made accessible on the Civil Rights Movement Veterans Website. Today I want to extend that discussion to an entire orphan works program.

Trove, The Australian National Library and Orphan Works

Trove is the Australian National Library’s primary vehicle to assist users to access digital content held by collecting institutions across Australia. Trove is used by tens of thousands of Australians every day.

In July 2008, Trove opened up Australian newspaper articles published from the 1800s to 1955 to full-text searching.

Trove goes beyond 1955 by agreement with newspaper publishers, but for anything prior to 1955 the NLA and the libraries in its network work on the assumption that there is no requirement to obtain permission. (See, e.g., Selection Policy (“The newspapers must not have copyright restrictions i.e. anything before 1955 is suitable”).)

An implicit orphan works policy

In selecting 1955 as the cut-off date, the NLA has adopted what they would call a sensible risk management policy and I would call an orphan works policy. Under Australian copyright law (Australian Copyright Act 1968) the date on which the copyright in a literary work expires depends on the date of publication and the date of the death of the author.

  • If a literary work was published in the lifetime of the author, and that author died before January 1, 1957, the work is out of copyright.
  • Any literary work published in the lifetime of an author who died on or after January 1, 1957 and before 2005, will be out of copyright 50 years after that author’s death.
  • If a work was first published anonymously and the identity of the author cannot be ascertained on reasonable inquiry, the period of copyright protection is measured from the year of publication and not the year of the author’s death. (See Section 34 of the Australian Copyright Act 1968).
  • The law relating to photographs in Australia is a little easier: any photograph taken before 1955 is in the public domain. (See Section 33 of the Australian Copyright Act 1968).

Newspapers contain works by many different authors. For each individual article in a newspaper, the period of copyright protection is measured from the death of the author, even if the author assigned the copyright to the publisher.

What does all this mean for library digitization?

In 2013 the odds are pretty good that anything published in, say 1950, is in the public domain in Australia. But, if a work was published in 1953 and the author died in 1973, then the copyright would not expire until 2023.
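The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration of the simplified rules as I have summarized them above, not a statement of Australian law and certainly not legal advice. The function name is my own, and it ignores anonymous works, photographs and anything else not covered by the first two bullet points.

```python
def expiry_year(author_death_year):
    """Year copyright expires under the simplified rules summarized in this post.

    Returns None when the work is treated as already out of copyright.
    Assumption: only the first two bullet points above are modeled.
    """
    if author_death_year < 1957:
        # Author died before January 1, 1957: treated as out of copyright.
        return None
    if author_death_year < 2005:
        # Author died from 1957 up to (but not including) 2005: death plus 50 years.
        return author_death_year + 50
    raise ValueError("case not covered by the rules summarized in this post")

# The example from the text: published in 1953, author died in 1973.
print(expiry_year(1973))  # 2023

# An author who died in 1950 is treated as already out of copyright.
print(expiry_year(1950))  # None
```

A library working at the scale of tens of millions of articles cannot look up a death year for each author, which is exactly why a cut-off date and a take-down policy stand in for this calculation.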

Before he retired in 2011 Warwick Cathro was the Assistant Director-General, Resource Sharing and Innovation at the NLA. Warwick was a pioneer in the delivery of innovative network services to the Australian library community and is considered the founder of Trove. I spoke to Warwick about the NLA’s approach to newspaper digitization and he said:

“The NLA thus took a “risk management” approach to copyright issues in its newspaper digitization program.

We did this because of the manifest public benefit in digitising this content. We never attempted to clear copyright in individual articles; how could we ever do this for tens of millions of articles?

To my knowledge, in the five years since this content has been made available online, not one copyright owner has objected. If any were to do so the NLA would discuss the purpose of its digitization program and seek permission to include the creator’s work in the newspaper database. If this could not be negotiated the NLA would take down the item or article in question.”

Of course, it is much easier to get your lawyers to sign off on this kind of sensible risk management approach that respects the wishes of authors and maximizes public access to knowledge in a jurisdiction without statutory damages.

Orphan works projects are not just a stalking horse for Silicon Valley internet companies, nor are they simply the whimsical playthings of obscure institutions happy to work in legal grey areas. Making orphan works available to the public should be one of the core missions of American libraries. Libraries could pursue this mission more easily if statutory damages were abolished and pragmatic risk management prevailed over the misinformed notion that the purpose of copyright is to prevent unauthorized use solely for the sake of prevention.

The Authors Guild, orphan works and civil rights? (Authors Guild v Hathitrust pt. 3)

Introduction and Necessary Disclaimer 

This is one of a series of posts concerning the Authors Guild v. Hathitrust case. Specifically, these posts take the form of commentary on the Authors Guild Appeal Brief (February 25, 2013). Although I am one of the authors of the Digital Humanities and Law Scholars Amicus Brief, the views expressed on this site are purely my own. My comments on the Authors Guild’s Appeal Brief will not be comprehensive; rather, my aim is to review the aspects of the brief that I found interesting.

Today’s topic …

What is the Authors Guild really saying about orphan works?

In some ways, the Authors Guild is the victim of its own success. The Authors Guild was quick to discover some defects in the way that the University of Michigan was determining orphan works status when the project was first announced in 2011. Exposure of those issues led to the suspension of that project before a single work was distributed to the public as an orphan work. The orphan works project might come back in some form at some stage, but at the moment there is no way for the court to know what kind of orphan works project it is being asked to rule on or whom it would affect.

In its appeal brief, the Guild responds to this predicament by arguing that the orphan works part of its case is ripe for adjudication because the details simply don’t matter – any orphan works project would be unlawful! See e.g.

“Any iteration of the OWP under which copyrighted works are made available for public view and download violates the Copyright Act. The pure legal question that was presented to the District Court is the same as it will always be: Is it ever lawful to take an entire copyright-protected book and make it widely available for display and download without permission?” (Authors Guild Ap. Br. page 13; see generally pages 13-14).

And later

“Plainly, existing copyright law does not permit the copying and distribution of the entirety of copyright-protected works to tens of thousands of users, irrespective of whether it might be difficult to locate the rights-holder.” (Authors Guild Ap. Br. page 17)

I don’t know how the defendants will respond to this argument and it is not an issue that fits within the scope of the Digital Humanities Amicus brief. Rather than diving into the legal arguments as to when and why the display of orphan works would be fair use, I thought it might be illuminating to consider an example.

Orphan works example: the Civil Rights Movement Veterans Website

On April 12, 2012, I attended the opening session of the Berkeley Law School’s “Orphan works and Mass Digitization” conference. The topic of the first panel was “Who wants to make use of orphan works and why.” In the course of that panel, Bruce Hartford, the webmaster of the Civil Rights Movement Veterans Website, told a story so fascinating it is worth setting out in full.

The Civil Rights Movement Veterans Website recounts the history of the civil rights movement:

“This website is created by Veterans of the Southern Freedom Movement (1951-1968). It is where we tell it like it was, the way we lived it, the way we saw it, the way we still see it. With a few minor exceptions, everything on this site was written, created, or spoken by Movement activists who were direct participants in the events they chronicle.” (

Much of the material on the Civil Rights Movement Veterans website is used with permission or requires no permission because it is in the public domain. However, according to Hartford, that still leaves a significant proportion of material that he would classify as orphan works. When Hartford uses the term orphan works he means (i) material that was originally copyrighted by an organization which no longer exists and made no provision for its copyrights upon dissolution; (ii) material where the copyright owner cannot be found; (iii) or material where the identity of the copyright owner was always unknown.

The photo below is of James Forman (October 4, 1928 – January 10, 2005), an American civil rights leader active in the Student Nonviolent Coordinating Committee.

[Photo: James Forman]

As Hartford described it:

“The camera was smuggled into the jail, given to an unknown prisoner who clicked the button and took the picture. Under copyright law, as I am told, the copyright to the picture is owned by the unknown prisoner who pressed the button on the camera, who then gave it back to whoever smuggled the camera into the prison, to smuggle it out of the prison.


Now I know this is off topic, but I am just going to say, some of us are a little annoyed about this stupid rule that the person who presses the button totally owns the rights and those of us who are risking our lives to do whatever it was that they were taking the picture of have no say so in whatever happens to that and they can make lots of money on it and we can look and weep.”

Take another look

Take another look at the photo of James Forman, consider what it means to the Civil Rights Movement Veterans Website and ask yourself, can it really be true, as the Authors Guild states in its brief, that “[p]lainly, existing copyright law does not permit the copying and distribution of the entirety of copyright-protected works to tens of thousands of users, irrespective of whether it might be difficult to locate the rights-holder.” (Authors Guild Ap. Br. page 17)?

Not everything is the same as everything else – Authors Guild v Hathitrust (pt. 2)

Introduction and Necessary Disclaimer 

This is one of a series of posts concerning the Authors Guild v. Hathitrust case. Specifically, these posts take the form of commentary on the Authors Guild Appeal Brief (February 25, 2013). Although I am one of the authors of the Digital Humanities and Law Scholars Amicus Brief, the views expressed on this site are purely my own. My comments on the Authors Guild’s Appeal Brief will not be comprehensive; rather, my aim is to review the aspects of the brief that I found interesting.

Today’s topic …

Not everything is the same as everything else 

Legal argument is the art of analogizing and distinguishing, drawing out the implications of things already decided in ways that suggest a favorable outcome for matters still in dispute. Thus, in copyright cases it is quite common to read that x (new thing) is the same as/totally different from y (old thing). The Authors Guild’s brief engages in quite a bit of this kind of argument, but mostly without saying so explicitly. In particular, their brief contains three examples of false equivalence that simply don’t add up.

  1. The Authors Guild implicitly suggests that the defendants’ orphan works project is the same as the Authors Guild’s own proposal to deal with orphan works in the Google Book Search Settlement. It isn’t.
  2. The Authors Guild argues that the defendants’ orphan works project is a substitute for orphan works legislation. It isn’t.
  3. The Authors Guild brief proceeds as though library digitization were the same as library photocopying. It isn’t.

The Universities’ Orphan Works Project v. the Google Book Search Settlement

Most of the Authors Guild’s ink is spilt on the universities’ proposed orphan works project (OWP). The idea behind the defendants’ OWP appears to be that out-of-print books published in the U.S. between 1923 and 1963 should be made available for educational use if the rights holders cannot reasonably be located. The University of Michigan proposed a method to automate the identification of orphan works for this purpose in 2011. However, the exact nature of this particular project is yet to be determined because, after the Authors Guild filed suit against the HathiTrust et al., the University of Michigan announced that the OWP would be temporarily suspended. The University of Michigan candidly admitted that the procedures used to identify orphan works had allowed some works to make their way onto the Orphan Works Lists in error.

The Authors Guild Appeal Brief contains the implicit suggestion that the defendants’ OWP is the same as the audacious exploitation of orphan works that the Authors Guild itself proposed under its Settlement Agreement with Google.

It is true that, as noted at page 10 of the Guild’s Appeal Brief, “a mechanism to help resolve the orphan works issue was one of the key aspects of the attempted settlement of the Google Books case”.

It is also undeniable that Judge Chin commented “the establishment of a mechanism for exploiting unclaimed books is a matter better suited for Congress than this Court”. (Authors Guild v. Google, Inc., 770 F. Supp. 2d 666 (S.D.N.Y. 2011))

But Judge Chin was evaluating the fairness of the private settlement between Google and the Authors Guild. He was not commenting on the question of whether the display of any orphan works under any circumstance could be fair use, nor was he reviewing anything remotely like the libraries’ much more limited orphan works program.

The Authors Guild proceeds as though the modest orphan works program announced by the university defendants is the same in substance as the universal bookstore rejected by Judge Chin in 2011. (See e.g., Authors Guild Ap. Br. page 10 “Unhappy with Judge Chin’s decision, [University of Michigan] decided to take the law into its own hands by unilaterally initiating its own program.”) This strikes me as false equivalence.

Under the default settings of the now defunct settlement (proposed 2008, amended 2009, rejected 2011) Google would have been allowed to display up to 20% of a non-fiction work to the entire world and to sell books through consumer purchases and institutional subscriptions. Funds from the sale of orphan works were to be held by a ‘book rights registry’ for safe keeping and eventual distribution to worthy causes. [Under the original Settlement Agreement, the revenues attributable to orphan or unclaimed works would have flowed in part to the ‘book rights registry’ and in part to registered authors and publishers.]

The details of the OWP that the defendants may or may not eventually undertake are unclear, but their public statements indicate that any such project would be grounded on non-commercial, limited, educational use. Moreover, whereas the settlement would have treated all books whose copyright owners failed to notify the registry of their interests as orphan works, the University of Michigan is working on a method to reliably determine a much smaller subset of true orphan works.

Whatever it turns out to be, the Universities’ orphan works project will not be the same as the Authors Guild’s own proposal to deal with orphan works in the Google Book Search Settlement.

The Universities’ Orphan Works Project v. Orphan Works Legislation

The Authors Guild Appeal Brief also conflates the universities’ OWP with various legislative solutions that have been proposed over the years in relation to the widely recognized orphan works problem. See for example Authors Guild Ap. Br. at page 15 “Despite clear indications by courts and the Copyright Office that the treatment of orphan works should be left to Congress, the Libraries insist that the OWP is legal.” (There is another example on page 10).

Does it really make sense that Congress’ failure to comprehensively or partially legislate a solution to the problem of orphan works means that the use of orphan works is never allowed under any circumstances, no matter how limited or irrespective of the reason? Congress could act to make out of print works universally available under terms similar to the Authors Guild’s proposal in the Google Book Search settlement, but so what? The mere fact that Congress could in theory set out a system that is broader than the limited scope for orphan works display that would be viable as fair use does not mean that there is no fair use.

Whatever it turns out to be, there is no basis to think that the university defendants’ orphan works project is a substitute for orphan works legislation.

Library Digitization v. Library Photocopying

If you proceed from the assumption that all unauthorized uses of a book are piracy then it makes sense that every new technology is just a new version of the photocopier. The Authors Guild Appeal Brief can certainly be read as adopting that view.

The brief argues that “[t]he mechanical conversion of printed books into digital form is not transformative because it does not add any ‘new information, new aesthetics, [or] new insights and understandings,’ to the books.” (citing Pierre Leval, Toward a Fair Use Standard, 103 Harv. L. Rev. 1105, 1111 (1990).) True, there is solid authority that photocopying and cable retransmission are not per se transformative (i.e., without looking at the reasons), but to suggest that library digitization offers no new insights is unsustainable.

Library digitization raises several different issues depending on the purpose behind that digitization and the uses that are subsequently made of the digitized texts. Library digitization could be motivated by any or all of the following:

  1. to preserve existing volumes
  2. to facilitate text-mining, data analysis and digital searching of the contents of books
  3. to facilitate access to electronic versions of books

The legal issues relating to each of these genres must be considered separately, but the Authors Guild’s brief muddles them all together. Digitization does look a bit like other forms of copying if the motivating purpose is access or display of expressive works (i.e., #3 above). However, the argument in favor of a limited, non-commercial and education-focused orphan works project turns not on transformative use, but on other considerations such as the lack of market harm [See Jennifer M. Urban, How Fair Use Can Help Solve the Orphan Works Problem (June 18, 2012)].

Likewise, the argument in favor of library digitization to facilitate disabled access is much broader than the details of the underlying technology. Whether we use the label transformative or not, this is clearly a favored purpose under the first fair use factor. The provision of equal access to copyrighted information for print-disabled individuals is mandated by the Americans with Disabilities Act (ADA). The HathiTrust provides print-disabled individuals with access to millions of items within library collections, whereas in the past they merely had access to a few thousand at best. “Making a copy of a copyrighted work for the convenience of a blind person is expressly identified by the House Committee Report as an example of a fair use, with no suggestion that anything more than a purpose to entertain or to inform need motivate the copying.” (Sony Corp. of Am. v. Universal City Studios, Inc, 464 U.S. 417, 455 n.40 (1984)).

The claim that library digitization is just like photocopying and does not offer any new insights crumbles completely when one considers the non-expressive uses such digitization makes possible. Library digitization makes it possible to extract meta-data from books and to create a useful search engine. Search indexing, text-mining and other computational uses of text could not be more different from mere photocopying; the “new information” and “new aesthetics” they offer include:

  • Text-based searching
  • Research on the structure of language
  • Research on the use of language.

The database as a whole serves a different purpose than each of the constituent works that have been scanned and indexed. The individual works provide content to readers; they convey the authors’ original expression. The database as a whole provides a means of searching for and identifying books or analyzing the language within books.

Labels like transformative use and nonexpressive use can be helpful in grouping like cases together, but they can also be distracting. The issue of fair use is directly tied to a purposive reading of the Copyright Act and the purpose of copyright is clearly articulated in the U.S. Constitution—“[t]o promote the Progress of Science and useful Arts. . . .”  As the Supreme Court stated in Campbell, the “central purpose” of the fair use investigation is to see, “whether the new work merely supersedes the objects of the original creation, or instead adds something new, with a further purpose or different character, altering the first with new expression, meaning, or message…”

The plaintiffs argue that library digitization is utterly untransformative, but in fact, digitization enabling book search and text-mining clearly leads to “new information, new aesthetics, new insights and understandings.”

For example, as we explained in the Digital Humanities Amicus Brief:

“Google’s “Ngram” tool provides another example of a nonexpressive use enabled by mass digitization—this time easily visualized. Figure 1, below, is an Ngram-generated chart that compares the frequency with which authors of texts in the Google Book Search database refer to the United States as a single entity (“is”) as opposed to a collection of individual states (“are”).


As the chart illustrates, it was only in the latter half of the Nineteenth Century that the conception of the United States as a single, indivisible entity was reflected in the way a majority of writers referred to the nation.  This is a trend with obvious political and historical significance, of interest to a wide range of scholars and even to the public at large.  But this type of comparison is meaningful only to the extent that it uses as raw data a digitized archive of significant size and scope. To be absolutely clear, 1) the data used to produce this visualization can only be collected by digitizing the entire contents of the relevant books, and 2) not a single sentence of the underlying books has been reproduced in the finished product. In other words, this type of nonexpressive use only adds to our collective knowledge and understanding, without in any way replacing, damaging the value of, or interfering with the market for, the original works.”
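The kind of counting that sits behind a chart like this can be sketched in a few lines of Python. This is not Google's Ngram code. The tiny corpus, the function name and the two phrases are hypothetical stand-ins, chosen only to show that the output is a set of aggregate counts rather than any sentence from the underlying books.

```python
from collections import Counter

# Hypothetical toy corpus: (publication year, text) pairs standing in for digitized books.
corpus = [
    (1850, "the united states are divided on the question"),
    (1850, "the united states are a union of states"),
    (1850, "the united states is young"),
    (1900, "the united states is a great power"),
    (1900, "the united states is growing"),
    (1900, "the united states are many"),
]

def phrase_counts_by_year(corpus, phrases):
    """Count how often each phrase occurs in the texts published in each year."""
    counts = {}
    for year, text in corpus:
        year_counts = counts.setdefault(year, Counter())
        for phrase in phrases:
            year_counts[phrase] += text.lower().count(phrase)
    return counts

counts = phrase_counts_by_year(corpus, ["united states is", "united states are"])

for year in sorted(counts):
    c = counts[year]
    total = c["united states is"] + c["united states are"]
    # Share of references that treat the United States as singular.
    print(year, round(c["united states is"] / total, 2))
```

The counting step has to read the full text of every book, but what comes out is a handful of numbers per year.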

Library digitization is not the same as library photocopying.

Some observations on the Authors Guild’s Appeal Brief in Authors Guild v. Hathitrust (Part 1)

Introduction and Necessary Disclaimer

This is the first in a series of posts concerning the Authors Guild v. Hathitrust case. Most of the posts will be commentary on the Authors Guild Appeal Brief (February 25, 2013). Although I am one of the authors of the Digital Humanities and Law Scholars Amicus Brief, the views expressed on this site are purely my own. My comments on the Authors Guild Appeal Brief will not be comprehensive; rather, my aim is to review the aspects of the brief that I found interesting.

Authors Guild v. Hathitrust – Essential Background

Chances are that if you are reading this blog, you are well aware that Google has been mired in copyright litigation regarding its library digitization project. Google was sued by the Authors Guild (among others) in a class action on behalf of all authors in 2005. A controversial settlement of that class action proposed in 2008 generated a maelstrom of objections. The settlement was revised in 2009, but ultimately rejected by Judge Denny Chin in the Southern District of New York in March 2011. Authors Guild v. Google is ongoing (the class action certification is being appealed by Google; if Google loses its appeal, that case goes back to Judge Chin in the Southern District of New York).

In September 2011, the Authors Guild (among others) filed claims for copyright infringement against the universities of Michigan, California, Wisconsin, Indiana and Cornell University for participating in the Google Book project. The Guild’s complaint with respect to the universities is, first, that they allowed Google to digitize their library collections; second, that the universities accepted corresponding digital files from Google and have consolidated those files into a shared digital repository known as the HathiTrust digital library; and third, that the universities’ proposed orphan works project (OWP) amounts to copyright infringement.

This is speculation on my part, but the Authors Guild may have been banking on a favorable ruling from Judge Chin being handed down before their separate case against the universities went to judgment. If so, they miscalculated. (If not, I honestly can’t understand why they did not drop the suit against the HathiTrust – it is usually not a great idea to run the same legal argument against more sympathetic defendants when you have a choice. That said, I am sure that the plaintiffs were well advised and had sound reasons for their tactics – it is just hard to see from the outside what those reasons might have been.)

Authors Guild v. Hathitrust moved fairly quickly to the summary judgment phase. Oral argument was held on August 6, 2012 in the United States District Court for the Southern District of New York in front of Judge Baer. On October 10, 2012, Judge Baer ruled against the plaintiffs and held that two key aspects of the library digitization program and the HathiTrust were “transformative” as that term of art is used in copyright cases and, on balance, fair use.

Judge Baer approved library digitization

  1. to fulfill the requirements of the Americans with Disabilities Act by making suitable versions of books available to the visually impaired and
  2. to engage in non-expressive uses such as text-mining and building a search engine.

The Judge also held that the domestic ‘Associational Plaintiffs’ (e.g. the Authors Guild and similar organizations) did not have statutory standing under the Copyright Act and that the claims involving the Universities’ OWP were not ripe for adjudication.

Understandably, the Authors Guild and their fellow plaintiffs are now pursuing their appeal rights. The next post takes a deeper look at the Authors Guild Appeal Brief.

Some thoughts on the correct pronunciation of Sag

My Hungarian grandparents Nick and Lily fled Hungary in 1939. They traveled on foot with my infant father to a port in Italy. Nick made a dangerous side-trip to Paris to get money to bribe his way onto a ship bound for Australia and to pay the landing money the Australian government required of Jewish immigrants. I am proud of my grandparents and my extended family in Europe, the U.S. and Australia. Also, although I have never actually visited Hungary, I have a certain sentimental attachment to that country as well.

Nonetheless, I have decided to officially give up on the correct pronunciation of my family name. I don’t speak Hungarian, and I can’t actually pronounce my name with a Hungarian accent. My closest American relative assures me that it should be pronounced ‘Sag’ with a long ‘a’ (á as in father), or as you might imagine a British person saying saga.

After more than a decade of trying to toe the line on this, I have decided that the whole enterprise is futile and misguided. My attempts to get the world to adopt an Americanized Hungarian pronunciation have not been that successful. For example, I heard one of my friends massacre the “A” in Matt (it sounded like mARt) to make it the same as the “A” in Sag.

Feel free to try any pronunciation of Sag that you like, but from now on my official policy is that, just as Matt rhymes with cat, Sag rhymes with bag.

Other famous Sags include:

  • the Sag gene, which encodes the S-arrestin protein in humans;
  • the Saudi Arabian Government;
  • various State Attorneys General;
  • the SQL Access Group; and
  • the Screen Actors Guild.

Sâg is also a village in Sălaj County, Romania. I have no idea how they say it.

The digital humanities is alive and well in South Bend, Indiana

I will be at Notre Dame on Friday, April 12, to give a lunchtime talk to the Working Group on Computational Methods in the Humanities and Sciences on copyright, text analysis, and the legal issues involved in digital humanities research. The event is organized by Assistant Professor Matthew Wilkens, who works on contemporary fiction, literary theory, digital humanities, and social studies of science.

Copyright law is based on a set of rules developed in the 18th Century to regulate the printing press. Today’s copyright law still carries with it the legacy of print-era assumptions that have been profoundly disturbed by the digital economy. My talk will focus on the impact of successive waves of technology on copyright law and explain why the non-expressive use of copyrighted works by copy-reliant technologies presents a profoundly new issue for copyright law.

My interest in the digital humanities grew out of earlier work on Internet search engines and plagiarism detection software. Text mining software and other copy-reliant technologies do not read, understand, or enjoy copyrighted works, nor do they deliver these works directly to the public.  They do, however, necessarily copy them in order to process them as grist for the mill, raw materials that feed various algorithms and indices.

Logistical details on the talk are available here and here.


Richard Stallman will be joining us at Loyola Chicago to discuss Patents, Innovation and the Freedom to Use Ideas. Should be interesting.

The Loyola Law Journal has organized another great conference.

This one-day conference will provide a forum for nationally recognized scholars and judges to discuss the trade-off between two interests of the public: the interest in development of new ideas and the interest in freedom to use ideas. The patent system is intended to serve the former, but imposes a cost on the latter. More specifically, the Conference will explore whether the added innovation achieved by the patent system justifies its cost to society, whether it operates within the Constitution’s requirements, whether improvements can be made, and whether a different system or no system at all might be preferred.

Richard Stallman will be giving a special address on “Questioning the Assumptions of the Patent System”

April 11, 2013.

More details are available at



Symposium on Copyright Law and Gray Market Goods, John Wiley & Sons v. Kirtsaeng

The DePaul Journal of Art, Technology & Intellectual Property Law is sponsoring a symposium on Copyright Law and Gray Market Goods, John Wiley & Sons v. Kirtsaeng on April 8, 2013 (12 – 3 p.m.).

In John Wiley & Sons, Inc. v. Kirtsaeng, 654 F.3d 210 (2d Cir. 2011), a publishing company brought an action against a defendant who was importing and selling textbooks within the United States. The defendant had relatives in Thailand purchase foreign editions of textbooks that were legally printed abroad. The relatives would send the textbooks to the defendant and the defendant would sell them for a profit. On appeal, the defendant argued that he should have been allowed to put forth a first sale defense.

The Second Circuit affirmed the district court’s rejection of a first sale defense based on a plain language interpretation of 17 U.S.C. § 602(a) and 17 U.S.C. § 109(a) and some dicta in Quality King Distributors, Inc. v. L’anza Research International, Inc., 523 U.S. 135 (1998). (Quality King involved goods that were manufactured within the United States, sold abroad and then re-imported.) The Supreme Court granted certiorari. Oral arguments were heard on Oct. 29, 2012.

On March 19, 2013, Justice Breyer, writing for a majority of six, emphatically rejected the publisher’s control over the importation of legally manufactured “gray-market” products. The Court held that the “first sale” doctrine, which allows the owner of a copyrighted work to sell or otherwise dispose of that copy as he wishes, applies to copies of a copyrighted work lawfully made abroad. Justice Kagan filed a concurring opinion in which Justice Alito joined. Justice Ginsburg filed a dissenting opinion in which Justice Kennedy joined, and in which Justice Scalia joined except as to Parts III and V–B–1.

The slip opinion is available here.


Professor Tyler Ochoa, Santa Clara University College of Law

Kevin Tottis, Principal, Law Offices of Kevin Tottis

Professor Matthew Sag, Loyola University School of Law

Robert Paul, Director of Business Operations, Compass Lexecon

For registration pricing and event details, please visit:

I am at the University of Technology Sydney today to make friends with the robots and talk about copyright

I am a guest this morning at the University of Technology Sydney’s “Innovation and Technology Research Laboratory”, better known within UTS as The Magic Lab. The Magic Lab has a broad spectrum of research interests including robot soccer, humanoid robotics, belief revision, virtual worlds, cognitive marketing, collaboration, risk management, commonsense reasoning and technology-driven innovation in addition to strategic, social and legal aspects of innovation.

During my visit today I will be presenting to the Engineering and Information Technology department as part of their Leadership in Innovation Seminar series. My presentation will address the interaction of copyright law and digital technology.