Archives & Copyright: Developing An Agenda For Reform starts tomorrow #dh #archivescopyright

Archives & Copyright: Developing An Agenda For Reform

This is a one-day symposium, co-organised by CREATe and the Wellcome Library. The symposium considers forthcoming changes to the copyright regime in the UK as it impacts the work of archives, as well as the role that risk management plays in copyright compliance for archival digitization projects.

I will be speaking on a panel along with Professors Peter Jaszi and Peter Hirtle. We will discuss how cultural heritage institutions in the US work with copyright law, and in particular the ongoing Authors Guild v. HathiTrust case (currently on appeal).

I plan to talk about my experience bringing together (along with Jason Schultz and Matthew Jockers) the digital humanities amicus briefs for Authors Guild v. HathiTrust I and II and Authors Guild v. Google. My slides are available right here.

The #hashtag for the symposium is #archivescopyright

Empirical Studies of Copyright Litigation: Can we rely on PACER’s Nature of Suit coding?

I have just posted a new paper titled Empirical Studies of Copyright Litigation: Nature of Suit Coding. The paper investigates reliance on the Nature of Suit coding in the PACER records for empirical studies of copyright litigation. It concludes that although the PACER Nature of Suit code for copyright does not in fact capture all copyright cases, it is a good enough sample for most purposes.

In spite of the increasing popularity of empirical legal studies more generally, there are relatively few empirical studies of copyright law, and even fewer of copyright litigation. This state of affairs cannot continue. The creation and distribution of copyrighted works is an important driver of the U.S. economy, and copyright law’s interactions with freedom of expression and cultural participation have made it an area of significant public policy focus. If we truly want to understand copyright litigation, then we need to look at LITIGATION and not just at cases. But before we go too far down the rabbit hole of docket analysis, someone needs to ask whether we are studying the right dockets.

As part of a broader ongoing study of copyright litigation, I selected every case in the Lexis database published (by Lexis, not necessarily designated as such by the court) between 2000 and 2012 that included the word “copyright”. The search was designed to be over-inclusive. From this broad sample, I randomly selected one fifth of the district court opinions and all of the court of appeals opinions.
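The selection step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout, the `court_level` field, the `select_sample` helper and the fixed seed are all hypothetical, and the records are placeholders rather than real Lexis results.

```python
import random


def select_sample(opinions, district_fraction=0.2, seed=0):
    """Keep every court of appeals opinion and a random share of district court opinions."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    appeals = [o for o in opinions if o["court_level"] == "appeals"]
    district = [o for o in opinions if o["court_level"] == "district"]
    k = round(len(district) * district_fraction)
    return appeals + rng.sample(district, k)


# Hypothetical over-inclusive search results (placeholders, not real data).
opinions = (
    [{"id": f"A{i}", "court_level": "appeals"} for i in range(10)]
    + [{"id": f"D{i}", "court_level": "district"} for i in range(50)]
)

sample = select_sample(opinions)
print(len(sample))  # 10 appeals + 10 of the 50 district opinions = 20
```

Recording the seed alongside the sample is one simple way to let other researchers redraw the same one-fifth selection.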

A team of Loyola Law School students reviewed each opinion following a detailed coding form and determined, among other things, whether the case was truly a copyright case. Of the 472 cases coded, 102 were not copyright cases. More specifically, of the 137 court of appeals cases and 275 district court cases selected, 42 appeals cases and 60 district court cases only mentioned copyright in passing or in the course of discussing copyright case law but did not relate to a claim of copyright infringement.

Screen Shot 2013-09-24 at 6.59.33 AM
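To put the per-court figures above in proportion, here is a small arithmetic sketch. It uses only the selected and excluded counts reported in the preceding paragraph; the dictionary layout and the `true_copyright_share` helper are illustrative names, not part of the study.

```python
# Counts as reported in the paragraph above.
coded = {
    "court of appeals": {"selected": 137, "not_copyright": 42},
    "district court": {"selected": 275, "not_copyright": 60},
}


def true_copyright_share(counts):
    """Share of selected opinions that were coded as true copyright cases."""
    true_cases = counts["selected"] - counts["not_copyright"]
    return true_cases / counts["selected"]


for court, counts in coded.items():
    print(f"{court}: {true_copyright_share(counts):.1%} true copyright cases")
# court of appeals: 69.3% true copyright cases
# district court: 78.2% true copyright cases
```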

Determining the NOS coding for these true copyright cases was a simple but laborious matter of cross-referencing the docket number with the PACER records. As set forth in Table 3, below, almost 80% of district court and 85% of court of appeals true copyright cases were filed as NOS=Copyright [820].

Screen Shot 2013-09-24 at 6.59.44 AM

The “other” category included: Contract; Cable/Sat TV; Other Statutory Actions; Insurance; Assault, Libel, & Slander; Other Personal Property Damage; Civil Rights; Fraud; Personal Injury; and even some criminal filings. What does this imply for empirical research? Most obviously, it implies that docket analysis of copyright disputes relying solely on the nature of suit coding misses one in five of the kind of copyright case that is likely to end up as a written opinion at the district court level.

Is 80% good enough? It’s not bad. If we assume that most attorneys are competent enough to know what the major focus of their case is, then the copyright cases that are overlooked by focusing solely on the 820 cohort are likely to be only partially about copyright. However, researchers should also be aware that some dockets that grow up to be copyright cases, even some that make it into textbooks, will be missed by reliance on the 820 coding. They should thus understand that this selection is probably not random and may not be inconsequential. Consider, for example, the difference in duration between district level true copyright cases coded as NOS=820 and those that were not.

The average duration of terminated district court true copyright cases was 752 days (488 median) if the case was filed as NOS=820. For the corresponding set filed as something other than NOS=820, the average duration was 506 days (479 median). The average duration of unterminated district court true copyright cases as of January 1, 2013 was 1232 days (1074 median) if the case was filed as NOS=820. For the corresponding set filed as something other than NOS=820, the average duration was 1099 days (942 median). Figures 1 and 2, below, present the same information in the form of histograms indicating the distribution of duration for all four categories.

Screen Shot 2013-09-24 at 7.00.23 AM

Screen Shot 2013-09-24 at 7.04.01 AM

In simple terms, district court true copyright cases tended to be longer in average duration if filed as NOS=820, although it is noteworthy that they are not that different at the median.
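The comparison can be made concrete with a little arithmetic on the figures reported above. The sketch below only restates those published means and medians (in days); the `durations` layout and the `gap` helper are illustrative names. A mean that sits well above the median usually points to a right-skewed distribution, that is, a handful of very long-running cases pulling the average up.

```python
# Mean and median durations in days, as reported in the text above.
durations = {
    ("terminated", "NOS=820"): {"mean": 752, "median": 488},
    ("terminated", "other"): {"mean": 506, "median": 479},
    ("unterminated", "NOS=820"): {"mean": 1232, "median": 1074},
    ("unterminated", "other"): {"mean": 1099, "median": 942},
}


def gap(status, stat):
    """Difference in days between NOS=820 cases and cases filed under another code."""
    return durations[(status, "NOS=820")][stat] - durations[(status, "other")][stat]


for status in ("terminated", "unterminated"):
    print(f"{status}: mean gap {gap(status, 'mean')}, median gap {gap(status, 'median')}")
# terminated: mean gap 246, median gap 9
# unterminated: mean gap 133, median gap 132
```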

What does all this mean for empirical studies of copyright litigation?
My conclusion is that, for copyright at least, although the PACER Nature of Suit code for copyright does not in fact capture all copyright cases, it is a good enough sample for most purposes, as long as researchers are clear about their methods and what data they are excluding.

Ivan Sag 1949 – 2013

Ivan was a large and brilliant man; the world feels like a smaller place without him. Ivan loved to drink, he loved to eat, he loved ideas, he loved his wife and he loved his friends. We loved him right back.

Ivan made significant contributions to the fields of syntax, semantics, pragmatics, and language processing. He wrote at least 10 books and over 100 articles. Ivan was the Sadie Dernham Patek Professor in Humanities, Professor of Linguistics, and Director of the Symbolic Systems Program at Stanford University. A fellow of the American Academy of Arts and Sciences and the Linguistic Society of America, in 2005 he received the LSA’s Fromkin Prize for distinguished contributions to the field of linguistics. All of which is to say that he was a brilliant wonderful man who I proudly call my uncle (even though he is in fact my first cousin, once removed). He will be missed.

A true scientist, Ivan was proud to live and die as an atheist.


Who owns the copyright in my Marathon playlist?

I will be running my very first marathon in October this year in Chicago. In connection with the marathon, I am raising money for the American Cancer Society. Almost all of us know someone who has suffered from cancer. There are many fine charities to support. I choose to support the American Cancer Society because they fund a range of research, patient services, early detection, treatment and education programs and because they seem like good people.

Please think about donating to the ACS and helping me exceed my fundraising goal of $1500.00. To donate, click on this link.

If you donate $10 or more, I will add any song of your choosing to my Marathon playlist. So far the selected songs are:

  • “Jingle Bells”;
  • “The Night Chicago Died” by Paper Lace;
  • “Shout to the Top” by Style Council;
  • “Dies Irae” from Mozart’s Requiem;
  • “Waltzing Matilda”;
  • “We Built This City” by Starship;
  • “Born to Run” by Bruce Springsteen;
  • “We Are Never Ever Getting Back Together” by Taylor Swift;
  • “Give Up The Funk” by Parliament.

The recent lawsuit in the UK, in which Ministry of Sound is suing Spotify for allowing users to recreate MOS compilations using Spotify playlists, makes me wonder whether I have any copyright in the playlist that results from my fundraising.

Actually, that was just a pretense to (a) blog about the fact that I am running the Marathon and (b) suggest that if you made it this far into the post, you should donate some money to help fight cancer.