data sets

Data in academic papers should be public by default

The datasets on this page are made available so that you can look at my data and replicate my results. Please let me know about any errors or discrepancies.

My Public Datasets

IP Litigation in US District Courts 2015 Update

Data derived from PACER 1994 to 2015 (Excel) (Stata); data derived from Bloomberg 1994 to 2015 (Excel) (Stata). These files contain my own assessment of whether the case was filed against an unknown party (mostly these are titled “Doe” or “John Doe”) and whether the plaintiff was suing in relation to pornography. Assessments of ‘pornography’ or ‘not pornography’ were based on reviews of selected complaints and on internet searches for the plaintiffs and/or the titles alleged to have been infringed. It is important to note that I assumed that if XYZ Corp. filed one case in relation to pornography, then all of its cases were about pornography.
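The plaintiff-level assumption described above can be illustrated with a minimal Python sketch. The field names ('case_id', 'plaintiff', 'pornography') and the sample records are hypothetical and are not taken from the actual files; this is only an illustration of the rule, not the coding procedure actually used.

```python
def propagate_pornography_flag(cases):
    """Return copies of the case records with the plaintiff-level rule applied.

    `cases` is a list of dicts with hypothetical keys 'plaintiff' and
    'pornography' (True, False, or None when the case was not reviewed).
    """
    # Collect every plaintiff with at least one case assessed as pornography.
    flagged_plaintiffs = {
        case["plaintiff"] for case in cases if case["pornography"] is True
    }
    # Apply the assumption: one flagged case flags all of that plaintiff's cases.
    return [
        {**case, "pornography": True}
        if case["plaintiff"] in flagged_plaintiffs
        else dict(case)
        for case in cases
    ]


# Hypothetical records, invented for illustration only.
sample_cases = [
    {"case_id": "A-1", "plaintiff": "XYZ Corp.", "pornography": True},
    {"case_id": "A-2", "plaintiff": "XYZ Corp.", "pornography": None},
    {"case_id": "B-1", "plaintiff": "Other Co.", "pornography": None},
]
coded_cases = propagate_pornography_flag(sample_cases)
```

In this sketch the unreviewed XYZ Corp. case inherits the flag, while the case filed by the other plaintiff is left as it was. Anyone who wants case-level rather than plaintiff-level assessments would need to review the complaints individually.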

IP Litigation in US District Courts 1994-2014

Download in Excel or Stata. These files contain my own assessment of whether the case was filed against an unknown party (mostly these are titled “Doe” or “John Doe”) and whether the plaintiff was suing in relation to pornography. Assessments of ‘pornography’ or ‘not pornography’ were based on reviews of selected complaints and on internet searches for the plaintiffs and/or the titles alleged to have been infringed. It is important to note that I assumed that if XYZ Corp. filed one case in relation to pornography, then all of its cases were about pornography. Note that the data is sourced from both PACER and Bloomberg and that there are inconsistencies between the two sources.
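If you want to check where the two sources disagree, one possible approach is to match records on a shared case identifier and compare selected fields. The sketch below assumes hypothetical column names ('docket', 'filing_date') and invented sample rows; the real files may be organized differently, and nothing here describes what form the inconsistencies actually take.

```python
def find_discrepancies(pacer_rows, bloomberg_rows, key="docket",
                       fields=("filing_date",)):
    """Compare two lists of dict records that share an identifier column.

    Returns three sorted lists of identifiers: those found only in the
    first source, those found only in the second, and those present in
    both whose compared fields differ.
    """
    pacer = {row[key]: row for row in pacer_rows}
    bloomberg = {row[key]: row for row in bloomberg_rows}

    only_pacer = sorted(pacer.keys() - bloomberg.keys())
    only_bloomberg = sorted(bloomberg.keys() - pacer.keys())
    mismatched = sorted(
        k for k in pacer.keys() & bloomberg.keys()
        if any(pacer[k].get(f) != bloomberg[k].get(f) for f in fields)
    )
    return only_pacer, only_bloomberg, mismatched


# Hypothetical rows, invented for illustration only.
pacer_sample = [
    {"docket": "1:14-cv-00001", "filing_date": "2014-01-02"},
    {"docket": "1:14-cv-00002", "filing_date": "2014-01-03"},
]
bloomberg_sample = [
    {"docket": "1:14-cv-00002", "filing_date": "2014-01-04"},
    {"docket": "1:14-cv-00003", "filing_date": "2014-01-05"},
]
only_pacer, only_bloomberg, mismatched = find_discrepancies(
    pacer_sample, bloomberg_sample
)
```

Rows exported from the Excel files (for example as CSV) could be loaded with Python's csv.DictReader and passed to the same function.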

Copyright Trolling

Copyright Filing Data (to June 30, 2014): This Excel file contains basic information on all copyright cases filed in US district courts from 2001 to June 30, 2014. In addition to the data that you could get from PACER, it contains URLs that go straight to the docket in Bloomberg (if you have an account). I suggest you create a column with the function “=Hyperlink(TARGET CELL)” to make this a one-click exercise. The file also contains my own assessment of whether the case was filed against an unknown party (mostly these are titled “Doe” or “John Doe”) and whether the plaintiff was suing in relation to pornography. Assessments of ‘pornography’ or ‘not pornography’ were based on reviews of selected complaints and on internet searches for the plaintiffs and/or the titles alleged to have been infringed. It is important to note that I assumed that if XYZ Corp. filed one case in relation to pornography, then all of its cases were about pornography.

Full replication files in Stata are available upon request.

 

Predicting Fair Use

Matthew-Sag-Predicting-Fair-Use (excel file)

The data was sourced from the underlying cases, Barton Beebe’s earlier empirical study of fair use decisions, the Martindale-Hubbell directory, and Hoover’s, a corporate information database available on Lexis.com. Case selection is explained in the paper. I had always meant to clean this data up before posting it, but this may never happen. The Stata .dta file and replication instructions are available on request.