NEH grant awarded to build legal literacies for text data mining

I am thrilled to share the news that the National Endowment for the Humanities (NEH) has awarded a $165,000 grant to a team of legal experts, librarians, and scholars who will help humanities researchers and staff navigate complex legal questions in cutting-edge digital research. The team is led by UC Berkeley, but involves several other leading universities, including Loyola Law Chicago.

The NEH has agreed to support an Institute for Advanced Topics in the Digital Humanities to help key stakeholders learn to better navigate legal issues in text data mining. Thanks to the NEH’s $165,000 grant, a national team (identified below) from more than a dozen institutions and organizations will run a summer institute to teach humanities researchers, librarians, and research staff how to confidently navigate the major legal issues that arise in text data mining research.

Our institute is aptly called Building Legal Literacies for Text Data Mining (Building LLTDM), and will run from June 23-26, 2020 in Berkeley, California.

Rachael Samberg of UC Berkeley Library’s Office of Scholarly Communication Services was our fearless leader in the grant proposal. Rachael’s amazing leadership and dedication can’t be overstated! More details on the grant can be found in Rachael Samberg’s post. But to give you some idea of the significance of this grant, here are a few comments from team members:

Building LLTDM team member Matthew Sag, a law professor at Loyola University Chicago School of Law and leading expert on copyright issues in the digital humanities, said he is “excited to have the chance to help the next generation of text data mining researchers open up new horizons in knowledge discovery. We have learned so much in the past ten years working on HathiTrust [a text-minable digital library] and related issues. I’m looking forward to sharing that knowledge and learning from others in the text data mining community.” 

Team member Brandon Butler, a copyright lawyer and library policy expert at the University of Virginia, said, “In my experience there’s a lot of interest in these research methods among graduate students and early-career scholars, a population that may not feel empowered to engage in ‘risky’ research. I’ve also seen that digital humanities practitioners have a strong commitment to equity, and they are working to build technical literacies outside the walls of elite institutions. Building legal literacies helps ease the burden of uncertainty and smooth the way toward wider, more equitable engagement with these research methods.”

Kyle K. Courtney of Harvard University serves as Copyright Advisor at Harvard Library’s Office for Scholarly Communication, and is also a Building LLTDM team member. Courtney added, “We are seeing more and more questions from scholars of all disciplines around these text data mining issues. The wealth of full-text online materials and new research tools provide scholars the opportunity to analyze large sets of data, but they also bring new challenges having to do with the use and sharing not only of the data but also of the technological tools researchers develop to study them. I am excited to join the Building LLTDM team and help clarify these issues and empower humanities scholars and librarians working in this field.”

Megan Senseney, Head of the Office of Digital Innovation and Stewardship at the University of Arizona Libraries, reflected on the opportunities for ongoing library engagement that extend beyond the initial institute. Senseney said, “Establishing a shared understanding of the legal landscape for TDM is vital to supporting research in the digital humanities and developing a new suite of library services in digital scholarship. I’m honored to work and learn alongside a team of legal experts, librarians, and researchers to create this institute, and I look forward to integrating these materials into instruction and outreach initiatives at our respective universities.”

Team Members

  • Rachael G. Samberg (University of California, Berkeley) (Project Director)
  • Scott Althaus (University of Illinois, Urbana-Champaign)
  • David Bamman (University of California, Berkeley)
  • Sara Benson (University of Illinois, Urbana-Champaign)
  • Brandon Butler (University of Virginia)
  • Beth Cate (Indiana University, Bloomington)
  • Kyle K. Courtney (Harvard University)
  • Maria Gould (California Digital Library)
  • Cody Hennesy (University of Minnesota, Twin Cities)
  • Eleanor Koehl (University of Michigan)
  • Thomas Padilla (University of Nevada, Las Vegas; OCLC Research)
  • Stacy Reardon (University of California, Berkeley)
  • Matthew Sag (Loyola University Chicago)
  • Brianna Schofield (Authors Alliance)
  • Megan Senseney (University of Arizona)
  • Glen Worthey (Stanford University)

Extended Readings on Copyright

After teaching copyright law based on my own materials for several years now, I have decided to release my reading materials under a Creative Commons license. You can find these materials on my website at https://matthewsag.com/eroc/.

These Extended Readings on Copyright can be used as a textbook or as individual modules to supplement a textbook. Unlike a regular textbook, I don’t pretend that every important issue in copyright is addressed in these materials. The modules are arranged in the order that I teach them, but most can be used in any sequence. You should consider the current offering as a Beta version. I will post additional modules and revise the existing ones on a continual basis; major revisions will be noted on this website.

Chicago Marathon 2019

I am running the 2019 Chicago Marathon in honor of my sister Rebecca, who died of pancreatic cancer late last year. Rebecca’s death is incredibly sad, but her life is worth celebrating. Something else worth celebrating is that cancer deaths have declined 20 percent in the US since the early 1990s.

We have a long way to go but we are actually making progress.

The American Cancer Society supports research, treatment, prevention, and education efforts. Please help me to help them by donating to the ACS and together, we can finish the fight against cancer!

You can support my running of the 2019 Chicago Marathon for the American Cancer Society at this address: (http://main.acsevents.org/goto/MattSag) or by mail.

IPSC 2018 Slides from my talk on Legal Infrastructure for Text Data Mining

I presented my working paper on the legal infrastructure for text data mining at IPSC yesterday and I promised to post my slides. Here they are: Public, Matthew Sag, Legal Infrastructure for TDM (IPSC August 2018). I won’t be posting a draft online for a while because I want to get more feedback from people actually working in this area. But if you would like an advance draft, please email me.

What is TDM?

Neglect, but only of my website

I have not posted here in a long time, but I am still alive. Partly I have been busy with some long-term projects and some things that don’t fit the copyright and tech focus of this website. My work on the copyright implications of text data mining has led to a series of projects actually doing text data mining. This has been fun and has led to new insights about the copyright issues that have dominated a lot of my work for the last decade.

Check out my new website devoted to empirical analysis of Supreme Court oral arguments: ScotusOA.com.

The DMCA Safe Harbors With Brief Annotations of Important Cases

I made an annotated version of Section 512 of the Copyright Act — the DMCA Internet Safe Harbors — for my Copyright Law class and I thought that others might find it useful. My thanks to Annemarie Bridy (University of Idaho College of Law) for her helpful suggestions and additions.

Please note that this document is an aid to understanding the DMCA safe harbors. It is not comprehensive, nor is it guaranteed to be free from error. Draft date: April 26, 2017.

Download link: The DMCA Safe Harbors With Brief Annotations of Important Cases