A response to Lee and Grimmelmann

TIM LEE (@binarybits) and JAMES GRIMMELMANN have written an insightful article on “Why The New York Times might win its copyright lawsuit against OpenAI” in Ars Technica and on Tim’s newsletter (https://www.understandingai.org/p/the-ai-community-needs-to-take-copyright).

Quite a few people emailed me asking for my thoughts, so here they are. This is a rough first take that began as a tweet before I realized it was too long.

Yes, we should take the NYT suit seriously

It’s hard to disagree with the bottom line that copyright poses a significant challenge to copy-reliant AI, just as it has done to previous generations of copy-reliant technologies (reverse engineering, plagiarism detection, search engine indexing, text data mining for statistical analysis of literature, text data mining for book search).

One important insight offered by Tim and James is that building a useful technology that is consistent with some people’s rough sense of fairness, like MP3.com, is no guarantee of fair use. People loved Napster and probably would have loved MP3.com, but these services were essentially jukeboxes competing with record companies’ own distribution models for the exact same content. We could add ReDigi to this list, too. Unlike the copy-reliant technologies listed above, Napster, MP3.com, and ReDigi fell foul of copyright law because they made expressive uses of other people’s expressive works.

Tim and James make another important point, that academic researchers and Silicon Valley types might have got the wrong idea about copyright. Certainly, prior to November 2022 you almost never saw any mention of copyright in papers announcing new breakthroughs in text data mining, machine learning, or generative AI. This is why I wrote “Copyright Safety for Generative AI” (Houston Law Review 2023).

Tim and James’ third insight is that some conduct might be fair use on a small noncommercial scale but not fair use on a large commercial scale. This is right sometimes, but in fact, a lot of fair use scales up quite nicely. 2 Live Crew sold millions of copies of their fair use parody of Roy Orbison’s Pretty Woman, and of course, some of the key non-expressive use precedents were all about different versions of text data mining at scale: iParadigms (commercial plagiarism detection), HathiTrust (text mining for statistical analysis of the literature, including machine learning), Google Books (commercial book search).

But how seriously?

I agree with Tim and James that the AI companies’ best fair use arguments will be some version of the non-expressive use argument I outlined in Copyright and Copy-Reliant Technology (2009) and several other papers since, such as The New Legal Landscape for Text Mining and Machine Learning (2019).

In a nutshell, that argument is that a technical process that creates some effectively invisible copies along the way but ultimately produces only uncopyrightable facts, abstractions, associations, and styles should be fair use because it does not interfere with the author’s right to communicate her original expression to the public.

I also agree that this argument begins to unravel if generative AI models are in fact memorizing and delivering the underlying original expression from the training data. I don’t think we know enough about the facts to say whether individual examples of memorization are just an obscure bug or an endemic problem.

The NYT v. OpenAI litigation will shed some light on this but there is a lot of discovery still to come. My gut feeling is that the NYT’s superficially compelling examples of memorization are actually examples of GPT-4 working as an agent to retrieve information from the Internet. This is still a copyright problem, but it’s a very small, easily fixed, copyright problem, not an existential threat to text data mining research, machine learning, and generative AI.

If the GPT series models are really memorizing and regurgitating vast swaths of NYT content, that is a problem for OpenAI. If pervasive memorization is unavoidable in LLMs, that would be a problem for the entire generative AI industry, but I very much doubt the premise. Avoiding memorization (or reducing it to trivial levels) is a hard technical problem in LLMs, but not an impossible one.

Avoiding memorization in image models is more difficult because of the “Snoopy Problem.” Tim and James call this the “Italian plumber problem,” but I named it first and I like Snoopy better.

The Snoopy Problem is that the more abstractly a copyrighted work is protected, the more likely it is that a generative AI model will “copy” it. Text-to-image models are prone to produce potentially infringing works when the same text descriptions are paired with relatively simple images that vary only slightly. 

Generative AI models are especially likely to generate images that would infringe on copyrightable characters because characters like Snoopy appear often enough in the training data that the models learn the consistent traits and attributes associated with those names. Deduplication won’t solve this problem because the output can still infringe without closely resembling any particular image from the training data. Some people think this is really a problem with copyright being too loose with characters and morphing into trademark law. Maybe, but I don’t see that changing.

How serious is the Snoopy Problem? Tim and James frame the problem as though they innocently requested a combination of [Nationality] + [Occupation] + “from a video game” and just happened to stumble upon repeated images of the world’s most famous Italian plumber, Mario from Mario Kart.

But of course, a random assortment of “Japanese software developers” “German fashion designers” “Australian novelists” “Kenyan cyclists” “Turkish archaeologists” and a “New Zealand plumber” don’t reveal any such problem. The problem is specific to Mario because he dominates representations of Italian plumbers from video games in the training data.

The Snoopy Problem presents a genuine difficulty for video, image, and multimodal generative AI, but it’s far from an existential threat. That is partly because the class of potential plaintiffs is significantly smaller: there are a lot fewer owners of visual copyrightable characters than there are just plain old copyright owners. And it is partly because the problem can be addressed in training, by monitoring prompts, or by filtering outputs.

Tim and James’s final point of concern is that the prospect of licensing markets for training data will undermine the case for fair use. Companies building AI models rely on the fact that they are simply scraping training data from the “open Internet”; that argument becomes more persuasive when these companies are more careful to avoid scraping content from sites where they are not welcome.

Respecting existing robots.txt signals and helping to develop more effective ones in the future will facilitate robust licensing markets for entities like the New York Times and the Associated Press.
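For readers who have not seen how a robots.txt signal is actually honored, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the site `news.example.com`, and the crawler names `ExampleAIBot` and `SomeSearchBot` are all made up for illustration; this is not a description of how any particular AI company's crawler works.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary news site. "ExampleAIBot" is a
# made-up crawler name used only for illustration.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""

def build_parser(robots_txt):
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

def may_scrape(parser, user_agent, url):
    """Return True if the parsed rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)

parser = build_parser(ROBOTS_TXT)
story = "https://news.example.com/2024/story.html"
print(may_scrape(parser, "ExampleAIBot", story))   # the AI crawler is asked to stay out
print(may_scrape(parser, "SomeSearchBot", story))  # other crawlers may fetch public pages
```

The point of the sketch is only that the signal is cheap to check before scraping: a crawler that runs a test like this and skips disallowed pages is avoiding sites where it is not welcome.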

I don’t think that OpenAI will need to sign 100 million licensing deals before training its next model. Courts have already considered and rejected the circular argument that copyright owners must be given the right to charge for non-expressive uses to avoid the harm of not being able to charge for non-expressive uses. This specific argument was raised by the Authors Guild in HathiTrust and Google Books and squarely rejected in both.

Tim and James end their note of caution with a note of realism: judges will be reluctant to shut down an innovative and useful service with tens of millions of users. We saw a similar dynamic when the US Supreme Court held that time-shifting using videocassette recorders was fair use.

But there is another element of realism to add. If the US courts reject the idea that non-expressive uses should be fair use, most AI companies will simply move their scraping and training operations overseas to places like Japan, Israel, Singapore, and even the European Union. As long as the models don’t memorize the training data, they can then be hosted in the US without fear of copyright liability.

Tim and James are two of the smartest, most insightful people writing about copyright and AI at the moment. The AI community should take them seriously, they should take copyright seriously, but they should not see Snoopy (or the Italian Plumber) as an existential threat.

PS: Updated to correct typos helpfully identified by ChatGPT.

Third Annual Legal Scholars Roundtable on Artificial Intelligence 2024

Call For Papers

Roundtable

Emory Law is proud to host the third annual Legal Scholars Roundtable on Artificial Intelligence. The Roundtable will take place on April 11-12, 2024 at Emory University in Atlanta, Georgia. The Legal Scholars Roundtable on Artificial Intelligence (AI) is designed to be a forum for the discussion of current legal scholarship on AI, covering a range of methodologies, topics, perspectives, and legal intersections.

Format  
Participation at the Roundtable will be limited and invitation-only. Participants are expected to read all the papers in advance and be prepared to offer substantive comments. We will try to accommodate a limited number of Zoom-based participants, but in-person attendance is strongly preferred.

Applications to present, comment, or participate
We invite applications to participate, to comment, and/or to present from academics working on any topic relating to legal issues in AI. To request to present, you need to submit a substantially complete draft paper. Microsoft Word format is strongly preferred for these purposes, but you can submit a PDF version for broader distribution. The deadline for submission is February 23, 2024, and decisions on participation will be made shortly thereafter, ideally, by March 4, 2024. If selected, final manuscripts are due April 1, 2024, to permit all participants an opportunity to read the papers prior to the conference.

To apply to participate, comment, or present, please fill out the Google form (https://forms.gle/Ubv2maLWfMK5tbPs8).

What to expect from the Legal Scholars Roundtable on Artificial Intelligence
The Legal Scholars Roundtable on Artificial Intelligence is a forum for the discussion of current legal scholarship on AI, spanning a range of methodologies, topics, perspectives, and legal intersections. Authors who present at the Roundtable will be selected from a competitive application process, and commentators are assigned based on their expertise. Participants will have an opportunity to provide direct feedback in paper sessions and will have access to draft papers but will be asked not to post papers publicly or share without author permission. Robust sessions involve energetic feedback from other paper authors, commentators, and participants. Our goal is to ensure all authors have the full participation of all workshop participants in each author’s session.

Essential logistics
The Roundtable will be held in person on the Emory campus in Atlanta, Georgia. The conference will begin on Thursday morning and run until 1PM on Friday. You can expect to be at the Atlanta airport by 1:45 PM, in time for a 2:30 PM flight or later on Friday. We will pay for your reasonable (economy) travel and accommodation expenses within the U.S. At the roundtable you will be well fed and caffeinated.

Organizers
Matthew Sag, Professor of Law in Artificial Intelligence, Machine Learning, and Data Science at Emory University Law School (msag@emory.edu)
Charlotte Tschider, Associate Professor at Loyola Law Chicago (ctschider@luc.edu)

Emory Law’s Commitment to AI
Emory University recognizes that artificial intelligence (AI) is a transformative technology that is already reshaping almost every aspect of our lives. Through its AI.Humanity initiative, Emory is building capacity in key areas of AI research and policy, including health care, medical research, business, law, and the humanities.

My testimony to the US Senate Judiciary Subcommittee on IP re: Copyright and AI

I had the great honor of testifying to the US Senate Judiciary Subcommittee on Intellectual Property in relation to Artificial Intelligence and Copyright on Wednesday, July 12th, 2023.

Video and my written submission are available here: https://www.judiciary.senate.gov/artificial-intelligence-and-intellectual-property_part-ii-copyright and I have also linked to my written statement here in case that other link is unavailable.

In my testimony I explained that although we are still a long way from the science fiction version of artificial general intelligence that thinks, feels, and refuses to “open the pod bay doors”, recent advances in machine learning AI raise significant issues for copyright law.

I explained why copyright law does not, and should not, recognize computer systems as authors and why training generative AI on copyrighted works is usually fair use because it falls into the category of non-expressive use.

For more on copyright and generative AI, read Matthew Sag, Copyright Safety for Generative AI (Houston Law Review, Forthcoming) (https://ssrn.com/abstract=4438593)

Law School Academic Impact Rankings, with FLAIR

Cross-posted with Prawfsblog

I am pleased to announce the release of the Forward-Looking Academic Impact Rankings (FLAIR) for US law schools for 2023. I began this project two years ago because of my intense frustration that my law faculty (Loyola Chicago, at the time) had yet again been left out of the Sisk Rankings. The project has evolved and matured since then, and the design of the FLAIR rankings owes a great deal to debates that I have had with Prof. Gregory Sisk, partly in public, but mostly in private.

You can download the full draft paper from SSRN or wait for it to come out in the Florida State University Law Review.

How do the FLAIR rankings work?

I combined individual five-year citation data from HeinOnline with faculty lists scraped directly from almost 200 law school websites to calculate the mean and median five-year citation numbers for every ABA accredited law school. Yes, that was a lot of work. Based on faculty websites, hiring announcements, and other data sources, I excluded assistant professors and faculty who began their tenure-track career in 2017 or later. I also limited the focus to what is traditionally considered to be the “doctrinal” faculty. The paper provides more details and the rationales for both of these decisions.
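To make the aggregation step concrete, here is a rough sketch of how per-person citation counts roll up into a mean and a median for each school. The school names, faculty names, and numbers are invented, and the sketch assumes the exclusions described above have already been applied to the input; it is an illustration of the arithmetic, not the code used for the paper.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records: (school, faculty member, five-year citation count).
# These values are invented for illustration only.
records = [
    ("School A", "Prof. 1", 120),
    ("School A", "Prof. 2", 40),
    ("School A", "Prof. 3", 20),
    ("School B", "Prof. 4", 10),
    ("School B", "Prof. 5", 30),
]

def summarize_by_school(rows):
    """Return {school: (mean, median)} of five-year citation counts."""
    by_school = defaultdict(list)
    for school, _name, citations in rows:
        by_school[school].append(citations)
    return {s: (mean(c), median(c)) for s, c in by_school.items()}

summary = summarize_by_school(records)
for school, (avg, mid) in summary.items():
    print(school, avg, mid)
```

Note how one highly cited faculty member pulls School A's mean well above its median, which is one reason to look at both numbers.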

How do the FLAIR rankings compare to other law school rankings?

Among their many flaws, the U.S. News law school rankings rely on poorly designed, highly subjective surveys to gauge “reputational strength,” rather than looking to easily available, objective citation data that is more valid and reliable. Would-be usurpers of U.S. News use better data but make other arbitrary choices that limit and distort their rankings. One flaw common to U.S. News and those who would displace it is the fetishization of minor differences in placement that do not reflect actual differences in substance. In my view, this information is worse than trivial: it is actively misleading.

The FLAIR rankings use objective citation data that is more valid and reliable than the U.S. News surveys, and unlike the Sisk rankings, FLAIR gives every ABA accredited law school a chance to have the work of its faculty considered. Obviously, it is much fairer to assess every school rather than arbitrarily excluding some based on an intuition (a demonstrably faulty intuition at that) that particular schools have no chance of ranking in the top X%. Well, it’s obvious to me at least. But perhaps more importantly, looking at all the data gives us a valid context to assess individual data points. The FLAIR rankings are designed to convey relevant distinctions without placing undue emphasis on minor differences in rank that are substantively unimportant. This goes against the horserace mentality that drives so much interest in U.S. News, but I’m not here to sell anything.

What are the relevant distinctions?

The FLAIR rankings assign law faculties to four separate tiers based on how their mean and median five-year citation counts compare to the standard deviation of the means and medians of all faculties. Tier 1 is made up of those faculties that are more than one standard deviation above the mean, Tier 2 is between zero and one standard deviations above the mean, Tier 3 ranges from the mean to half a standard deviation below, and Tier 4 includes all of the schools more than half a standard deviation below the mean. In other words, Tier 1 schools are exceptional, Tier 2 schools are above average, Tier 3 are below average, and Tier 4 are well-below average.
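The tier cutoffs can be sketched in a few lines of code. This is a simplification under stated assumptions: it uses a single hypothetical score per school (the paper explains how the mean and median are actually combined), it uses the population standard deviation, and it places schools sitting exactly on a boundary in the lower tier. The school labels and scores are invented.

```python
from statistics import mean, pstdev

def assign_tier(score, overall_mean, overall_sd):
    """Map a school's score to a tier using the cutoffs described in the text."""
    z = (score - overall_mean) / overall_sd  # standard deviations from the mean
    if z > 1:
        return 1   # more than one SD above the mean
    if z > 0:
        return 2   # between the mean and one SD above
    if z >= -0.5:
        return 3   # from the mean down to half an SD below
    return 4       # more than half an SD below the mean

# Hypothetical per-school scores, invented for illustration only.
scores = {"A": 100, "B": 60, "C": 45, "D": 20, "E": 25}

overall_mean = mean(scores.values())
overall_sd = pstdev(scores.values())  # assumption: population SD, not sample SD
tiers = {s: assign_tier(v, overall_mean, overall_sd) for s, v in scores.items()}
print(tiers)
```

With these made-up numbers the mean is 50 and the standard deviation is a little under 29, so school A lands in Tier 1, B in Tier 2, C in Tier 3, and D and E in Tier 4.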

The figure below shows a boxplot of the distribution of citation counts for each tier. (There is a more complete explanation in the paper, but essentially, the middle of the boxplot is the median, the box around the median is the middle 50%, and the “whiskers” at either end are the lowest/highest 25%.) The boxplot figure below illustrates the substantial differences between the tiers, but it also underscores that there is nonetheless considerable overlap between tiers.
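For anyone who wants to see the numbers behind a boxplot, the following sketch computes the five values such a plot draws: the minimum, the first quartile, the median, the third quartile, and the maximum. It follows the description in the text, with whiskers running to the lowest and highest values; many plotting libraries instead cap whiskers at 1.5 times the interquartile range by default. The input values are invented, and the "inclusive" quartile method is an assumption, since the quartile convention used for the figures is not stated here.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return the five values a simple boxplot is drawn from."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles; other conventions give slightly
    # different cut points on small samples.
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": med, "q3": q3, "max": data[-1]}

# Hypothetical citation counts for one faculty, invented for illustration.
box = five_number_summary([5, 1, 4, 2, 3])
print(box)
```

The box spans `q1` to `q3` (the middle 50%), and each whisker covers the remaining quarter of the data on either side.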

The FLAIR rankings

The next figure focuses on Tier 1. The FLAIR rank for each school is indicated in parentheses. The boxplot next to each school’s name indicates the distribution of citations for each doctrinal faculty member within that school.

Readers who pay close attention to the U.S. News rankings will note that the top tier consists of 23 schools, not the much vaunted “T14”. The T14 is a meaningless category; it does not reflect any current empirical reality or any substantial differences between the 14th and 15th rank. Attentive readers will also note that several schools well outside of the (hopefully now discredited concept of the) T14—namely U.C. Irvine, U.C. Davis, Emory, William & Mary, and George Washington—are in the top tier. These schools’ academic impact outpaces their overall U.S. News rankings significantly. U.C. Davis outperforms its U.S. News ranking by 42 places!

Looking at the top tier of the FLAIR rankings as visualized in the figure above also illustrates how misleading ordinal differences in ranking can be. There is very little difference between Virginia, Vanderbilt, and the University of Pennsylvania in terms of academic impact. The medians and the general distribution of each of these faculties are quite similar. And thus we can conclude that differences between ranks 6 and 8 are unimportant and that it is not news if Virginia “drops” to 8th or Pennsylvania rises to 6th in the FLAIR rankings, or indeed in the U.S. News rankings.

The differences that matter, and those that don’t

In the Olympics, third place is a bronze medal, and fourth place is nothing; but there are no medals in the legal academy and there is no difference in academic impact between third and fourth that is worth talking about. Minor differences in placement rarely correspond to differences in substance. Accordingly, rather than emphasizing largely irrelevant ordinal comparisons between schools only a few places apart, what we should really focus on is which tier in the rankings a school belongs to. Moreover, even when a difference in ranking suggests that there is a genuine difference in the overall academic impact of one faculty versus another, those aggregate differences say very little about the academic impact of individual faculty members. There is a lot of variation within faculties!

Objections to quantification

Many readers will object to any attempt to quantify academic impact, or to the use of data from HeinOnline specifically. Some of these objections make sense in relation to assessing individuals, but I don’t think that any of them retain much force when applied to assessing faculties as a whole. If we are really interested in the impact of individual scholars, we need to assess a broad range of objective evidence in context; that context comes from reading their work and understanding the field as a whole. In contrast, no one could be expected to read the works of an entire faculty to get a sense of its academic influence. Indeed, citation counts, or other similarly reductive measures, are the only feasible way to make between-faculty comparisons with any degree of rigor. What is more, aggregating the data at the faculty level reduces the impact of individual distortions, much like a mutual fund reduces the volatility associated with individual stocks.

One thing I should be very clear about is that academic impact is not the same thing as quality or merit. This is important because, although I think that the data can be an important tool for overcoming bias, I also need to acknowledge that citation counts will reflect the structural inequalities that pervade the legal academy. A glance at the most common first names among law school doctrinal faculty in the United States is illustrative. In order of frequency, the 15 most common first names are Michael, David, John, Robert, Richard, James, Mark, Daniel, William, Stephen, Paul, Christopher, Thomas, Andrew, and Susan. It should be immediately apparent that this group is more male and probably a lot whiter than a random sample of the U.S. population would predict. As I said, citation counts are a measure of impact, not merit. This is not a problem with citation counts as such; qualitative assessments and reputational surveys suffer the same problem. There is no objective way to assess what the academic impact of individuals or faculties would be in an alternative universe free from racism, sexism, and ableism. A better system of ranking the academic impact of law faculties will more accurately reflect the world we live in; that increased accuracy might help make the world better at the margins, but it won’t do much to fix underlying structural inequalities.

Corrections and updates

Several schools took the opportunity to email me with corrections or updates to their faculty lists in the past three months. If I receive other corrections that might meaningfully change the rankings, I will post a revised version.

Second Annual Legal Scholars Roundtable on Artificial Intelligence 2023

Call For Papers

Emory Law is proud to host the second annual Legal Scholars Roundtable on Artificial Intelligence. The Roundtable will take place on March 30-31, 2023 at Emory University in Atlanta, Georgia.  

The Legal Scholars Roundtable on Artificial Intelligence (AI) is designed to be a forum for the discussion of current legal scholarship on AI, covering a range of methodologies, topics, perspectives, and legal intersections.  

Format 

Between eight and ten papers will be chosen for discussion at the Roundtable, with each paper allocated about an hour in total. Each paper will be introduced briefly by a designated commentator (5-10 minutes), with authors allowed an even briefer chance to respond (0-4 minutes), before general discussion and feedback from participants.

Participation at the Roundtable will be limited and invitation-only. Participants are expected to read all the papers in advance and be prepared to offer substantive comments.  

Topics 

We invite applications to participate, to comment, and/or to present from academics working on any topic relating to legal issues in AI.  

Applications to present, comment, or participate 

Submissions to present can either be in the form of a long abstract or a draft paper; the latter is preferred. Microsoft Word format is preferred.

The deadline for submission is February 10, 2023, and decisions on participation will be made shortly thereafter, ideally, by February 17, 2023. If selected, full papers are due March 1, 2023, to permit all participants an opportunity to read the papers prior to the conference. Final submitted papers must be in substantially complete form.

If you would like to make an early submission and request an early decision (because you need to plan for the semester), please do so.  

To apply to participate, comment, or present, please fill out the Google form (https://forms.gle/7d71U5XUzp57pC7M8).

What to expect from the Legal Scholars Roundtable on Artificial Intelligence  

The Legal Scholars Roundtable on Artificial Intelligence is a forum for the discussion of current legal scholarship on AI, spanning a range of methodologies, topics, perspectives, and legal intersections. Authors who present at the Roundtable will be selected from a competitive application process, and commentators are assigned based on their expertise.  

Participants will have an opportunity to provide direct feedback in paper sessions and will have access to draft papers but will be asked not to post papers publicly or share without author permission. Robust sessions involve energetic feedback from other paper authors, commentators, and participants. Our goal is to ensure all authors have the full participation of all workshop participants in each author’s session.

Essential logistics 

The Roundtable will be held in person on the Emory campus in Atlanta, Georgia. The conference will begin on Thursday morning and run until 1PM on Friday. You can expect to be at the Atlanta airport by 1:30PM, in time for a 2:10PM flight or later on Friday.   

Organizers 

Matthew Sag, Professor of Law in Artificial Intelligence, Machine Learning, and Data Science at Emory University Law School (msag@emory.edu) 

Charlotte Tschider, Assistant Professor at Loyola Law Chicago (guest co-convenor)  

Emory Law’s Commitment to AI 

Emory University recognizes that artificial intelligence (AI) is a transformative technology that is already reshaping almost every aspect of our lives. Through its AI.Humanity initiative, Emory is building capacity in key areas of AI research and policy, including health care, medical research, business, law, and the humanities.  

Emory Law is aggressively recruiting experts in law and AI who will impact policy and regulatory debates, advise researchers on pathways for ethical and legal AI development, and train the next generation of lawyers.  

Emory Law has long had deep expertise in IP with patent law experts Prof. Margo Bagley and Prof. Tim Holbrook, and in Law & Technology generally thanks to Professor of Practice Nicole Morris, a recognized leader at the intersection of innovation, entrepreneurship and intellectual property. Professor Matthew Sag joined Emory Law in July 2022 as the school’s first hire under the AI.Humanity initiative. Sag is an internationally recognized expert on copyright law and empirical legal studies. He is particularly well known for his pathbreaking work on the legality of using copyrighted works as inputs in machine learning processes, a vital issue in AI. Emory Law’s second AI.Humanity hire, Associate Professor Ifeoma Ajunwa, will join Emory Law in the 2023 academic year. Ajunwa’s research interests are at the intersection of law and technology with a particular focus on the ethical governance of workplace technologies. Ajunwa’s forthcoming book, “The Quantified Worker,” examines the role of technology in the workplace and its effects on management practices as moderated by employment law. Emory Law expects to hire two additional AI researchers this year who will add to our expertise in the legal and policy implications of algorithmic decision-making and in data privacy law.

As part of its commitment to leadership in the field of law and AI, Emory Law is now the permanent home of the Legal Scholars Roundtable on Artificial Intelligence, convened by Prof. Matthew Sag. 

Lessons for Empirical Studies of Copyright Litigation … A Case Study of Copyright Injunctions

This morning I presented Lessons for Empirical Studies of Copyright Litigation … A Case Study of Copyright Injunctions at CREATe@10 – Copyright Evidence: Synthesis and Futures, University of Glasgow, October 17, 2022.

For those who missed the slides, here they are!

The presentation is based on Matthew Sag and Pamela Samuelson, Discovering eBay’s Impact on Copyright Injunctions Through Empirical Evidence, forthcoming in the William & Mary Law Review 2023 (https://ssrn.com/abstract=3898460).

I have moved to Emory University School of Law

Posts on this website are infrequent these days. But I thought it was worth mentioning that I have moved to Atlanta to take a position on the amazing Emory Law faculty. I was hired as a Professor of Law in Artificial Intelligence, Machine Learning, and Data Science as part of Emory’s bold new AI.Humanity initiative.

You can read the Emory announcement here: https://law.emory.edu/news-and-events/releases/2022/04/sag_joins_emory_law.html

Legal Scholars Roundtable on Artificial Intelligence: Call for Papers

Loyola University of Chicago is proud to present the first annual Legal Scholars Roundtable on Artificial Intelligence. The Legal Scholars Roundtable on Artificial Intelligence will take place online on March 18, 2022.

The Legal Scholars Roundtable on Artificial Intelligence is designed to be a forum for the discussion of current legal scholarship on AI, covering a range of methodologies, topics, perspectives, and legal intersections.

Between four and eight papers will be chosen for discussion for this inaugural roundtable, with each paper allocated up to an hour for discussion. Each paper will be introduced briefly by a designated commentator (3-8 minutes), with authors allowed an even briefer chance to respond (0-4 minutes), before general discussion.  

The Roundtable will be held on Friday, March 18, 2022, from 10:00 AM (a start time chosen to accommodate participants on the West Coast) until 5:00 PM Central Time.

Participation at the Roundtable will be limited and invitation-only and participants are expected to have read the papers of other participants in advance and be prepared to offer substantive comments.

We invite applications to participate, to comment, and/or to present from academics working on any topic relating to legal issues in artificial intelligence including:

    • Competition/Antitrust
    • Consumer protection/regulatory law
    • Contract law
    • Corporations law
    • Criminal justice
    • Cybersecurity
    • Data privacy
    • Discrimination
    • Health law
    • Intellectual property
    • Tort law

To present, submissions must be substantially complete drafts in Microsoft Word format. The deadline for submission is Friday, February 11, 2022, and decisions on participation will be made shortly thereafter, ideally, by February 18, 2022.

We anticipate this Legal Scholars Roundtable on Artificial Intelligence will bring together a diverse intellectual community, and we plan to sustain that community with a series of in-person and online conferences in the coming years. We invite you to be part of this inaugural event!

To apply to participate, comment, or present, please fill out the Google form (https://forms.gle/yhXANrTAWHcJciHk9). Those wishing to present should also email their papers to msag@luc.edu. A subject line of “Legal Scholars AI 2022” would be helpful.

The Roundtable will be convened by Loyola Chicago Professors Matthew Sag and Charlotte Tschider. Matthew is a leading expert on the copyright implications of text data mining in the machine learning and AI context. Charlotte’s scholarship focuses on the implications of information privacy, cybersecurity, and artificial intelligence for the global health care industry. For further information about the Roundtable, please email either Matthew Sag (msag@luc.edu) or Charlotte Tschider (ctschider@luc.edu).

#LegalScholarsAI

#LegalScholarsAI2022

So, you got a copyright infringement demand letter from Higbee & Associates?

Some context

In 2018 Jake Haskell and I published an article called “Defense Against the Dark Arts of Copyright Trolling” in the Iowa Law Review. The article focused on BitTorrent related litigation that accounted for roughly half of all copyright cases filed in the United States at the time. As we described in the article, in the typical BitTorrent case,

“the plaintiff’s claims of infringement rely on a poorly substantiated form pleading and are targeted indiscriminately at non-infringers as well as infringers. This practice is a subset of the broader problem of opportunistic litigation, but it persists due to certain unique features of copyright law and the technical complexity of Internet technology. The plaintiffs bringing these cases target hundreds or thousands of defendants nationwide and seek quick settlements priced just low enough that it is less expensive for the defendant to pay rather than to defend the claim, regardless of the claim’s merits.”

Given my interest in this topic, I get a lot of emails and phone calls asking about another high-volume copyright plaintiff’s law firm, Higbee & Associates.

I am writing this post so that people have something to go on without waiting for a response from me (which can often take a while, sorry).

Is Higbee & Associates a copyright troll?

Some people call Higbee & Associates (or the clients they represent) copyright trolls. Certainly, they seem more interested in monetizing infringement than simply stopping it. After all, they could use DMCA takedowns in most of these cases and it would be just as effective.

Fair point, but even if they are looking primarily to the rewards of the courthouse rather than the marketplace, they would no doubt respond that litigation is required to make people understand that photography is not free for the taking. The performing rights organization, ASCAP, files a lot of lawsuits for exactly this reason.

So, in terms of motive, the copyright troll label might not be a great fit. What about methods?

Higbee & Associates are a little different to the copyright trolls Jake and I discussed in Defense Against the Dark Arts of Copyright Trolling. As far as I know, they don’t make a habit of going after obvious non-infringers, although they don’t seem to recognize many potential fair use arguments either. Also, they don’t appear to rely on dodgy technology or bogus experts to make their case, a feature that is endemic in the BitTorrent litigation.

However, Higbee does seem to send out a lot of letters of demand without much underlying depth. These letters often fail to provide a copyright registration. They often claim to represent a copyright owner who is not the author without evidencing any assignment of rights. You don’t need a registration to make a demand, but you absolutely need one to file a claim in federal court and to get statutory damages. So that seems a bit odd. Not connecting the dots between the person who took the photo and the client they say they represent is also a bit odd.

Moreover, the copyright troll label certainly fits with the sense of being ambushed that many defendants experience. I hear from a lot of these recipients. Receiving a letter from Higbee & Associates feels like an ambush because so many people don’t really understand how copyright works. It also feels like an ambush because the settlement amounts Higbee & Associates demand in a typical letter don’t seem to reflect the value of the underlying work.

Instead of demanding some multiple of the standard license fee for the work in question, Higbee will demand a settlement amount based on what they could get in court under copyright’s rather imprecise statutory damages rules. That makes their oft-noted failure to provide proof of registration even more interesting.

Assuming the work was registered at the relevant time, the prevailing plaintiff in copyright litigation can get statutory damages in the range of $750 to $150,000 per work infringed, regardless of the amount of actual damage. This is a pretty terrifying prospect for most accused infringers. But it gets worse. The real kicker is that if you fight the infringement accusation and lose, you risk just adding to your pain, because if they are the prevailing party, the plaintiff has a good chance of getting their attorney’s fees as well as statutory damages!

So, what to do?

Step one: figure out whether you have a good story to tell on the merits

You might have a case on the merits. Here are some examples:

  • you paid for a license to use the photo (or you thought you did);
  • you made fair use of the photo by using it as the foundation for commentary, parody or criticism (if you made changes to the photo that reinforce this transformative purpose, the merits of your fair use defense will be even clearer);
  • the party Higbee & Associates represents does not actually own the photo;
  • the photo was not registered with the U.S. Copyright Office before you started using it;
  • you didn’t post the photo, one of your users did. This gets complicated. You might be covered by the DMCA, but only if you jump through the right hoops, including registering an agent with the Copyright Office every three years. If you are not covered by the DMCA, you still might not be responsible for infringing acts by your users; it depends on a number of issues too detailed to summarize here.

Arguments on the merits that won’t help:

  • you didn’t post the photo, one of your employees did — sorry, you are responsible for your employees in a case like this.
  • you didn’t know the photo was copyrighted — this doesn’t help as much as you might think.
  • you thought that photos on the Internet were in the public domain — they aren’t.
  • you were not making a profit on your website — this doesn’t help as much as you might think.

Step two: ask for more information

Request a copy of the copyright registration, the deposit material that accompanied the application, and documents sufficient to show that Higbee is authorized by the copyright owner to act as its agent.

Explain that any settlement you agree to will have to contain a warranty that Higbee is the duly authorized agent of the copyright owner, that their client owns the copyright asserted, and that such copyright is valid. If they won’t do this, why not?

Step three: If you realize now that you might have been infringing the photographer’s copyright

  1. Take down the photo and audit the rest of the images on your website.
  2. If the work was unregistered, do what your conscience tells you is right. The reality is that it is not worthwhile for them to take this case to court unless they can show actual damages of more than a few hundred dollars.
  3. If the work was registered and they actually represent the copyright owner, make a reasonable settlement offer.
    • What’s a reasonable offer? Based on the cases I have seen, probably start at $1,000 and go up to $1,250, but your individual facts may vary.
  4. If the plaintiff won’t settle, don’t contest every point in the litigation. Instead, try to keep everyone’s costs as low as possible; make an “offer of judgment” and hope that you get a reasonable judge who can see that there is no virtue in awarding more than the $750 minimum in statutory damages. If you make this strategy clear to them, they should agree to a reasonable offer and move on to their next target.

Do you need a lawyer?

Probably, yes.

You could try to settle (or tell them to take a hike) by yourself, but without a lawyer representing you it’s hard to know how to respond to the arguments that Higbee is going to throw back.

If you need a referral to a lawyer with experience in these matters, I can try to provide one. I don’t handle these cases myself. You should also know that because I am not your lawyer, any emails you send me are not going to be protected by attorney-client privilege.

Good luck.