Book Review: Nick Seaver, Computing Taste: Algorithms and the Makers of Music Recommendation

(University of Chicago Press, 2022)

In Computing Taste, Nick Seaver provides an ethnographic exploration of the world of music recommendation systems, revealing how the algorithms that drive recommendations are shaped by the judgment, creativity, and cultural assumptions of the people who design them. The data companies collect, the way they construct models, how they intuitively test whether those models are working, and how they define success are all deeply human and subjective choices.

Beyond Man vs. Machine

Seaver points out that textbook definitions describe algorithms as “well-defined computational procedures” that take inputs and generate outputs, portraying them as deterministic and straightforward systems. This narrow view leads to a man-versus-machine narrative that is trite and unilluminating. Treating algorithms as though their defining quality is the absence of human influence reinforces misconceptions about their neutrality. Instead, Seaver advocates for focusing on the sociotechnical arrangements that produce different forms of “humanness and machineness,” echoing observations by Donna Haraway and others.

In practice, algorithmic systems are messy, constantly evolving, and shaped by human judgment. As Seaver notes, “these ‘cultural’ details are technical details,” meaning that the motivations, preferences, and biases of the engineering teams that design algorithms are inseparable from the technical aspects of the systems themselves. Therefore, understanding algorithms requires acknowledging the social and cultural contexts in which they operate.

From Information Overload to Capture

Seaver shows how the objective of recommendation systems has shifted from the founding myth of information overload to the current obsession with capturing user attention. Pioneers of recommender systems told stories of information overload that presented growing consumer choice as a problem in need of a solution. The notion of overwhelming users with too much content has been a central justification for creating algorithms designed to filter and organize information. If users are helpless in the face of vast amounts of data, algorithms become necessary tools to help them navigate this digital landscape. Seaver argues that the framing of overload justifies the control algorithms exert over what users see, hear, and engage with. The idea of “too much music” or “too much content” becomes a convenient rationale for developing systems that, in practice, do more than assist—they guide, constrain, and shape user choices.

In any event, commercial imperatives soon led the information-overload rationale to give way to narratives of capture. Seaver compares recommender systems to traps designed to “hook” users, analyzing how metrics such as engagement and retention guide the development of algorithms. The Netflix Prize, a 2006 competition aimed at improving Netflix’s recommendation algorithm, serves as a key example of this shift. The competition was framed in terms of helping users manage information overload: the goal was to predict, from past ratings, what users would enjoy. Yet Netflix never used the winning entry. As streaming became central to Netflix’s business model, the focus of its recommendation systems shifted from helping users find content to keeping them engaged on the platform for as long as possible. This transition from personalization to attention retention marks the change in the industry’s goals. Recommender systems, including those at Netflix, began to encourage continuous engagement: suggesting binge-worthy content to maximize viewing hours, autoplaying the next episode or movie without user interaction, and optimizing for actual viewing behavior (“skip intro” clicks, time spent on a show, completion rates) rather than explicit ratings.
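The contrast between these two objectives can be made concrete with a toy sketch. The titles, numbers, and scoring function below are my own invention, not anything from the book or from any actual recommender; they simply show how the same candidate list ranks differently when the target switches from a predicted rating to predicted watch time.

```python
# Toy illustration of the shift described above: ranking the same candidates
# by a predicted star rating versus by predicted watch-time/completion signals.
# All items, numbers, and field names are invented for illustration.

candidates = [
    {"title": "Prestige Drama",    "predicted_rating": 4.6, "predicted_hours": 1.5, "completion_rate": 0.55},
    {"title": "Comfort Sitcom",    "predicted_rating": 3.8, "predicted_hours": 6.0, "completion_rate": 0.90},
    {"title": "True-Crime Series", "predicted_rating": 4.0, "predicted_hours": 4.5, "completion_rate": 0.80},
]

# "Information overload" era: surface what the user would rate most highly.
by_rating = sorted(candidates, key=lambda c: c["predicted_rating"], reverse=True)

# "Capture" era: surface what keeps the user watching longest.
def engagement_score(c):
    return c["predicted_hours"] * c["completion_rate"]

by_engagement = sorted(candidates, key=engagement_score, reverse=True)

print([c["title"] for c in by_rating])       # ['Prestige Drama', 'True-Crime Series', 'Comfort Sitcom']
print([c["title"] for c in by_engagement])   # ['Comfort Sitcom', 'True-Crime Series', 'Prestige Drama']
```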

Seaver’s perspective is insightful, not unrelentingly critical. The final chapter investigates how the design of recommendation systems reflects the metaphor of a “park”—a managed, curated space that users are guided through. Recommender systems are neither strictly benign nor malign, but they do entail a loss of user agency. We, the listening public, are not trapped animals so much as a managed flock. Seaver recognizes that recommendation systems open up new possibilities for exploration while also constraining user behavior by narrowing choices based on past preferences.

Why Do My Playlists Still Suck?

The book also answers the question that motivated me to read it: why do my playlists still suck? No one has a good model for why we like the music that we like, when we like it, or how that extrapolates to music we haven’t heard yet. And Spotify and other corporate interests have no real interest in solving that puzzle for us. The algorithms that shape our cultural lives now prioritize engagement, rely on past behavior, and reflect a grab bag of assumptions about user preferences that are often in conflict. There is very little upside to offering us fresh or risky suggestions when a loop of familiarity will keep us more reliably engaged.

A response to Lee and Grimmelmann

Tim Lee (@binarybits) and James Grimmelmann have written an insightful article, “Why The New York Times might win its copyright lawsuit against OpenAI,” in Ars Technica and on Tim’s newsletter (https://www.understandingai.org/p/the-ai-community-needs-to-take-copyright).

Quite a few people emailed me asking for my thoughts, so here they are. This is a rough first take that began as a tweet before I realized it was too long.

Yes, we should take the NYT suit seriously

It’s hard to disagree with the bottom line that copyright poses a significant challenge to copy-reliant AI, just as it has to previous generations of copy-reliant technologies (reverse engineering, plagiarism detection, search engine indexing, text data mining for statistical analysis of literature, text data mining for book search).

One important insight offered by Tim and James is that building a useful technology that is consistent with some people’s rough sense of fairness, like MP3.com, is no guarantee of fair use. People loved Napster and probably would have loved MP3.com, but these services were essentially jukeboxes competing with record companies’ own distribution models for the exact same content. We could add ReDigi to this list, too. Unlike the copy-reliant technologies listed above, Napster, MP3.com, and ReDigi fell foul of copyright law because they made expressive uses of other people’s expressive works.

Tim and James make another important point, that academic researchers and Silicon Valley types might have got the wrong idea about copyright. Certainly, prior to November 2022 you almost never saw any mention of copyright in papers announcing new breakthroughs in text data mining, machine learning, or generative AI. This is why I wrote “Copyright Safety for Generative AI” (Houston Law Review 2023).

Tim and James’ third insight is that some conduct might be fair use at a small noncommercial scale but not at a large commercial scale. This is sometimes right, but in fact a lot of fair use scales up quite nicely. 2 Live Crew sold millions of copies of their fair use parody of Roy Orbison’s “Oh, Pretty Woman,” and, of course, some of the key non-expressive use precedents were about different versions of text data mining at scale: iParadigms (commercial plagiarism detection), HathiTrust (text mining for statistical analysis of the literature, including machine learning), and Google Books (commercial book search).

But how seriously?

I agree with Tim and James that the AI companies’ best fair use arguments will be some version of the non-expressive use argument I outlined in Copyright and Copy-Reliant Technology (2009) and several other papers since, such as The New Legal Landscape for Text Mining and Machine Learning (2019).

In a nutshell, that argument is that a technical process that creates some effectively invisible copies along the way but ultimately produces only uncopyrightable facts, abstractions, associations, and styles should be fair use because it does not interfere with the author’s right to communicate her original expression to the public.

I also agree that this argument begins to unravel if generative AI models are in fact memorizing and delivering the underlying original expression from the training data. I don’t think we know enough about the facts to say whether individual examples of memorization are just an obscure bug or an endemic problem.

The NYT v. OpenAI litigation will shed some light on this, but there is a lot of discovery still to come. My gut feeling is that the NYT’s superficially compelling examples of memorization are actually examples of GPT-4 working as an agent to retrieve information from the Internet. This is still a copyright problem, but it’s a very small, easily fixed copyright problem, not an existential threat to text data mining research, machine learning, and generative AI.

If the GPT series models are really memorizing and regurgitating vast swaths of NYT content, that is a problem for OpenAI. If pervasive memorization is unavoidable in LLMs, that would be a problem for the entire generative AI industry, but I very much doubt the premise. Avoiding memorization (or reducing it to trivial levels) is a hard technical problem in LLMs, but not an impossible one.

Avoiding memorization in image models is more difficult because of the “Snoopy Problem.” Tim and James call this the “Italian plumber problem,” but I named it first and I like Snoopy better.

The Snoopy Problem is that the more abstractly a copyrighted work is protected, the more likely it is that a generative AI model will “copy” it. Text-to-image models are prone to produce potentially infringing works when the same text descriptions are paired with relatively simple images that vary only slightly. 

Generative AI models are especially likely to generate images that would infringe on copyrightable characters because characters like Snoopy appear often enough in the training data that the models learn the consistent traits and attributes associated with those names. Deduplication won’t solve this problem because the output can still infringe without closely resembling any particular image from the training data. Some people think this is really a problem with copyright being too loose with characters and morphing into trademark law. Maybe, but I don’t see that changing.

How serious is the Snoopy Problem? Tim and James frame the problem as though they innocently requested a combination of [Nationality] + [Occupation] + “from a video game” and just happened to stumble upon repeated images of the world’s most famous Italian plumber, Mario from Mario Kart.

But of course, a random assortment of “Japanese software developers,” “German fashion designers,” “Australian novelists,” “Kenyan cyclists,” “Turkish archaeologists,” and a “New Zealand plumber” doesn’t reveal any such problem. The problem is specific to Mario because he dominates representations of Italian plumbers from video games in the training data.

The Snoopy Problem presents a genuine difficulty for video, image, and multimodal generative AI, but it’s far from an existential threat. Partly this is because the class of potential plaintiffs is significantly smaller: there are a lot fewer owners of visual copyrightable characters than there are plain old copyright owners. And partly it is because the problem can be addressed in training, by monitoring prompts, or by filtering outputs.
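As a purely illustrative sketch of the prompt-monitoring idea (the blocklist and matching rule are invented for this example, not a description of any vendor’s actual safeguards), screening could be as simple as checking requests against a list of protected character names before generation:

```python
# Minimal, hypothetical illustration of prompt-side filtering for protected characters.
# The blocklist entries and handling policy are invented for this sketch; real systems
# would use far more robust matching and would also filter on the output side.

PROTECTED_CHARACTERS = {"snoopy", "mario", "mickey mouse"}  # illustrative entries only

def screen_prompt(prompt: str) -> bool:
    """Return True if the prompt can proceed, False if it should be refused or rerouted."""
    lowered = prompt.lower()
    return not any(name in lowered for name in PROTECTED_CHARACTERS)

print(screen_prompt("a beagle sleeping on a red doghouse"))    # True
print(screen_prompt("draw Snoopy sleeping on a red doghouse")) # False
```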

Tim and James’s final point of concern is that the prospect of licensing markets for training data will undermine the case for fair use. To the extent that companies building AI models rely on the argument that they are simply scraping training data from the “open Internet,” that argument becomes more persuasive when those companies take care to avoid scraping content from sites where they are not welcome.

Respecting existing robots.txt signals and helping to develop more effective ones in the future will facilitate robust licensing markets for entities like the New York Times and the Associated Press.
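For readers who want the mechanics, this is roughly what respecting an existing robots.txt signal looks like, sketched with Python’s standard-library parser; the crawler name and URLs below are placeholders, not any company’s actual crawler:

```python
# Minimal sketch of honoring a publisher's robots.txt before fetching a page.
# The user-agent string and URLs are placeholders for illustration only.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()  # fetch and parse the site's robots.txt

if rp.can_fetch("ExampleAICrawler", "https://www.example.com/articles/some-story.html"):
    print("Allowed: fetch the page for the training corpus")
else:
    print("Disallowed: skip this page")
```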

I don’t think that OpenAI will need to sign 100 million licensing deals before training its next model. Courts have already considered and rejected the circular argument that copyright owners must be given the right to charge for non-expressive uses to avoid the harm of not being able to charge for non-expressive uses. This specific argument was raised by the Authors Guild in HathiTrust and Google Books and was squarely rejected in both.

Tim and James temper their note of caution with a note of realism: judges will be reluctant to shut down an innovative and useful service with tens of millions of users. We saw a similar dynamic when the US Supreme Court held that time shifting with videocassette recorders was fair use.

But there is another element of realism to add. If the US courts reject the idea that non-expressive uses should be fair use, most AI companies will simply move their scraping and training operations overseas to places like Japan, Israel, Singapore, and even the European Union. As long as the models don’t memorize the training data, they can then be hosted in the US without fear of copyright liability.

Tim and James are two of the smartest, most insightful people writing about copyright and AI at the moment. The AI community should take them seriously, and it should take copyright seriously, but it should not see Snoopy (or the Italian Plumber) as an existential threat.

PS: Updated to correct typos helpfully identified by ChatGPT.