How the meandering legal definition of 'fair use' cost us Napster but gave us Spotify

The internet’s “enshittification,” as veteran journalist and privacy advocate Cory Doctorow describes it, began decades before TikTok made the scene. Elder millennials remember the good old days of Napster, followed by the much worse old days of Napster being sued into oblivion along with Grokster and the rest of the P2P sharing ecosystem, until we were left with a handful of label-approved, catalog-sterilized streaming platforms like Pandora and Spotify. Three cheers for corporate copyright litigation.

In his new book The Internet Con: How to Seize the Means of Computation, Doctorow examines the modern social media landscape, cataloging and illustrating the myriad failings and short-sighted business decisions of the Big Tech companies operating the services that promised us the future but just gave us more Nazis. We have both an obligation and responsibility to dismantle these systems, Doctorow argues, and a means to do so with greater interoperability. In this week’s Hitting the Books excerpt, Doctorow examines the aftermath of the lawsuits against P2P sharing services, as well as the role that the Digital Millennium Copyright Act’s “notice-and-takedown” reporting system and YouTube’s “ContentID” scheme play on modern streaming sites.

Seize the Means of Computation

The harms from notice-and-takedown itself don’t directly affect the big entertainment companies. But in 2007, the entertainment industry itself engineered a new, more potent form of notice-and-takedown that manages to inflict direct harm on Big Content, while amplifying the harms to the rest of us.

That new system is “notice-and-stay-down,” a successor to notice-and-takedown that monitors everything every user uploads or types and checks to see whether it is similar to something that has been flagged as a copyrighted work. This has long been a legal goal of the entertainment industry, and in 2019 it became a feature of EU law, but back in 2007, notice-and-stay-down made its debut as a voluntary modification to YouTube, called “Content ID.”

Some background: in 2007, Viacom (part of CBS) filed a billion-dollar copyright suit against YouTube, alleging that the company had encouraged its users to infringe on its programs by uploading them to YouTube. Google, which acquired YouTube in 2006, defended itself by invoking the principles behind Betamax and notice-and-takedown, arguing that it had lived up to its legal obligations and that Betamax established that “inducement” to copyright infringement didn’t create liability for tech companies (recall that Sony had advertised the VCR as a means of violating copyright law by recording Hollywood movies and watching them at your friends’ houses, and the Supreme Court decided it didn’t matter).

But with Grokster hanging over Google’s head, there was reason to believe that this defense might not fly. There was a real possibility that Viacom could sue YouTube out of existence. Indeed, profanity-laced internal communications from Viacom, which Google extracted through the legal discovery process, showed that Viacom execs had been hotly debating which one of them would add YouTube to their private empire when Google was forced to sell YouTube to the company.

Google squeaked out a victory, but was determined not to end up in a mess like the Viacom suit again. It created Content ID, an “audio fingerprinting” tool that was pitched as a way for rights holders to block, or monetize, the use of their copyrighted works by third parties. YouTube allowed large (at first) rightsholders to upload their catalogs to a blocklist, and then scanned all user uploads to check whether any of their audio matched a “claimed” clip.

Once Content ID determined that a user was attempting to post a copyrighted work without permission from its rightsholder, it consulted a database to determine the rights holder’s preference. Some rights holders blocked any uploads containing audio that matched theirs; others opted to take the ad revenue generated by that video.
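The lookup step described above can be pictured with a short Python sketch. Everything in it is hypothetical: the `Policy` enum, the `PREFERENCES` table, the label names and the `handle_match` function are illustrative stand-ins, not YouTube's actual data model or API.

```python
from enum import Enum


class Policy(Enum):
    """The two rights-holder preferences the text describes."""
    BLOCK = "block"
    MONETIZE = "monetize"


# Hypothetical preference database: rights holder -> chosen policy.
PREFERENCES = {
    "label_a": Policy.BLOCK,
    "label_b": Policy.MONETIZE,
}


def handle_match(rightsholder: str) -> dict:
    """Decide what happens to an upload that matched a claimed work."""
    policy = PREFERENCES[rightsholder]
    if policy is Policy.BLOCK:
        # The upload never goes live, so it earns nothing for anyone.
        return {"published": False, "ad_revenue_to": None}
    # The upload stays online, but its ad revenue goes to the claimant.
    return {"published": True, "ad_revenue_to": rightsholder}


blocked = handle_match("label_a")
monetized = handle_match("label_b")
```

Note that in neither branch does the uploader have a say: the outcome is fixed entirely by the claimant's stored preference.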

There are lots of problems with this. Notably, there’s the inability of Content ID to determine whether a third party’s use of someone else’s copyright constitutes “fair use.” As discussed, fair use is the suite of uses that are permitted even if the rightsholder objects, such as taking excerpts for critical or transformational purposes. Fair use is a “fact intensive” doctrine; that is, the answer to “Is this fair use?” is almost always “It depends, let’s ask a judge.”

Computers can’t sort fair use from infringement. There is no way they ever can. That means that filters block all kinds of legitimate creative work and other expressive speech, especially work that makes use of samples or quotations.

But it’s not just creative borrowing, remixing and transformation that filters struggle with. A lot of creative work is similar to other creative work. For example, a six-note phrase from Katy Perry’s 2013 song “Dark Horse” is effectively identical to a six-note phrase in “Joyful Noise,” a 2008 song by a much less well-known Christian rapper called Flame. Flame and Perry went several rounds in the courts, with Flame accusing Perry of violating his copyright. Perry eventually prevailed, which is good news for her.

But YouTube’s filters struggle to distinguish Perry’s six-note phrase from Flame’s (as do the executives at Warner Chappell, Perry’s publisher, who have periodically accused people who post snippets of Flame’s “Joyful Noise” of infringing on Perry’s “Dark Horse”). Even when the similarity isn’t as pronounced as in Dark, Joyful, Noisy Horse, filters routinely hallucinate copyright infringements where none exist, and this is by design.

To understand why, first we have to think about filters as a security measure: that is, as a measure taken by one group of people (platforms and rightsholder groups) who want to stop another group of people (uploaders) from doing something they want to do (upload infringing material).

It’s pretty trivial to write a filter that blocks exact matches: the labels could upload losslessly encoded pristine digital masters of everything in their catalog, and any user who uploaded a track that was digitally or acoustically identical to that master would be blocked.

But it would be easy for an uploader to get around a filter like this: they could just compress the audio ever-so-slightly, below the threshold of human perception, and this new file would no longer match. Or they could cut a hundredth of a second off the beginning or end of the track, or omit a single bar from the bridge, or any of a million other modifications that listeners are unlikely to notice or complain about.
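To see why an exact-match filter is so brittle, consider a minimal Python sketch. It assumes the filter compares a SHA-256 hash of the raw audio bytes, which is only one plausible way to implement exact matching; the function names and the stand-in "audio" bytes are invented for illustration.

```python
import hashlib


def fingerprint_exact(audio_bytes: bytes) -> str:
    """Hash the raw bytes; any change at all yields a different digest."""
    return hashlib.sha256(audio_bytes).hexdigest()


def build_blocklist(masters: list[bytes]) -> set[str]:
    """Fingerprint every master recording the labels have uploaded."""
    return {fingerprint_exact(m) for m in masters}


def is_blocked(upload: bytes, blocklist: set[str]) -> bool:
    """Block only uploads that are bit-for-bit identical to a master."""
    return fingerprint_exact(upload) in blocklist


# Stand-in for a pristine digital master (not real audio data).
master = bytes(range(256)) * 100
blocklist = build_blocklist([master])

# The uploader trims two bytes off the start: inaudible, but no longer a match.
trimmed = master[2:]

exact_copy_blocked = is_blocked(master, blocklist)
trimmed_copy_blocked = is_blocked(trimmed, blocklist)
```

Here `exact_copy_blocked` is true and `trimmed_copy_blocked` is false: the smallest possible edit is enough to slip past the filter.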

Filters don’t operate on exact matches: instead, they employ “fuzzy” matching. They don’t just block the things that rights holders have told them to block; they block stuff that’s similar to those things that rights holders have claimed. This fuzziness can be adjusted: the system can be made more or less strict about what it considers to be a match.
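A toy Python sketch can make the adjustable fuzziness concrete. It is not how Content ID works internally, which the text does not describe; it assumes tracks are reduced to sequences of note numbers and compared by the overlap (Jaccard similarity) of their three-note runs. The function names, note sequences and threshold values are all illustrative.

```python
def shingles(notes: list[int], n: int = 3) -> set[tuple[int, ...]]:
    """Break a note sequence into its overlapping n-note runs."""
    return {tuple(notes[i:i + n]) for i in range(len(notes) - n + 1)}


def similarity(a: list[int], b: list[int], n: int = 3) -> float:
    """Jaccard similarity of the two sets of n-note runs (0.0 to 1.0)."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def is_match(upload: list[int], claimed: list[int], threshold: float) -> bool:
    """The lower the threshold, the fuzzier (looser) the filter."""
    return similarity(upload, claimed) >= threshold


claimed = [60, 62, 64, 65, 67, 69, 71, 72]    # the claimed work
trimmed = claimed[1:]                         # same tune, first note cut
different = [60, 62, 64, 65, 67, 65, 64, 62]  # a different tune, same opening

# A strict filter misses the trimmed copy; a looser one catches it.
strict_catches_trimmed = is_match(trimmed, claimed, threshold=1.0)
loose_catches_trimmed = is_match(trimmed, claimed, threshold=0.8)

# Loosen it further and an unrelated tune gets flagged too.
loose_flags_different = is_match(different, claimed, threshold=0.8)
very_loose_flags_different = is_match(different, claimed, threshold=0.3)
```

At a threshold of 0.8 the filter catches the trimmed copy and leaves the different tune alone; at 0.3 it flags both. The threshold is the dial that trades missed copies for false positives.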

Rightsholder groups want the matches to be as loose as possible, because somewhere out there, there might be someone who’d be happy with a very fuzzy, truncated version of a song, and they want to stop that person from getting the song for free. The looser the matching, the more false positives. This is an especial problem for classical musicians: their performances of Bach, Beethoven and Mozart inevitably sound an awful lot like the recordings that Sony Music (the world’s largest classical music label) has claimed in Content ID. As a result, it has become nearly impossible to earn a living off of online classical performance: your videos are either blocked, or the ad revenue they generate is shunted to Sony. Even teaching classical music performance has become a minefield, as painstakingly produced, free online lessons are blocked by Content ID or, if the label is feeling generous, the lessons are left online but the ad revenue they earn is shunted to a giant corporation, stealing the creative wages of a music teacher.

Notice-and-takedown law didn’t give rights holders the internet they wanted. What kind of internet was that? Well, though entertainment giants said all they wanted was an internet free from copyright infringement, their actions, and the candid memos released in the Viacom case, make it clear that blocking infringement is a pretext for an internet where the entertainment companies get to decide who can make a new technology and how it will function.

Source: https://www.engadget.com/hitting-the-books-the-internet-con-cory-doctorow-verso-153018432.html?src=rss