Should Copyright Preemption Moot Anti-Scraping TOS Terms? (Guest Blog Post)

by guest blogger Kieran McCarthy

Many characterize the law of copyright preemption of contracts as a circuit split.

But that undersells the level of inconsistency in courts’ interpretations of the law of copyright preemption. It’s not that half of federal judges have adopted one clear stance on copyright preemption of contracts and the other half have adopted another clear stance. It’s that every new case on copyright preemption of contracts hands lawyers a potential new set of arguments to make or rebut.

At the circuit court level, the law of copyright preemption of contracts is a circuit split-plus, with at least two and as many as four differentiating positions on what might constitute preemption.

At the district court level, the law of copyright preemption is a morass of ad hoc explanations of whether certain contracts are “equivalent” to the exclusive rights within the general scope of copyright law. Time and again, district court judges engage in abstract analyses of what constitutes equivalency, and the result of these ad hoc analyses is new precedent, far too often independent of prior history or context. The dicta in these cases are a legal and philosophical diaspora of opinions and comments capable of justifying almost every position on copyright preemption.

Section 106(1) of the Copyright Act says that “the owner of [a] copyright has the exclusive right[] to reproduce the copyrighted work.” Section 301(a) of the Copyright Act provides that “no person is entitled to any such right or equivalent right in any such work under the common law or statutes of any State.” With that, any state or common law claim that is equivalent to copyright must therefore be preempted.

There is a two-prong inquiry to determine when a state law claim is preempted: First, the work must be within the scope of the subject matter of copyright as specified in 17 U.S.C. §§ 102 and 103, and second, the rights granted under state or common law must be “equivalent” to any exclusive rights within the scope of federal copyright as set out in 17 U.S.C. § 106.

Of course, no state law claim is exactly the same as a copyright claim. And many state-law claims overlap to some degree with copyright claims. Which means that the question of when certain rights granted under a state- or common-law claim are “equivalent” to those in copyright requires a certain degree of philosophical analysis from our judges. And not surprisingly, when we require judges to engage in this type of philosophical analysis, it is not hard to find examples of courts that disagree with one another, and often to a startling degree.

Copyright preemption of contracts is a legal issue that’s been around for close to 50 years now, ever since Section 301 of the Copyright Act was enacted by Congress.

Legal questions about copyright preemption of online contracts have grown exponentially in importance in the last few years. All the major new AI and LLM platforms—including ChatGPT and StabilityAI—built their systems with data that was at least partially collected online. And that data might be subject to varying levels of copyright protection. And many of the sites where the data is collected also have prohibitions on automated data collection and web scraping in their terms of use.

Platforms that copy online data and use it to create AI have a strong fair use argument under copyright laws. But fair use isn’t a defense to a breach of contract claim. Which makes the question of copyright preemption of online contracts a vitally important one for any person or business that is looking to do anything important with online data.

One logical starting point to tell the history of copyright preemption of contracts is to begin with ProCD v. Zeidenberg, the 1955 Enchantment Under the Sea Dance of Internet legal opinions. In that case, Judge Easterbrook wrote, in finding that a “shrinkwrap” license was enforceable against the defendant:

But are rights created by contract “equivalent to any of the exclusive rights within the general scope of copyright”? Three courts of appeals have answered “no.” National Car Rental System, Inc. v. Computer Associates International, Inc., 991 F.2d 426, 433 (8th Cir.1993); Taquino v. Teledyne Monarch Rubber, 893 F.2d 1488, 1501 (5th Cir.1990); Acorn Structures, Inc. v. Swantz, 846 F.2d 923, 926 (4th Cir.1988). The district court disagreed with these decisions, 908 F.Supp. at 658, but we think them sound. Rights “equivalent to any of the exclusive rights within the general scope of copyright” are rights established by law — rights that restrict the options of persons who are strangers to the author. Copyright law forbids duplication, public performance, and so on, unless the person wishing to copy or perform the work gets permission; silence means a ban on copying. A copyright is a right against the world. Contracts, by contrast, generally affect only their parties; strangers may do as they please, so contracts do not create “exclusive rights.”

…[W]hether a particular license is generous or restrictive, a simple two-party contract is not “equivalent to any of the exclusive rights within the general scope of copyright” and therefore may be enforced.

ProCD, Inc. v. Zeidenberg, 86 F.3d 1447, 1454-55 (7th Cir. 1996)

A lot of people—particularly academics who disfavor the law and economics approach popularized at the University of Chicago—don’t like this opinion. Which is probably a big part of the reason that many judges have been eager to distance themselves from it.

But normative judgments aside, ProCD v. Zeidenberg is very clear with respect to copyright preemption. Contracts cannot be preempted by Section 301(a). If nothing else, litigants know where they stand in these jurisdictions.

This logic has been adopted by the Fifth, Eleventh, and Federal Circuits (and maybe the First Circuit).

The Sixth Circuit was the first court to expressly reject the logic of ProCD.

In Wrench Ltd. Liab. Co. v. Taco Bell Corp., 256 F.3d 446 (6th Cir. 2001), the court found that:

if the promise [in a contract] amounts only to a promise to refrain from reproducing, performing, distributing or displaying the work, then the contract claim is preempted. The contrary result would clearly violate the rule that state law rights are preempted when they would be abridged by an act which in and of itself would infringe one of the exclusive rights of § 106.

Id. at 457–58

This gave birth to what preemption expert Guy Rub calls the “facts-specific approach” to copyright preemption.

With the facts-specific approach, the judge looks at the precise wording of the contract and determines whether its prohibitions are “equivalent” to the exclusive rights in the Copyright Act. If they are, the contract claim is preempted. If they are not, it is not.

Based on my reading of the case law, the Fourth and Eighth Circuits broadly follow this approach. Based on my reading of lower court opinions, to date, courts in the Third Circuit also seem to follow the case-by-case approach. I only found two cases from the Tenth Circuit that substantively addressed copyright preemption of contracts, and both say that “a case-specific analysis of a breach of contract claim, instead of the mechanical per se rule” is the appropriate way to determine whether a contract is preempted by the Copyright Act. Health Grades, Inc. v. Robert Wood Johnson University Hosp., Inc., 634 F.Supp.2d 1226, 1246 (D. Colo. 2009) (holding that a contract was not preempted by copyright). See also Big Squid, Inc. v. Domo, Inc., 2019 WL 3555509 (D. Utah). (Same.)

The Ninth Circuit is a bit trickier to pin down.

Guy Rub, in his excellent article “Copyright Survives: Rethinking the Copyright-Contract Conflict,” suggested that the Ninth Circuit had adopted the ProCD v. Zeidenberg approach.

There is certainly an argument that the Ninth Circuit has adopted the logic of ProCD v. Zeidenberg. See Montz v. Pilgrim Films & Television, Inc., 649 F.3d 975, 980 (9th Cir. 2011) (citing to ProCD in rejecting preemption in the context of a Desny claim).

In his recent post here, Prof. Rub hedged a bit when he wrote, “The case law in the Ninth Circuit—the other appellate circuit central to developing copyright law, especially regarding new technologies — seems to support the Seventh Circuit’s majority approach. However, it was sometimes not as clear as the case law of other circuits.”

Based on the materials I have reviewed, I agree that Prof. Rub is correct in saying that the Ninth Circuit strongly disfavors copyright preemption. But most cases I have seen seem to allow the possibility that some contracts could get preempted, if their only applicable prohibition is on copying and reusing content.

A good example is the Craigslist, Inc. v. 3Taps Inc. opinion from 2013, where the court said:

The Court need not decide, however, whether any contract could be preempted by the Copyright Act, because the contract that Craigslist alleges here involves a number of “extra element[s]” not merely “equivalent to” rights under the Copyright Act. See Montz, 649 F.3d at 980. The relevant provisions of the TOU do not merely prohibit copying or reusing content, but rather include accessing the website for inappropriate purposes, using the website to develop computer programs and services that interact with Craigslist, and circumventing technological measures intended to restrict access to the website. TOU at 6-7. In return for users agreeing to the TOU, Craigslist provides services to its users “including but not limited to classified advertising, forums, and email forwarding.” TOU at 1. Because the Copyright Act generally does not preempt contracts, see Montz, 649 F.3d at 980, and because the TOU includes these “extra element[s]” beyond the protections of the Act, see id., the Court concludes that Craigslist’s breach of contract claim is not preempted…

Id. at 976-977.

This language is typical for the Ninth Circuit. The court says that the Copyright Act does not generally preempt contracts, but it leaves the door open for the possibility that some contracts could be preempted.

At the other end of the spectrum is the new Second Circuit approach, recently elucidated in the ML Genius v. Google case from 2022. The Second Circuit now takes “a restrictive view” of the extra elements that would make a contract claim qualitatively different from copyright, and therefore not subject to preemption.

“[W]e take a restrictive view of what extra elements transform an otherwise equivalent claim into one that is qualitatively different from a copyright infringement claim.” Briarpatch, 373 F.3d at 306. “[E]lements such as awareness or intent” do not save a claim from preemption because they “alter the action’s scope but not its nature.” Comput. Assocs. Int’l, Inc. v. Altai, Inc., 982 F.2d 693, 717 (2d Cir. 1992). And “[i]f unauthorized publication is the gravamen of [the plaintiffs’] claim, then it is clear that the right they seek to protect is coextensive with an exclusive right already safeguarded by the Act—namely, control over reproduction and derivative use of copyrighted material.” Harper & Row Publishers, Inc. v. Nation Enters., 723 F.2d 195, 201 (2d Cir. 1983), rev’d on other grounds, 471 U.S. 539 (1985).

To be sure, we do not hold that breach of contract claims concerning copyrighted material are always preempted. We hold only that, given the specific facts Genius pleaded in its complaint, its breach of contract claim is not qualitatively different from a copyright claim and is therefore preempted.

This seems to flip the Ninth Circuit logic on its head. Whereas the Ninth Circuit tells us that contracts are generally not preempted, but that there is no absolute rule against preemption, the Second Circuit now has a rebuttable presumption that contracts that overlap with copyright are generally preempted, but that there is no absolute rule in favor of preemption.

One thing that I found hard to fathom about this opinion was that nowhere in the history of the ML Genius case did any court acknowledge, much less address or reconcile, the Register.com v. Verio, Inc. precedent. Given that Register.com v. Verio, Inc. was also a Second Circuit case, and that it was the first major circuit-court opinion to establish what we now understand to be the law on the enforceability of an online “browsewrap” (insert Eric’s snark and anger emojis here) contract, failing to address that opinion leaves a giant hole in the case law.

Both ML Genius and Register.com v. Verio, Inc. were web scraping cases. Both involved accessing, copying, and publishing content against the wishes of the website host. In Register.com, the plaintiff that sought to prevent access prevailed. In ML Genius, the defendant that sought to maintain access to online data prevailed.

I suspect the real reason we have a different resolution here is because Google is Google and in the other case, Verio, Inc. was an internet spammer. But the end-use of the content shouldn’t matter for a breach of contract claim. What should have mattered was whether the contractual prohibition on copying and reusing content without permission was enforceable in the first place.

Does this mean that Register.com is no longer good law? Does this mean that all scraping-related contract cases are now presumptively preempted by copyright in the Second Circuit?

The fact that the answers to these questions are not obvious from the ML Genius opinion itself makes me think that the Second Circuit dropped the ball here.

If you’re keeping score at home, contracts are, depending on jurisdiction, never preempted by copyright (First, Fifth, Seventh, Eleventh, and Federal Circuits), presumptively not preempted (Ninth), presumptively preempted (Second), or dealt with on a case-by-case basis (Fourth, Sixth, Eighth, and so far, district courts in the Third and Tenth Circuits).

I’m interested in preemption because it dovetails with one of my main practice areas, web scraping. And every time I read a preemption case in the context of web scraping, I feel that the earth might be moving under my feet, but I have no idea where the tectonic plates are shifting.

Take, for example, the recent Thomson Reuters Enterprise Centre GmbH v. Ross Intelligence Inc. opinion out of the District of Delaware. 2023 WL 6210901.

Ross Intelligence is a legal software AI developer. Thomson Reuters owns Westlaw. Ross Intelligence hired a third party to create legal memoranda that Ross Intelligence could use as training data. Those memos were created both manually and with a text-scraping bot, allegedly using Westlaw content. Thomson Reuters sued for both copyright infringement and tortious interference with a contract (for inducing a third party to breach Westlaw’s terms).

The court concluded that two of the three contract claims were not preempted. According to the court:

The anti-bot and password sharing claims are not preempted.

The two other tortious-interference claims involve Westlaw’s anti-bot and password-sharing provisions:

You may not run or install any computer software or hardware on [West’s] products or network or introduce any spyware, malware, viruses, Trojan horses, backdoors or other software exploits.

Your access to certain products is password protected. You are responsible for assigning the passwords and maintaining password security. Sharing passwords is strictly prohibited.

These provisions are not equivalent to § 106 rights. Unlike the competition provision, they govern use and manipulation of the site. Using a bot to scrape content might copy material in bulk. And a claim based on the harm from that copying itself would be preempted. But a claim based on simply introducing malware, independent of that malware’s goals, is not equivalent to any right in § 106. Likewise, a site might ban password sharing because they want to limit copying risk. But putting limits on access to the site is a separate restriction. Whether the material behind the password protection is copyrighted or not, the creator can protect the material for which it charges users. Section 106 has nothing to say about that limit. So Thomson Reuters’s second and third tortious-interference claims survive preemption.

Thomson Reuters Enterprise Centre GmbH v. Ross Intelligence Inc. at 12. (emphasis added).

If the bolded sentences above were true, and other courts adopted this logic, it would eviscerate the existing legal standards nationwide related to web scraping.



But this judge, along with so many other judges, seems content to say that as long as you use a few magic words in your online agreement and characterize the harm from copying as harm related to something else, you can take a legal claim that is entirely motivated by the harm from copying and enforce it through an online contract.

Given the ease with which any competent attorney should be able to adjust ToU to meet the low bar suggested by precedent like Thomson Reuters, this becomes a very easy legal standard to manipulate.

This is an issue that is begging for Supreme Court review. And indeed, just in the last year, there was an opportunity for SCOTUS review, as the plaintiff in ML Genius v. Google petitioned for certiorari.

When deciding whether to take the case, the Supreme Court invited the U.S. Solicitor General to file a brief on Genius’s petition.

The Solicitor General recommended denying the cert petition, mostly on the basis that the online “browsewrap” agreement was “atypical” and thus not the appropriate vehicle for resolving a dispute among the circuits.

In June of this year, the Supreme Court followed the recommendation and denied cert.

I’m not sure what world the Solicitor General is living in, but it isn’t the same one I am living in. Almost every website on the Internet has an online contract, and most do not require any formal assent mechanism to access them.

Just in the last year, OpenAI released ChatGPT. StabilityAI exploded. Google changed its privacy policy to collect all “public” data (viz., online data governed by browsewrap agreements or no agreement) for training purposes with no opt-out. Most AI platforms were built on Internet data that was copied from sites with “browsewrap” online agreements.

Credible estimates now suggest that as much as 73% of all internet traffic is automated.

A lot of people, companies, and machines are collecting a lot of online data, at a scale and at a magnitude that few people can appreciate. The Internet is awash in lawsuits related to this collection of data.

The key question in many of those cases is whether contract claims against this mass collection of training data from sites that ostensibly prohibit automated data collection are preempted by copyright (and whether collecting the data for training is fair use).

It’s basically the same fact pattern that the DOJ claimed was “atypical” and thus not suitable for resolving the circuit split on copyright preemption.

There is nothing atypical about this fact pattern. As more business models are being eaten by software, it’s the offline contracts that are becoming increasingly atypical.

The fact that the defendant in that matter was Google should have been a clue to the Solicitor General that this fact pattern was not atypical, but rather the vortex of one of the most important power struggles for the modern and future economy. Deciding who can control data, including data that is made available to the general public, is one of the most important policy concerns that judges and government officials are loath to treat as an actual policy concern.

In the context of web scraping, my perspective on preemption is the opposite of ProCD and its successor cases. Web scraping is copying data online with automated processes. And to characterize zero-click online terms of use that are imposed by cease-and-desist letter as enforceable contracts is horrible policy and bad law. Web scraping can fit snugly within the domain of copyright law, and so there should be a categorical copyright preemption of online contracts purporting to govern use of public information. If content or data is not kept beneath a log-in (and thus entitled to protection via the CFAA and most state computer-trespass laws), it should be protectible only insofar as it is subject to existing intellectual property laws.

(To be fair, I think that some other common-law claims, such as misappropriation and unfair competition, should be available against scrapers that engage in egregious misuse of public information in certain circumstances. But that’s a different article.)

The whole point of copyright preemption is that Congress sought to prevent states from infringing on the public domain and undermining key concepts of copyright law. But with the current legal regime of browsewrap agreements enforceable through cease-and-desist letters, we are allowing private firms to selectively control content in the public domain. If the intent of Congress with copyright preemption was to ensure that “no person is entitled to any such right or equivalent right in any such work under the common law or statutes of any State,” why are we giving companies this power through online contracts?

In the high-profile hiQ Labs case, the Ninth Circuit (twice) wrote:

We agree with the district court that giving companies like LinkedIn free rein to decide, on any basis, who can collect and use data—data that the companies do not own, that they otherwise make publicly available to viewers, and that the companies themselves collect and use—risks the possible creation of information monopolies that would disserve the public interest.

hiQ Labs I, 938 F.3d 985 at 1005; hiQ Labs II at 43.

This observation is as true now as the day when the Ninth Circuit made it, in 2019. But then, on remand in November 2022, the district court held that LinkedIn was entitled to free rein to decide, on any basis, who can collect and use data that it did not own and that it otherwise made publicly available to viewers, through a breach of contract claim established with a cease-and-desist letter.

If it is against public policy to allow companies to restrict access to public information through CFAA claims, as the Ninth Circuit has twice reiterated, why is it not against public policy to allow companies to do this with threadbare, assent-free breach of contract claims?

Copyright law is a well-established field of law with fully developed contours of when someone is entitled to a monopoly with respect to information goods.

That is the proper vehicle for deciding when someone is allowed and not allowed to restrict access to public information.

At present, this is not the law anywhere in the United States. But it seems like a simple, obvious, and straightforward way to resolve what judges are making into an increasingly convoluted area of law.