Relitigating hiQ Labs and Scraping Through the Lens of the DMCA 1201 Anti-Circumvention (Guest Blog Post)

by guest blogger Kieran McCarthy

A series of prominent web-scraping lawsuits are revisiting the fundamentals of public data access. And in so doing, with a slight reframing of a relatively settled legal issue, major platforms are challenging the presumption that collecting and using public data at scale is legal.

In September 2019, the Ninth Circuit in hiQ v. LinkedIn wrote:

Although there are significant public interests on both sides, the district court properly determined that, on balance, the public interest favors hiQ’s position. We agree with the district court that giving companies like LinkedIn free rein to decide, on any basis, who can collect and use data—data that the companies do not own, that they otherwise make publicly available to viewers, and that the companies themselves collect and use—risks the possible creation of information monopolies that would disserve the public interest.

hiQ Labs, Inc. v. LinkedIn Corporation, 938 F.3d 985, 1004 (9th Cir. 2019) (vacated by SCOTUS, but this language was later re-affirmed verbatim by the Ninth Circuit in 2022).

It’s hard to overstate the importance of this language for the web-scraping industry. For the first time, a court had taken a public-policy stand in favor of web scraping. This was not just a court saying that scraping did not constitute a crime. It was a circuit court declaring, on policy grounds, that it was in the public interest for public data on private fora to remain publicly available for public consumption. The Ninth Circuit seemed to fully embrace the idea that allowing public use of public data was vital for a functioning digital information ecosystem.

For those who operate in data access, broadly defined, hiQ v. LinkedIn became shorthand for a simple proposition: scraping publicly accessible web pages was legal. And while the truth was always a bit more nuanced, and hiQ ultimately capitulated in the case and went under, industry insiders never seriously questioned the legality of public data access after the decision.

The last six years have been an absolute boon for the web-scraping industry. And it is not hyperbole to say that the AI revolution might not have come, or at least not have come as fast, had it not been for the strong policy presumption in favor of lawful access to public content after hiQ Labs.

Fast-forward six years, and data-access questions are more critical than ever. All the world’s information is there for the taking, and there is no shortage of people and companies doing the taking.

In just a few years, AI has become an integral component of almost all knowledge work. And it is arguably the single biggest engine driving economic growth right now.

And the fuel that powers that engine is data—usually public data collected at scale. And with that, there are dozens of lawsuits percolating over the legalities of when it is and is not permissible to copy and reuse public data.

But with CFAA questions about publicly available data now largely resolved (except at the margins), and with terms of use often preempted by copyright, those looking to build walled gardens have been angling for new arguments to restrict access to public data.

In a recent wave of disputes over AI and “public data,” a new pattern has emerged. Plaintiffs are trying to rerun the hiQ fact pattern under a different federal statute.

That statute is the Digital Millennium Copyright Act (DMCA), specifically Section 1201.

While hiQ mostly resolved the CFAA-era battle over whether “public” means “authorized,” Section 1201 is the new battlefield over whether antibot tech is a technological measure that “effectively controls access” to a copyrighted work.

But make no mistake, from the data monopolist’s perspective (or oligopolist’s, if we’re being more precise), the goal is exactly the same as it was in 2019: to restrict data collectors’ access to public data. The largest incumbent platforms such as Google want “free rein to decide, on any basis, who can collect and use data—data that the companies do not own, that they otherwise make publicly available to viewers, and that the companies themselves collect and use.” Companies like Google want to control the data. Many companies (like Reddit, Yahoo, X, and others) want to license the data (even to the extent that the data is generated by their users). But almost all the major platforms are making renewed legal pushes to keep startups off their digital lawns (while at the exact same time gobbling up the same type of content from others).

Section 1201 as an alternative to copyright, the CFAA, and terms of use claims

There are a few different reasons why 1201 is attractive to plaintiffs in scraping and AI-data cases.

First, in some circuits, plaintiffs are trying to use 1201 to avoid (dare I say, circumvent?) the copyright fair use fight. There is a circuit split on the issue, but in the major circuits where this gets litigated (notably, the Second and Ninth), plaintiffs often argue that fair use is no defense to a Section 1201 anti-circumvention claim, because courts treat Section 1201 as analytically distinct from infringement. And there are many situations where that distinction might be dispositive. Well-designed AI systems that train on large datasets are often going to be deemed fair use because they are highly transformative. Under the DMCA, though, that might not matter.

Second, plaintiffs may be able to avoid the copyright registration prerequisite that constrains many copyright claims. At least as alleged in the recent YouTube-scraping cases, plaintiffs emphasize that DMCA anti-circumvention claims do not depend on copyright registration in the same way copyright infringement claims do.

Third, the DMCA allows plaintiffs to allege that the tech itself is inherently unlawful. 1201(a)(2) lets plaintiffs target the tooling layer (CAPTCHA solvers, challenge tools, proxy services, and other tools for defeating antibot measures) as “trafficking” in circumvention tech. Google’s new case leans heavily into that line of argument.

Courts have been wary of turning Section 1201 into a general-purpose “keep out” sign, as Eric recently noted in the Ziff-Davis case. But this new line of cases will test those limits once again. Older doctrine around “effective” technological measures and the relationship between circumvention and copyright protection shows up repeatedly in the case law discussions (think Lexmark and Chamberlain). But those were not Second or Ninth Circuit cases. And those cases were not resolved in the modern era of high-volume data access.

Google v. SerpApi and the SearchGuard blueprint (N.D. Cal., Dec. 2025)

Google’s recent lawsuit against SerpApi is the most on-the-nose “DMCA-first” pleading to date.

Google sued SerpApi in the Northern District of California, alleging SerpApi used massive volumes of fake search requests to bypass Google protections and resell content from search results. Google also publicly framed the suit as targeting “circumventing security measures protecting others’ copyrighted content that appears in Google search results.”

The complaint is explicit. It is a DMCA case from page one, asserting claims for circumvention (17 U.S.C. § 1201(a)(1)(A)) and trafficking in circumvention tech (17 U.S.C. § 1201(a)(2)).

And it provides a detailed narrative about the technological measure itself. Google alleges it built SearchGuard, a system designed to block automated access without breaking normal user experience, including a JavaScript “challenge” that automated systems typically cannot solve at scale.

SerpApi’s public response is the classic hiQ-adjacent argument: we provide the same information any person can see in a browser without signing in.

Google’s complaint tries to make that defense irrelevant by focusing on: (1) the presence of licensed copyrighted material in SERPs, and (2) the alleged circumvention of SearchGuard to access it at scale.

That framing matters because it tries to make the “public vs. private” question less central. Instead, the center becomes: “was there an access-control measure, and did defendants circumvent it?”

I’ll give Google’s lawyers credit. They did a good sales job on this.

But don’t get it twisted: Google’s 1201(a) pitch that SearchGuard is a TPM stinks like two-month-old garbage in the Texas summer heat. A TPM, as Congress imagined it, is a lock on access to a copyrighted work, a measure that actually gates entry to protected expression. SearchGuard, by Google’s own framing, is a traffic cop. It’s a system for managing how you reach pages, at what rate, with which client, under which terms. That’s not controlling access in any meaningful copyright sense; it’s controlling conduct. Conflating those two isn’t a clever reading. It’s a category error dressed up as inevitability. If SearchGuard is a TPM, then literally every site that throws up a speed bump can build an impenetrable copyright perimeter around almost any content whatsoever, and the statute becomes a magic spell. Add friction, say “TPM,” and suddenly your own proprietary preferences replace copyright law and become federally enforceable.

That’s the crazy part. Google’s argument doesn’t just misread 1201(a). It turns the anti-circumvention statute into an anti-competition superweapon that will destroy existing copyright protections.

If Google pulls this off, it will have reshaped copyright law into a form of property rights in interfaces, where circumventing a company’s desire is treated like circumventing encryption. That doesn’t protect creators. It protects incumbents. It chills security research, accessibility work, archiving, interoperability, independent auditing, and every weird-but-legitimate tool that makes the internet function like it should, rather than merely being consumable on platform-approved terms.

Let’s be honest: This isn’t Google vs. circumvention, so much as Google wanting to build a DMCA moat around its own monopoly in search. And if successful, it will provide a handy playbook for any other platform looking to build or defend any online monopoly in any other vertical.

Tech writer Mike Masnick explains the potential consequences:

Google’s argument, if accepted, provides a roadmap for any website operator who wants to lock down their content: slap on a trivial TPM—a CAPTCHA, an IP check, anything—and suddenly you can invoke federal law against anyone who figures out how to get around it, even if their purpose has nothing to do with copyright infringement.

The implications spiral outward quickly. If Google succeeds here, what stops every major website from deciding they want licensing revenue from the largest scrapers? Cloudflare could put bot detection on the huge swath of the internet it serves and demand Google pay up. WordPress could do the same across its massive network. The open web—built on the assumption that published content is publicly accessible for indexing and analysis—becomes a patchwork of licensing requirements, each enforced through 1201 threats.

That doesn’t seem good for the prospects of a continued open web.

This is the “relitigating hiQ” move. If accessing a public page is a CFAA non-starter, then plaintiffs try to win by arguing the defendant defeated a technological control to access the same public page, even if the end use is not infringing.

Plaintiffs in the Nvidia, Ziff-Davis, and Reddit cases, along with a host of class actions, are pursuing similar claims along similar themes.

DMCA 1201 as the new anti-scraping front

If you zoom out, these new complaints share a single strategic theme.

Plaintiffs allege that it does not matter whether content is public and available to anyone with a web browser. If antibot tech is designed to make certain content inaccessible at scale to automation, then anyone who accesses that content at scale by getting around the tech is violating the DMCA.

hiQ mostly put an end to hyper-aggressive CFAA allegations for public web access. But the current litigation wave shows a sophisticated and equally aggressive platform response. Swap the CFAA question (“authorized?”) for the DMCA question (“circumvented?”). That is the simple reframing at the center of the most important fights over public data.

Does this change the underlying public policy concerns described in hiQ Labs? And will those concerns highlighted in hiQ Labs inform the interpretation of the DMCA in this context? Only time will tell.