Researchers’ Challenge to CFAA Moves Forward–Sandvig v. Sessions
This is a lawsuit brought by four professors and a media organization (First Look, publisher of the Intercept). Plaintiffs study real estate, finance, and employment transactions and seek to highlight the discriminatory effects of algorithms. To do so, they create fake profiles, including profiles for minorities, and test the profiles. The court describes this as akin to testing for discrimination in the housing or loan markets. For example, plaintiffs intend to use bots to create fake profiles which then will surf real estate websites, simulating the behavior of minority groups. The plaintiffs intend to then scrape the websites to record the displayed properties. Similarly, several of the other plaintiffs intend to use bots to crawl job-seeker profiles, then create fake employer profiles so they can search for candidates and see how they are ranked. They also intend to create fake job-seeker profiles and have these fictitious job-seekers apply for fictitious jobs, to see how algorithms rank candidates. Both the professors and First Look intend to publicize their findings.
They all contend their actions leave them susceptible to the risk of prosecution under the CFAA. They brought an action for declaratory relief alleging First Amendment and Due Process Claims.
The Internet is Special: Before tackling the standing question, the court inquires about the “First Amendment status of the Internet.” The court acknowledges that off-line speakers have not persuaded courts that their rights trump the rights of property owners. However, the court says “the internet is different”! The court cites to Packingham as an illustration of the forum-tweaking rules of the internet. While the court says that it can’t equate the “entirety” of the internet with public streets and parks, many sites are “too heavily suffused with First Amendment” activity to be treated purely like physical private property. This does not mean that a person could freely rifle through a business’s confidential files just because they are on the cloud. On the other hand, the court says it can closely look at the means website owners put in place to regulate access and use of the underlying information.
Standing: In the pre-enforcement context, a plaintiff has to establish that she intends to engage in conduct that (1) is affected with a constitutional interest; (2) is proscribed by statute; and (3) gives rise to a credible threat of prosecution.
The court says plaintiffs’ activity has a constitutional dimension, among other things, because:
scraping plausibly falls within the ambit of the First Amendment.
The court says cases broadly recognize the right to record “matters of public interest.” Scraping, at least as it encompasses information located in a “public forum,” falls within this right. The court says plaintiffs also have an interest in making “harmless misrepresentations” to websites. Citing to US v. Alvarez, the stolen valor case, the court says that plaintiffs’ white lies are alleged to cause minimal harm to websites and are arguably constitutionally protected. Finally, plaintiffs say they have the right to freely publish the byproducts of those efforts. Citing to Bartnicki, the court says this is right.
The government raises a state action question and the court (with little discussion, and a cite to NY Times v. Sullivan and hiQ v. LinkedIn) disagrees. The court also rejects the government’s argument that the plaintiffs have no right to acquire the information they seek in any manner they wish.
The court says the government addresses plaintiffs’ argument that their conduct is proscribed by statute too summarily for its objection to warrant serious consideration. The government contested this element of standing only as it related to plaintiffs’ plans to publish information. Plaintiffs adequately alleged that reporting their findings would violate certain websites’ terms of service.
Finally, the court says plaintiffs have adequately alleged a credible threat of prosecution. Since plaintiffs allege First Amendment claims, the bar for showing a credible threat of prosecution is low, and the court says they satisfy it. First, they point to various edge cases (including US v. Drew). Second, the government has not expressly disavowed prosecution in these circumstances. While the government has stated in several contexts that it has no interest in prosecuting “harmless violations,” the disavowals are not unequivocal.
Ultimately, the court says that the focus is on what information plaintiffs wish to access, not what they want to do with that information. And against this backdrop, the court says the bulk of plaintiffs’ activities fall outside the CFAA:
Scraping or otherwise recording data from a site that is accessible to the public is merely a particular use of information that plaintiffs are entitled to see. The same goes for speaking about, or publishing documents using, publicly available data on the targeted websites. The use of bots or sock-puppets is more context-specific activity, but it is not covered in this case. Employing a bot to crawl a website or apply for jobs may run afoul of a website’s ToS, but it does not constitute an access violation when the human who creates the bot is otherwise allowed to read and interact with that site.
The court says that creating fake accounts where the person is not otherwise entitled to use the information would violate the access provision. Thus, the court must assess the constitutional claims.
Plaintiffs’ First Amendment Claims: Plaintiffs assert an overbreadth and an as-applied challenge.
As limited by the court, the statute covers terms of service restrictions that limit access to certain information. While this is not identical to plaintiffs’ limiting construction, it’s narrow and, in the court’s view, undermines any overbreadth challenge.
As to the as-applied challenge, the government primarily argued that the statute regulates conduct and not speech. The court disagrees, saying even if a law does not directly target speech, it’s subject to First Amendment scrutiny if it restricts access to a public forum. Notwithstanding the court’s conclusion that the statute restricts speech, the court says the statute is not content-based and it’s certainly not viewpoint-based. The statute must therefore serve a substantial government interest and be a close fit to that interest. The court declines to resolve the issue at the pleading stage, but does point out that the statute’s trespass or theft concerns are not served by restricting plaintiffs who have no intent to harm or access data that is not otherwise made public.
Plaintiffs also mounted a petition clause challenge because the statute restricts them from informing Congress or agencies of the results of their research (i.e., discrimination by websites). They also assert that CFAA restrictions prevent people from accessing courts to enforce discrimination statutes. The court says these result from post-access restrictions that are outside the scope of the statute. In any event, their Petition Clause claims overlap with their Free Speech claims.
Finally, the court rejects the plaintiffs’ vagueness and non-delegation arguments.
This decision is groundbreaking. More like earth-shattering. You need only get to the court’s statement on scraping (“scraping plausibly falls within the ambit of the First Amendment”) to see this.
The court’s forum analysis is of course new, and a big leap from the Supreme Court’s decision in Packingham. I’m not aware of any other court applying a forum analysis to the internet, or portions of it. Compare this with Judge Koh’s ruling in Prager University v. Google which Eric blogged about.
The court struggles with the question of whether the sites in question are “public” websites in the sense that they are not password-protected. The court seems to take the position that something that is accessible only to a registered user is nevertheless “accessible to the public.” The court does not delve into the treatment of this issue by the hiQ and 3Taps cases.
It’s tough to know the practical consequences of this ruling. The dispute is headed to discovery. Assuming plaintiffs prevail, does this mean that some set of access does not violate the CFAA, provided it’s engaged in for research purposes? One odd part of the ruling is that the court does not discuss in detail the effect of plaintiffs’ status as academics. Is the lack of a profit motive what separates them from any other scraper who does not intend to inflict harm on a website by freeing its information?
The court also does not discuss other doctrines that stand in the way of scrapers. Are contract claims based on sites’ terms of service subject to the same analysis? What about technical measures that websites put into place? Will the court (similar to the trial court in hiQ) block these?
This dispute reminds me of hiQ, which also relied in part on Packingham. Like that ruling, this one’s viability on appeal is suspect. (The LinkedIns of the world are probably wondering if they can intervene here.)
Case citation: Sandvig v. Sessions, 2018 WL 1568881 (D.D.C. March 30, 2018)