Ninth Circuit Says LinkedIn Wrongly Blocked HiQ’s Scraping Efforts

September 9, 2019 · by Venkat Balasubramani · in Privacy/Security, Trespass to Chattels

Fans of scraping cases may rejoice. The Ninth Circuit issued its long-awaited opinion in the hiQ v. LinkedIn case (it was argued in March 2018, so the opinion took about 18 months). It rules in favor of hiQ.

hiQ was a company that, apparently with LinkedIn’s authorization, accessed data from public LinkedIn profiles and built products on this data. After years of this practice, LinkedIn sent hiQ a cease and desist letter stating that hiQ was no longer authorized to access LinkedIn user data, so any ongoing access would violate the Computer Fraud and Abuse Act and other laws. hiQ preemptively sought a preliminary injunction. The district court granted the injunction and ordered LinkedIn to allow hiQ to access the data in question during the pendency of the lawsuit. It was a sweeping ruling that many thought would be unlikely to survive challenge on appeal. The Ninth Circuit upheld the ruling.

Preliminary injunction factors: The court concludes that hiQ’s allegation that it would go out of business absent a continuing ability to access the data sufficiently constituted irreparable harm. It also concluded that the balance of hardships tilts in hiQ’s favor. There were a few major points in this particular conclusion. The court discounts LinkedIn’s statements about its need to protect user privacy, pointing to LinkedIn’s own inconsistent statements reflecting intent to leverage user data. The court also says more emphatically that LinkedIn does not own the user data:

LinkedIn has no protected property interest in the data contributed by its users, as the users retain ownership over their profiles. And as to the publicly available profiles, the users quite evidently intend them to be accessed by others, including for commercial purposes

Tortious interference claim: The court says that hiQ likely makes out a plausible claim for tortious interference. It has third party contracts, and LinkedIn is taking action which plainly interferes with such contracts. LinkedIn countered that its interference (if any) was legally justified as it was just competing aggressively. The court says that California courts apply a balancing test to determine the propriety of such interference. Normal competition is allowed, but that’s not what LinkedIn did here. Moreover, hiQ pointed to possible anti-competitive motives in LinkedIn’s actions:

HiQ has raised serious questions about whether LinkedIn’s actions to ban hiQ’s bots were taken in furtherance of LinkedIn’s own plans to introduce a competing professional data analytics tool. There is evidence from which it can be inferred that LinkedIn knew about hiQ and its reliance on external data for several years before the present controversy.

The court also throws shade on LinkedIn’s purported interest in “protecting its members’ data”.

CFAA claim: The court also rejects LinkedIn’s argument that hiQ’s continued access violates LinkedIn’s rights under the CFAA. The CFAA is aimed at safeguarding protected computers and information, and information that is generally publicly available does not fall into this category. Citing to its own pronouncements in the first Nosal case and Facebook v. Power Ventures, the court says that the phrase “without authorization” covers “information for which authorization or access permission, such as password authentication, is generally required.” The court cites to the legislative history and to the Stored Communications Act in support of this conclusion. The court also notes its divergence from other circuits that have interpreted the CFAA broadly, particularly in cases involving employee theft of data.

This is a powerful ruling in favor of hiQ. LinkedIn’s antitrust lawyers likely had their mornings interrupted by having to review and digest the court’s statements. Platforms appear to be increasingly nervous about antitrust issues, but we have not seen many cases actually asserting those types of claims. Could this ruling embolden plaintiffs?

The big question is where the case goes from here. LinkedIn is not the type of entity to simply pack up and go home, so I assume it will seek en banc review and will likely continue to litigate the case in the district court (where the case has been stayed). I would say there are three big takeaways from the ruling.

First, the court’s statement about LinkedIn not having a property interest in the data is key. Platforms often try to make an argument that they’ve put in substantial time and energy and are trying to protect their property rights from third party scrapers. The court blows up this argument.

The court’s close scrutiny of LinkedIn’s privacy arguments is also noteworthy. Privacy is another argument platforms try to rely on in articulating claims against aggregators. However, platforms are often two-faced about this. The court did not have to look far to find a statement from LinkedIn’s CEO hyping the value of consumer data and its plans to build products exploiting such data. The court’s discussion about privacy also drove home the point that consumers are often given little or no granular control over the information they share. In any event, justifications around third party access to information based on user privacy are not so easy for a platform to make.

As to the CFAA claim, this ruling implicitly overrules Judge Breyer’s (un-cited) ruling in the 3taps case where he said access following receipt of a cease and desist letter could violate the Computer Fraud and Abuse Act. Incidentally, craigslist filed an amicus brief in this case in support of LinkedIn. The court states generically that access to information has to be restricted in order to support a CFAA claim, but there wasn’t much discussion about what form the restriction had to take. I don’t know about LinkedIn’s specific practices for displaying user data, but I do know that it likely varies depending on whether the accessing user is logged in and/or connected to the user in question. Is this sufficient to constitute an access restriction? My recollection is that the parties did not focus on the precise nature of the restrictions so the court does not delve into it, but it would have been nice to see.

Finally, the court in a footnote says that it’s not opining on other causes of action. What about a possible breach of contract, trespass, or copyright claim? Those appear to remain viable to combat scraping. Certainly, removing the CFAA claims from the mix changes the risk calculus from the scraper standpoint.

This case involved data, which presents copyright challenges from the platform’s standpoint, but what about photos? I would urge aggregators to be cautious when scraping other types of content.

Eric’s Comments:

This is one of those cases where I’m not sure which side to root for. On the one hand, it’s tempting to root for David over Goliath–especially Goliaths who may be privacy-invasive themselves. Also, more efficient data flows can unlock new and innovative services and improve marketplace efficiency; and social networking services are likely to suppress third-party scraping for reasons that are not in the users’ best interests. As the court says:

giving companies like LinkedIn free rein to decide, on any basis, who can collect and use data—data that the companies do not own, that they otherwise make publicly available to viewers, and that the companies themselves collect and use—risks the possible creation of information monopolies that would disserve the public interest

So all of these considerations point towards supporting hiQ.

Yet, scraping can be bad for privacy. If LinkedIn can’t assert its users’ privacy interests against hiQ, then it’s possible no one can. In theory, the CCPA will give users some rights over hiQ’s data usage starting next year, but to what effect? There are likely dozens or hundreds of similarly situated entities that users can’t and won’t be able to find, so those CCPA rights won’t really benefit the users in practice. Meanwhile, if server operators can’t restrict who can access their servers, then it will embolden data scavengers–including trolls, malefactors, and governments–who intend to weaponize the data against users. So these considerations point towards supporting LinkedIn.

Plus, at least two considerations have irresolute implications. First, we’re not sure about the anti-competitive implications of this case. In theory, data scrapers and online publishers typically do not directly compete. However, as evidenced by the complaints by vertical search engines against Google, odds are pretty high that every data scraper can find some way to claim competitive positioning with their targets. Second, we have inconsistent social norms about how absolute “property” rights should be, which points in whatever direction that’s consistent with your normative views on property.

This melange of policy considerations made this an impossible case for the court. There was no way it could optimize the competing interests. As an example of this balancing challenge, the court says:

even if some users retain some privacy interests in their information notwithstanding their decision to make their profiles public, we cannot, on the record before us, conclude that those interests—or more specifically, LinkedIn’s interest in preventing hiQ from scraping those profiles—are significant enough to outweigh hiQ’s interest in continuing its business, which depends on accessing, analyzing, and communicating information derived from public LinkedIn profiles

Regarding the property issues, the court says “LinkedIn has no protected property interest in the data contributed by its users, as the users retain ownership over their profiles,” but this is incomplete. LinkedIn could have compilation copyrights in its database; it might have joint copyrights (if it jointly authored works with its users); it has copyrightable elements on its web pages that hiQ necessarily “downloads” as part of the scraping/browsing process; and it has trademark rights in identifying the data source as LinkedIn and its users. But talking about LinkedIn’s data interests is a partial misdirection. The real property interest at issue here is LinkedIn’s server ownership and how expansively LinkedIn can control usage of its personal property.

That takes us to the court’s messy CFAA discussion–a terrible place to be, because the CFAA seems to vex the Ninth Circuit (such as the multiple rulings in the Nosal case). Here, the court’s analysis of what constitutes “without authorization” for CFAA purposes made my head hurt. In Facebook v. Power Ventures, the Ninth Circuit said that a C&D letter revoked any actual or implied authorization. LinkedIn sent hiQ a C&D. Case over, right? No. The court moves the goal line through sophistry: “Where the default is free access without authorization, in ordinary parlance one would characterize selective denial of access as a ban, not as a lack of ‘authorization.'” Thus, it distinguishes the Power Ventures ruling because Facebook’s data was allegedly displayed only to logged-in users [note: this is imprecise because users can publish their Facebook data to the world], while LinkedIn’s data was displayed to everyone.

The “without authorization” redefinition would be a powerful enough ruling, but the court blazes another new CFAA trail. The court says the CFAA was designed to “prevent intentional intrusion,” analogous to “forced entry” and “breaking and entering.” WHAT?? That’s an amazing twist, and one I would favor! In 2013, I argued that we should scrap all current online trespass jurisprudence and limit the doctrine to just that. But will this panel ruling permanently narrow the CFAA? Highly doubtful. More likely, the Ninth Circuit will redefine “without authorization” in the next CFAA litigation it touches.

The real question everyone wants answered is: can a data scraper force a server operator to let it scrape? This opinion doesn’t definitively answer the question. Among other things, the court didn’t evaluate all of LinkedIn’s legal claims–it only considered the CFAA issue. After those other claims are litigated, hiQ can still lose the case, and this opinion could become a historical curiosity. But for now, the headline will be that LinkedIn can’t stop scrapers–not what the case actually said, but likely how other courts will read this opinion until the Ninth Circuit cleans up the mess.

Case citation: hiQ Labs, Inc. v. LinkedIn Corp., No. 17-16783 (9th Cir. Sept. 9, 2019)

LinkedIn Enjoined From Blocking Scraper–hiQ v. LinkedIn

Scraping Lawsuit Survives Dismissal Motion–CouponCabin v. Savings.com

QVC Can’t Stop Web Scraping–QVC v. Resultly (Forbes Cross-Post)

Multiple Listing Service Gets Favorable Appellate Ruling in Scraping Lawsuit

Anti-Scraping Lawsuits Are Going Crazy in the Real Estate Industry (Catch-Up Post)

College Course Description Aggregator Loses First Round in Fight Against Competitor in Scraping Case — CollegeSource v. AcademyOne

Anti-Scraping Lawsuit Largely Gutted–Cvent v. Eventbrite

Interesting Database Scraping Case Survives Summary Judgment–Snap-On Business Solutions v. O’Neil

Facebook Gets Decisive Win Against Pseudo-Competitor Power Ventures — Facebook v. Power Ventures

Power.com Up For Auction — Facebook v. Power Ventures

Power.com Counterclaims Dismissed — Facebook v. Power Ventures