A Closer Look at a Troubling Anti-Scraping Ruling from Spring–Compulife Software v. Newman (Guest Blog Post)
by guest blogger Kieran McCarthy
Compulife Software, Inc. v. Newman is the first circuit court case in more than half a decade to expand liability for web scrapers under state and federal law.
The two most recent circuit court opinions that addressed web scraping (hiQ Labs, Inc. v. LinkedIn Corp., a decision from the Ninth Circuit in 2019, and Sandvig v. Barr, a 2020 case out of the DC Circuit) both significantly limited the scope of liability for web scraping under federal law. Both of those cases reviewed the scope of liability for web scraping under the Computer Fraud and Abuse Act (“CFAA”). Compulife, an opinion from the Eleventh Circuit, by contrast, analyzed liability for web scraping in the context of copyright and trade secrets.
Almost certainly unintentionally, the court in Compulife takes the key holding of hiQ Labs, Inc. v. LinkedIn Corp., examines it in the context of trade secrets, and then flips it on its head. Whereas the court in hiQ Labs held that “accessing … publicly available data will not constitute access without authorization under the CFAA,” the court in Compulife says almost exactly the opposite: that scraping publicly accessible information, in large enough quantities, might indeed be a form of misappropriation of trade secrets.
At a high level, the Compulife court reached two main conclusions:
- That, “after an [copyright] infringement plaintiff has demonstrated that he holds a valid copyright and that the defendant engaged in factual copying, the defendant bears the burden of proving—as part of the filtration analysis—that the elements he copied from a copyrighted work are unprotectable,” (emphasis in original) and;
- Even if individual, discrete forms of data that are available to the public are not trade secrets, large amounts of them, in aggregate, might be trade secrets. Take enough of them, through copying or web scraping, and that conduct might be considered theft or misappropriation of trade secrets.
The copyright holding likely has limited implications for future web-scraping cases; the trade secrets holding, if followed by other courts, could have significant ramifications.
At a more granular level, this is a complicated case with complicated facts, and the end result is, perhaps unsurprisingly, complicated new legal precedent. For those seeking to litigate against web scrapers, the court’s reasoning will fuel new types of arguments in copyright and trade secrets claims. For those seeking to defend against such claims, the highly fact-specific components may serve as a basis to try to limit the general applicability of the Court’s holding.
As is often the case in web scraping litigation, the dispute involves two business rivals. In this instance, the rivals are Florida companies whose principal business is to provide insurance quotes.
The facts of this case, at the highest level, are as follows:
Compulife maintains a database of insurance-premium information—called the “Transformative Database”—to which it sells access. The Transformative Database is valuable because it contains up-to-date information on many life insurers’ premium-rate tables and thus allows for simultaneous comparison of rates from dozens of providers. Most of Compulife’s customers are insurance agents who buy access to the database so that they can more easily provide reliable cost estimates to prospective policy purchasers. Although the Transformative Database is based on publicly available information—namely, individual insurers’ rate tables—it can’t be replicated without a specialized method and formula known only within Compulife.
The defendants in this case operate a website called “BeyondQuotes,” a competitor of Compulife’s service. At one point, the owners of this site hired a hacker to take Compulife’s data. Compulife claimed that the defendants didn’t even bother to produce their own quotes but simply reproduced Compulife’s data. That hack and resulting copying and competition led Compulife to file two separate lawsuits against the owners of BeyondQuotes. The two lawsuits were then consolidated at trial and remained so on appeal.
It is worth noting that this case does not include alleged violations of the Computer Fraud and Abuse Act (“CFAA”), the most commonly applied federal law to web scraping. Indeed, plaintiff refrained from pursuing many of the common claims that often appear in web scraping litigation, such as trespass to chattels, conversion, tortious interference with a contract, and unjust enrichment.
Instead, plaintiff’s case focused on four claims: 1) copyright infringement; 2) trade-secret misappropriation; 3) false advertising; and 4) violation of Florida’s Computer Abuse and Data Recovery Act (Florida’s state-law equivalent of the CFAA).
The case went to trial before a magistrate judge and plaintiff lost on all four claims.
Plaintiff then appealed to the Eleventh Circuit. On appeal, the Eleventh Circuit vacated the judgment on the copyright and trade secrets claims, upheld the judgment on the false advertising and Florida anti-hacking law claims, and remanded with instructions to make new findings of fact and conclusions of law on the copyright and trade secrets claims.
The copyright claim hinged on the Court’s “filtration” analysis to determine whether there was substantial similarity between the copyrighted work and the elements copied by the defendants.
The trial court determined that the plaintiff failed to prove substantial similarity. The Eleventh Circuit decided this was an error, not because the magistrate got the facts wrong, but because the trial court placed the burden on the plaintiff, as is standard and customary, rather than on the defendant. In the Eleventh Circuit’s view, it is the defendant who must prove that “the elements he copied from a copyrighted work are unprotectable.”
Ultimately, the Court felt that “placing the burden to prove protectability on the infringement plaintiff would unfairly require him to prove a negative.”
The Eleventh Circuit felt that defendants would be in a better position to provide evidence of unprotectability:
If, for instance, the defendant believes that some part of the copyrighted work is in the public domain, he must narrow the inquiry by indicating where in the public domain that portion of the work can be found. Similarly, if he thinks that what he copied amounts to usual industry practice, he must indicate the standards that dictate that technique. The plaintiff then faces the manageable task of “respond[ing],” to the appropriately narrowed issue. Placing the burden on the defendant, therefore, isn’t just consistent with our own precedent and leading scholarly commentary, but also fairer and more efficient (citations omitted).
For scrapers, it may seem scary to learn that there is now burden-shifting in the filtration analysis of copyright law (at least in the Eleventh Circuit). But there are a few key facts here that might lead us to believe that the applicability of this precedent could be limited. First, this is a case where the plaintiff had a registered copyright in its database. That’s not that common.
Second, because the database was the core of its business model, the plaintiff was unusually protective of its database. It was so protective of the database that it added a digital watermark to it, which appeared on the defendant’s website. So unique and special was the plaintiff’s database, that the court called it the “Transformative Database.”
Most databases aren’t copyrighted. Most aren’t that transformative. Most instances of copying aren’t quite so brazen as the defendants’ here. These were all material considerations driving the outcome. As such, most plaintiffs are unlikely to share the facts that drove this result and may not be able to successfully rely on this case as precedent (though I’m sure plenty will try).
Trade Secrets Claim
Unlike the narrowly applicable copyright holding, the Compulife Court’s trade secrets holding reaches more broadly. It is thus more likely to be revisited often and, particularly because of its view on public data, to invite scrutiny.
What’s controversial is its holding that taking publicly available data, in aggregate, can constitute trade secret misappropriation. This seems like a bizarre and counter-intuitive holding. How can publicly available data be a trade secret?
The trial court concluded that the Transformative Database was a trade secret, but that the defendant hadn’t misappropriated it. The trial court decided that since the trade secret was publicly available, the defendants had no duty not to use it.
The Eleventh Circuit then piggybacked on the more questionable part of the trial court’s opinion and then vacated what seemed like the more logical part.
Some would argue that having a trade secret that is freely accessible to the public runs counter to a basic understanding of trade secrets law. To maintain a trade secret, you have to take reasonable precautions to maintain its secrecy. A freely-accessible-to-the-public trade secret would thus seem like an oxymoron.
Under the Florida Uniform Trade Secrets Act (“FUTSA”), the only elements of trade secret misappropriation are 1) possession of a trade secret and 2) misappropriation. Since the defendants won at trial, they didn’t appeal the magistrate’s conclusion that the database was a trade secret. Plaintiff got the conclusion it wanted on the first element of the FUTSA claim, but appealed on the second element, misappropriation. Since there was no dispute that the database was a trade secret and that the defendants had come into possession of parts of it, the only remaining question on appeal was whether it had been misappropriated. This may have led the appellate court to layer one dubious holding on top of another.
Which brings us to how the Eleventh Circuit came to its final decision. In deciding that the trade secret had been misappropriated, the Eleventh Circuit focused on plaintiff’s allegation that the trade secret had been acquired through “improper means.”
According to the Court:
As used in FUTSA, “[i]mproper means” is defined to include “theft, bribery, misrepresentation, breach or inducement of a breach of a duty to maintain secrecy, or espionage through electronic or other means.” Id. § 688.002(1). In the law of trade secrets more generally, “theft, wiretapping, or even aerial reconnaissance” can constitute improper means, but “independent invention, accidental disclosure, or . . . reverse engineering” cannot. Actions may be “improper” for trade-secret purposes even if not independently unlawful. (Citations omitted)
The Court then goes on to compare the facts of this case to those in E. I. duPont deNemours & Co. v. Christopher, a 50-year-old trade secrets case from the 5th Circuit. 431 F.2d 1012, 1014 (5th Cir. 1970). In that case, the defendant took pictures of a methanol plant from an airplane and later used them to reverse-engineer the plant’s design. Though flying an airplane in public airspace isn’t illegal and taking pictures isn’t illegal, the court in Christopher decided that flying over a competitor’s building to take pictures to learn about their business was, in fact, grounds for a cause of action.
What does this have to do with web scraping?
Here’s what the Court said:
Although Compulife has plainly given the world implicit permission to access as many quotes as is humanly possible, a robot can collect more quotes than any human practicably could. So, while manually accessing quotes from Compulife’s database is unlikely ever to constitute improper means, using a bot to collect an otherwise infeasible amount of data may well be—in the same way that using aerial photography may be improper when a secret is exposed to view from above.
This is a dangerous precedent for future web scraping defendants. It reads like a potential indictment of the entire enterprise of web scraping. It seems to suggest that, in the context of accessing publicly available data, it may be illegal for a computer to access what is perfectly legal for a human to access. And that taking publicly available data, in large enough quantities, could be considered the theft or misappropriation of a trade secret.
The only question, then, is how much data is too much data. The Eleventh Circuit conveniently punted the question back to the magistrate:
Consider how broadly the magistrate judge’s reasoning would sweep. Even if Compulife had implemented a technological limit on how many quotes one person could obtain, and even if the defendants had taken all the data, rather than a subset of it, each quote would still be available to the public and therefore not entitled to protection individually. On the magistrate judge’s logic, Compulife couldn’t recover even in that circumstance, because even there—in the magistrate judge’s words—“any member of the public [could] visit the website of a Compulife customer to obtain a quote” with “no restriction” on the subsequent use of the quote….
The magistrate judge treated the wrong question as decisive—namely, whether the quotes taken were individually protectable. He left undecided the truly determinative questions: (1) whether the block of data that the defendants took was large enough to constitute appropriation of the Transformative Database itself, and (2) whether the means they employed were improper. Having found that the Transformative Database was protectable generally, the magistrate judge was not free simply to observe that the portions taken were not individually protectable trade secrets.
We express no opinion as to whether enough of the Transformative Database was taken to amount to an acquisition of the trade secret, nor do we opine as to whether the means were improper such that the acquisition or use of the quotes could amount to misappropriation. We merely clarify that the simple fact that the quotes taken were publicly available does not automatically resolve the question in the defendants’ favor. These issues must be addressed on remand.
The Court treats this as a slippery slope. But of course the slippery slope goes in both directions. Just as a bright-line rule permitting the scraping of publicly available data potentially leaves plaintiffs without much recourse in the event of a large taking, a fuzzy, ill-defined standard that casts doubt on the propriety of web scraping without further clarification leaves scrapers open to sweeping allegations of trade secret misappropriation. And since the determination of how much scraping is too much scraping is a question of fact, defendants may struggle to dispose of those allegations short of trial.
This case is a significant departure from most prior circuit court opinions on web scraping. It doesn’t involve the CFAA, it seems narrowly tailored to the facts of the case, and it hinges on novel, creative, and expansive interpretations of intellectual property law. And since it’s the first circuit court opinion in more than half a decade to expand, rather than limit, the scope of liability for web scrapers, it’s unlikely we’ve seen the last of it.
Case citation: Compulife Software Inc. v. Newman, Nos. 18-12004, 18-12007 (11th Cir. May 20, 2020)