My Comments on Google Print

By Eric Goldman

I have been thinking a lot about Google Print, Google’s plan to scan in and index all of the books of some of the largest libraries in the world. This program has some obvious benefits to society; so much good content is “invisible” to the world because it’s locked in a dead trees delivery mechanism, and the search costs of finding that information overwhelm the value of doing so. With Google Print, a lot of the world’s knowledge will become newly discoverable by a large part of society.

At the same time, there’s a small little problem–copyright law. Google Print requires lots of copies to be made, and those copies may have legal significance under copyright law. Google Print displays only a small portion of text in each search result, but this small portion could still be copyright-protected, and certainly lots of copies of copyrighted works are made prior to the display of that small portion.

If Google restricted itself just to works in the public domain, then there would be no copyright problem; but Google has much more ambitious plans to get every book it can find, copyright protected or not. To “allay” publisher concerns, Google has made a public offer to publishers that they can opt out of Google Print–instead of limiting itself to publishers who opt in or otherwise going through the cumbersome steps of getting publisher permission. Needless to say, publishers have not been thrilled by the offer that they can “opt-out” when they are still wondering what permits Google to launch this program at all.

My thoughts on this subject are a little jumbled, in part because my normative biases are in conflict with my dispassionate legal assessment. My heart says Google Print is great and therefore we should interpret copyright law in a way to permit it. Unfortunately, my head says that this is highly suspicious under most readings of copyright law.

While I’ve been working through my internal struggles and deferring a blog post until I resolved my thoughts, William Patry posted on the subject. His post is brilliant, and it says everything I was thinking in a far better way than I could say myself. He says sharply:

“The chutzpadik manner in which Google has gone about this is breathtaking, and indeed what they have done so far is, in my opinion, already infringing, that is the copying of the books even without making them available.”

Meanwhile, Patry takes a position directly contrary to Jonathan Band’s recent analysis/advocacy document, where Jonathan writes:

“The Print Library Project is similar to the everyday activities of Internet search engines….As a practical matter, each of the major search engine companies copies a large (and increasing) percentage of the entire World Wide Web every few weeks to keep the database current and comprehensive….Significantly, the search engines conduct this vast amount of copying without the express permission of the website authors. Rather, the search engine firms believe that the fair use doctrine permits their activities. In other words, the billions of dollars of market capital represented by the search engine companies are based primarily on the fair use doctrine.”

Band’s response is problematic, isn’t it? One way of interpreting Jonathan’s response is: hey, the search engines steal daily, so what’s a little more stealing?

More problematically, this passage reveals the underlying weakness of the current search engine model. It contravenes my first rule of business ventures–NEVER BUILD A BUSINESS ON FAIR USE–and exposes that we do not have very good legal precedent validating the practice. Indeed, consider the case law validating the search engine practices:

* Kelly v. Arriba. The Ninth Circuit found that displaying thumbnail versions of photographs was fair use. The Ninth Circuit also originally found that displaying the full-size versions of the photos would not be fair use, although the Ninth Circuit (18 months later) realized that it had ruled on a question that neither party had litigated, and it withdrew that part of the opinion. The resulting mess of the case is so confusing and questionable as precedent that I don’t teach the case in Cyberlaw.

* Ticketmaster v. Tickets.com. This case specifically validated the practice of using robots to collect content over the Internet and present some of that content as search results. However, the precedent is limited by the fact that Tickets.com presented only unprotectible facts in its “search results”–this may clearly distinguish search engines, especially with features like their “cache” where the search engines present 100% of the underlying content.

Consider UMG v. MP3.com, a case that Patry notes. There, MP3.com copied all of the contents of various copyrighted CDs and threw them onto servers. Customers could access songs only if they could establish that they owned the CDs. MP3.com styled this as a way of space-shifting. The resulting legal opinion isn’t pretty for MP3.com. You know it’s going to be bad when the opinion starts off:

“The complex marvels of cyberspatial communication may create difficult legal issues; but not in this case. Defendant’s infringement of plaintiff’s copyrights is clear.”

Meanwhile, we have a number of lawsuits that attack search engine practices squarely, including Perfect 10 v. Google, Perfect 10 v. Amazon, and AFP v. Google. I remain nervous that these cases may very well expose the cracks in the search engines’ legal foundation. If these cases don’t, then I think some future plaintiffs eventually will, and the search engines will have to run to Congress to solidify their position. (Fortunately, Google now has the money to spend like a high roller in DC, and that goes a long way.)

As in many other situations in copyright law, there’s no question that I wish copyright law did not stand in the way of Google Print (or many of the other great services that Google offers, including Google Images, AutoLink and Google News, that I use daily but that are also litigation bait). But I remain very dubious of commentary about copyright law that is influenced by the commentator’s normative views.

In particular, I remain very skeptical of anyone who prospectively declares a business venture to be well-protected under fair use. Even the smartest copyright lawyers in the world (and Patry and Band certainly are contenders for that status) simply have no way of predicting fair use in advance.

Therefore, given the shaky basis for its efforts, I think Google would be well-advised to adopt a more conciliatory approach to Google Print. Personally, I think Google would succeed if initially it simply limited itself to public domain works plus those of publishers who opt in. After a few months of operation, Google will then have data to show how being included in Google Print can drive sales. At that point, I bet every publisher will line up at Google’s door to opt in (and, in fact, would gladly pay for inclusion). Because Google can get to exactly the same place with a velvet glove rather than guns ablazin’, I’m surprised Google is still fighting the hard fight.

UPDATE: There have been a ton of articles on Google Print. Today’s salvo is from Anick at AP. The story says: “Jim Gerber, Google’s director of content partnerships, says the company would get no more than 15 percent of all books ever published if it relied solely on publisher submissions.” I’d like to see some backup on this, and I further wonder how many of the unsubmitted books are going to be searched for–and how many of these books could be transacted in the marketplace.

UPDATE 2: CE Petit points out, correctly, that there may be circumstances where the publishers do not have the rights from the authors to opt in to Google Print.

UPDATE 3: Like we couldn’t see this coming. As anticipated, Google has been sued. The complaint. Plaintiffs’ press release. Google’s response. AP and Reuters stories. My advice to Google: stand down (for more on this, see Mike Madison’s post and my comment to it).

UPDATE 4: IPTABlog does a good recap of the discussion.