January 16, 2007
Haifa University Search Engine Conference Recap
On December 21, Haifa University Faculty of Law conducted an interesting cross-disciplinary and international inquiry into the law of search engines, with a whirlwind tour of about two dozen 10-minute presentations over a long 11-hour day.
In my opinion, our inquiry was hindered by a lack of a clear definition of “search engines.” One speaker told us to give up the pretense that we were discussing search engines generally and instead admit that Google was the real subject of the conference. While Google may dominate international markets even more than it dominates the US market, focusing the discussion on just Google does not advance the discussion much. With the portalization of Google, it was never clear if speakers were focusing on the paradigmatic keyword searches of a robotically generated content database or the broader portalized services.
This raises a fundamental question that never got adequately answered at the conference: can there ever be a specialized law of search engines if we can’t define search engines? Consistent with my usual skepticism, I wonder if search engine exceptionalism helps or hinders the discussion.
Speakers were allowed to select their own topics under extremely broad headings, and this left a number of topics virtually unaddressed: click fraud, trespass to chattels, GEICO, Rescuecom and 1-800 Contacts were mentioned only in passing or not at all. The fact that we could go almost 11 hours without reaching these topics shows the richness of search engine law.
Keep reading for my notes about each speaker (see also Stefan Bechtold’s recap of the conference):
(Note: if these squibs seem, at times, brief and general, recall the format of 10-minute presentations).
Niva Elkin-Koren (Haifa University Faculty of Law), How Search Engines Design the Public Sphere. She argued that search engines resemble other mass media players in their impact on public discourse.
Karine Barzilai-Nahon (University of Washington Information Sciences Department). She talked about how search engines act as gatekeepers, but their gatekeeping role constantly changes. She also argued that search engines do not have a monolithic approach to dealing with stakeholders.
Sheizaf Rafaeli (Haifa University Graduate School of Business). He noted five trends: (1) we are moving from one-way communications to two-way communications, (2) interactive communications v. reactive communications, (3) broadcasting v. narrowcasting and the rise of the Long Tail, (4) linear communications v. non-linear communications, and (5) disintermediation v. reintermediation. He drew three conclusions: (1) reintermediation creates a target for regulators, (2) advertising business models signal the end of a “free lunch,” and (3) concern about SEOs tricking search engines.
Judit Bar-Ilan (Bar-Ilan University Information Sciences Department), PageRank v. PeopleRank. She discussed the phenomenon of Googlebombing. Her conclusion is that individual actors can’t manipulate rankings much, but coordinated groups can. She thinks search engines should give lower weight to blog links accordingly.
Nico van Eijk (University of Amsterdam Institute for Information Law), Search Engines and Freedom of Expression. He discussed the difficulty of classifying search engines because of their common attributes with both telecommunications services and information services. Because they fit between these two regulatory classifications, he said that they are “lost in law.”
Einat Amitay (IBM Research Center, Haifa), Queries as Anchors. She discussed her efforts to optimize IBM’s intranet search function. Her approach was to look at query reformulations (she said 30% of the intranet’s searches are reformulations) and treat the final selected link as the “right” result and the rejected searches as the wrong ones. Thus, if a later searcher submitted one of the reformulated queries, the search results could also contain the “right” result as evidenced by the prior searcher’s behavior. She said that this approach improved the delivery of the right result the first time from 50% to 71%.
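Einat's reformulation-mining approach, as I understood it, can be sketched roughly as follows. This is my own illustrative reconstruction, not IBM's code; the class and method names are invented:

```python
from collections import defaultdict

class QueryAnchorIndex:
    """Maps every query in a reformulation session to the result the
    searcher finally clicked -- the "right" result -- so that a later
    searcher who submits any of those queries can be shown it."""

    def __init__(self):
        # query -> {clicked_url -> how many sessions confirmed it}
        self.anchors = defaultdict(lambda: defaultdict(int))

    def record_session(self, reformulations, clicked_url):
        # Credit the finally-clicked URL to every query the user tried
        # along the way, not just the last reformulation.
        for query in reformulations:
            self.anchors[query][clicked_url] += 1

    def boost_candidates(self, query):
        # Return previously confirmed "right" results for this query,
        # most frequently confirmed first.
        hits = self.anchors.get(query, {})
        return sorted(hits, key=hits.get, reverse=True)
```

A session like `record_session(["ibm benefits", "employee benefits portal"], "http://w3/benefits")` then lets `boost_candidates("ibm benefits")` surface the benefits page for the next searcher who types the earlier, "wrong" query.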
Bracha Shapira (Ben-Gurion University Information Science Department), Social Search Engines. She discussed her group’s experimental search engine “MarCol” that attempts to take advantage of crowd wisdom and that varies results based on people we trust. [Eric’s note: this sounds very much like how we sorted reviews at Epinions!]
Shulamit Almog (Haifa University Faculty of Law). She discussed whether search engines change the way we write as lawyers.
Helen Nissenbaum (NYU Culture and Communications Department), TrackMeNot. Easily the most contentious talk of the day! Helen discussed TrackMeNot, a software application that she developed with an NYU graduate student. The software automatically sends random queries to search engines so that any search engine tracking the user would log a mix of legitimate and bogus search queries. She said that it has been downloaded 140,000 times, but this number reflects multiple downloads by the same users getting new releases. She explained how the software reflected “Values in Design”:
1) User control. The software is completely under the user’s control.
2) Usability. The program is lightweight, easy to use, and configurable.
3) Privacy by obfuscation. Rather than trying to hide searches, everything is in plain view. Thus, the software works so long as the search engines can’t tell who is using it.
Does it work? Helen doesn’t know. She doesn’t think it will protect a user from a targeted investigation, but it might help prevent search engines from building customized profiles and from piecing together search results and deducing identity.
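To make the obfuscation idea concrete, here is a minimal sketch of the mechanism: interleave a user's real queries with randomly generated decoys so the server log no longer reflects the user's true interests. The word list, decoy rate, and scheduling below are all invented for illustration; the real extension draws phrases from sources like RSS feeds rather than a fixed vocabulary:

```python
import random

# Illustrative decoy vocabulary (the real tool uses dynamic sources).
SEED_TERMS = ["weather", "recipes", "football", "holidays",
              "museums", "trains", "movies"]

def make_decoy_query(rng, n_terms=2):
    """Build one bogus query from distinct seed-vocabulary terms."""
    return " ".join(rng.sample(SEED_TERMS, n_terms))

def obfuscated_log(real_queries, decoys_per_real=3, seed=0):
    """Simulate what the search engine's log would contain: each real
    query arrives mixed with several decoys, in shuffled order."""
    rng = random.Random(seed)
    log = list(real_queries)
    for _ in range(len(real_queries) * decoys_per_real):
        log.append(make_decoy_query(rng))
    rng.shuffle(log)
    return log
```

The privacy claim rests on the decoys being statistically indistinguishable from real queries; a fixed seed vocabulary like this one would obviously fail that test, which is one reason the question "does it work?" is hard.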
Helen’s talk exposed a divide between the IR people and the privacy advocates. Privacy advocates view the software as a great way to restore user control over their information. Meanwhile, the IR people believe in the power of databases, and thus junk data in the databases hurts their ability to accomplish their goals. (Nimrod called it “pollution,” which sparked some rebukes for his value-loaded choice of terms, and Daphne said that it creates a tragedy of the commons). Einat suggested that just like websites can use a robots.txt file, searchers should be able to use a users.txt file to tell search engines if they don’t want to be tracked. [Eric’s comment: I think this is just a variation on P3P, which was a huge failure, but at least Einat’s suggestion would allow users to opt-out without destroying the server logs for everyone else.]
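Einat's users.txt idea was offered only as an analogy, so any concrete syntax is speculation. Purely for illustration, the pairing might look something like this (the users.txt directives below are entirely invented):

```
# robots.txt (real, standardized): a site tells crawlers what to skip
User-agent: *
Disallow: /private/

# users.txt (hypothetical, per Einat's suggestion): a searcher tells
# search engines not to track or profile her queries
Track: no
Retain-logs: no
```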
Personally, I am not a fan of TrackMeNot. I called it a spambot because the software basically sends spam to the search engine databases (a later commenter said that my calling Helen a spammer was like calling her a terrorist--I was sure we were going to fulfill Godwin’s Law at that point). I believe TrackMeNot creates at least 2 negative externalities: (1) it consumes bandwidth and server resources, thus imposing costs on a variety of IAPs and the search engines, and (2) by preventing the search engines from being able to create profiles for anyone, it hurts those searchers who would derive value from search engines’ ability to rely on their server logs to improve the service.
However, even if the software creates negative externalities, that doesn’t mean that it’s a bad idea. Helen would respond that search engines impose costs on searchers by undercutting their privacy. I don’t agree with this critique, and I know search engines wouldn’t agree either. For example, Google would point out that users can opt-in to Google’s personalized search, and if they don’t, Google’s ability to profile them is very limited. However, my main beef is that I’m not clear how the Values in Design theory justifies correcting one set of perceived costs (the perceived degradation of privacy) by creating new costs (wasted packet processing and degradation of server logs as a reliable dataset for those searchers who want search engines to improve the service for them).
Nimrod Kozlovski (University of Tel Aviv), Proactive Policing—A New (Legal) Role for Search Engines. He discussed changes in how behavior is policed. Policing used to be reactive, but now it’s becoming proactive through online techniques like entrapment, traffic analysis for crime control, prediction of abnormal behavior and criminal monitoring. Search engines are well-positioned to aid policing—they can block websites, block terms, provide security alerts of risky sites, and monitor searches. He wonders about the implications of turning search engines into private police.
Itai Levitan (Easynet) gave a brief tutorial on SEM/SEO.
Daphne Raban (Haifa University Graduate School of Business), Social and Economic Incentive: the Google Answers Information Market. She was conducting an interesting analysis of profit-maximizing behavior in Google Answers before Google pulled the plug on the service.
Ariel Katz (University of Toronto Faculty of Law), Search Engines, Copyright and the Limits of Fair Use. He discussed the possibility that collective rights organizations could reduce the negotiation costs between search engines and rights owners. If search engines predicate their infringement defenses on high negotiation costs, CROs could eliminate that defense. However, he thinks fair use is a better policy approach than CROs because of the social value that search engines generate by reducing “information overload externalities” as discussed by Frank Pasquale.
Irit Haviv-Segal (University of Tel Aviv Faculty of Law). She discussed the history of Lexis and Westlaw and the future of legal information databases. This talk sparked a lot of conversation. IR folks love to talk about the merits/demerits of structured search databases like Lexis/Westlaw vs. loosely structured databases like Google.
Joris van Hoboken (University of Amsterdam Institute for Information Law), Search Engine Neutrality. He sketched out both sides of the debate about search engines’ roles as media: their role in providing access to harmful content, their possible biases in sorting content, the possible “winner take all” effect of search results placement, and whether there are enough checks and balances (law, technology, market forces) to hold search engines accountable for their decisions.
Ziv Bar-Yossef (Technion Electrical Engineering Department and Google Israel Research Labs), External Evaluation of Search Engines via Random Sampling. He discussed research (before he joined Google) into the database size wars. Size matters because it indicates comprehensiveness, the ability to serve narrow topic queries, and prestige. But, size isn’t the only measure of quality; others include freshness, topical bias, presence of spam results and inclusion of unsafe results (i.e., clicking on them takes users to viruses). However, there is no widely accepted benchmark for measuring search quality.
According to his study from last year, if Google’s database size = 1, Yahoo’s was a 1.28 and MSN was (if I recall correctly) 0.78. Further, Google linked to approximately 2% dead pages, while Yahoo and MSN linked to about 0.5% dead pages. Based on these numbers, a possible inference is that Google’s database is smaller and more junky than Yahoo’s. However, Ziv did not attempt to exclude spam pages from the study, so another inference is that Yahoo does a poorer job screening out spam pages.
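Ziv's actual technique involves careful random sampling that I won't attempt to reproduce, but a classic back-of-envelope estimator from this literature (the Bharat-Broder overlap method, which is a different and simpler approach) conveys how relative sizes like "Yahoo = 1.28" can be computed without either engine disclosing its index size:

```python
def relative_size(sample_a, sample_b, index_a, index_b):
    """Estimate |A|/|B| from (ideally uniform) page samples of each
    index, using the identity |A|/|B| = Pr[random B-page is in A]
                                      / Pr[random A-page is in B].
    Assumes the indexes overlap (nonzero denominator)."""
    frac_b_in_a = sum(1 for p in sample_b if p in index_a) / len(sample_b)
    frac_a_in_b = sum(1 for p in sample_a if p in index_b) / len(sample_a)
    return frac_b_in_a / frac_a_in_b
```

With toy indexes where B holds 200 pages, A holds 100, and they share 50, the estimator returns 0.5, matching the true ratio; the hard research problem, and the subject of Ziv's work, is drawing samples that are actually uniform.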
Tal Zarsky (Haifa University Faculty of Law), Search Engines and Media Concentration Policy. He asked two interrelated questions: do search engines moot media regulation, and should media regulation policy apply to search engines? This led to a series of interesting questions, including:
• Do search engines promote diversity, competition and localism?
• Do people really substitute the Internet for other media?
• Even if they do, is there concentration within the Internet itself?
• What other information retrieval methods do online searchers use?
Avishalom Tor (Haifa University Faculty of Law). Search engines accomplish pro-competitive objectives: they increase the flow of information and expand advertising options. But is there only good news? From an antitrust perspective, they raise a number of issues. What is the relevant market--End users? Information publishers? Advertisers? Search engines also pose a regulatory challenge because the markets change so rapidly. For example, will search engines blur into the browser market? He asked a really interesting and provocative question: if Google and Microsoft wanted to merge, would this be a problem, and if so, in what market?
Charlotte Waelde (University of Edinburgh School of Law), Search Engines and Litigation Strategies. She asked whether it’s more accurate to think of search engines as free riders or fair innovators. Her two paradigmatic examples were Google Book Search and keyword ad triggering. She raised concerns about the implications of transborder search engines and differences in legal rights. For example, assume that displaying book snippets in Scotland is not an infringement, and the book scanning takes place in the US and is deemed fair use. In that case, would Google be free from liability in Scotland, even if the book scanning wouldn’t be fair use/fair dealing if performed in Scotland? [Eric’s note: I just picked Scotland as an example] She wonders if this will lead to regulatory competition and forum shopping by search engines.
I’ve given up tracking international keyword cases (too many of them, and too confusing), but she mentioned two interesting points on this front. First, she said that defense and plaintiff wins in keyword cases are running 50/50 in Germany; it appears that courts just can’t seem to agree anywhere in the world. She also indicated that a French court apparently has issued an order expecting Google to proactively inform AdWords advertisers which keywords are protected by trademark law. This would be an interesting development—right now, Google’s Sandbox (at least, in the US) already prompts advertisers to buy competitors’ trademarks without any warning. Yet, in theory, Google could integrate the registered trademark database into its ad placement tools.
Stefan Bechtold (Max Planck Institute for Research on Collective Goods) discussed search engines and opt-out. He said that one could style the opt-out approach as a weakening of property rights, but it’s unclear if this is good or bad. With respect to Google Book Search, he asked whether the best option is for a private intermediary to offer it, or maybe it should be offered by the government or by the copyright owners themselves. He observed that copyright law struggles with emerging intermediaries that use unusual revenue streams. He also observed that search could become more decentralized through search models like P2P file-sharing. This would increase the transaction costs of rights negotiations because there are more players.
Michael Geist (University of Ottawa Faculty of Law) discussed how search engines promote access, and how legal doctrines can protect search engines for doing so (230, 512, cases like Kelly v. Arriba and Field v. Google). However, increased access has unintended consequences. First, search engines increase the transparent society, and second, search engines capture search queries, which are the “database of intentions.” He raised three concerns about the limits of legal regulation. (1) a privacy protection regime focused on protecting just personal information may not go far enough because otherwise-anonymous data bits can be strung together. (2) In an opt-in/opt-out approach, consent may not be clear enough. (3) In some cases, simply creating liability may not go far enough.
David Gilat (practicing lawyer at Reinhold Cohn Group), Trademark Protection and Keyword-Triggered Advertising. He discussed Matim Li v. Crazy Line Ltd., a July 2006 Israel district court opinion holding that Google Israel isn’t liable for keyword triggering. He thinks that even though we can develop tools to enable searching, it doesn’t mean we should use them. He also was concerned that liberalized trademark rules in cyberspace may migrate back to liberalize trademark rules in physical space. [Eric’s comment: I kept my mouth shut on all of this. I’ve never found much upside from trying to correct the multitude of erroneous assumptions made by plaintiff-oriented trademark lawyers.]
Eric Goldman (Santa Clara University School of Law), Search Engines and Transaction Costs. As regular readers of this blog know, there are so many interesting angles of search engine law that it was hard to pick just one. I could have done a recap of my Deregulating Relevancy article to explain what we know about searcher behavior and how this undercuts the emerging trademark doctrines in cyberspace. Or, I could have done a more practitioner-oriented analysis of the latest in keyword law. (It turns out that my audience consisted heavily of students, so they may have actually preferred that). Or, given that click fraud rarely got discussed at the event, I could have presented a click fraud recap.
Instead, I decided to do a topic that was a little out of my normal schtick because of the general theme of search engines and IP law. I thought it would be worth reconsidering search engines’ general approach that they can do whatever they want so long as they provide an opt-out. I find this argument fascinating because it’s difficult to identify other IP doctrines where a secondary user can mitigate liability by providing an ex post cessation of activity.
So this raises a basic question: if there aren’t good precedents, is there some reason that we should create search engine exceptionalism? To consider this, I ran through a typical but irresolute Coasean analysis to show that we might conclude that search engines must live in an opt-in regime across all of their IP usage—even if this means that search engines ultimately cannot be profitable, in which case there really wouldn’t be an economically viable niche for search engines after all costs are considered. Everything in my gut tells me this isn’t the right outcome because of the positive spillovers that search engines have on consumer search costs. However, unless search engines are truly unique in this respect, maybe the conclusion that secondary users should be allowed to mitigate liability through an opt-out regime should not be limited just to search engines, but might apply more broadly to other secondary users as well.
Tal correctly pointed out that property boundaries are inherently elastic, so even if there is a lack of precedent for opt-out regimes, that doesn’t mean much. Eli Salzberger (Dean of the Haifa University Faculty of Law) also noted his article with Niva that says the Coase Theorem doesn’t adequately account for the change in respective avoidance costs that will occur in the future as technology changes (also correct).
Conclusion. The conference website has more details about the event, including PDFs of a number of the presentations.
Posted by Eric at January 16, 2007 11:50 AM | Search Engines