Search Engines and Privacy…AGAIN?!

and the Associated Press both ran stories last week about how Google aggregates user data in ways that theoretically threaten privacy.

Hmm…this sounds familiar…haven’t we heard this story before? Yes, only about a thousand times. Danny Sullivan asks why we obsess about Google and privacy and ignore how other search engines (such as Yahoo) also have rich databases of potentially equal magnitude.

Indeed, I was going through my notes over the weekend and came across this March 2005 AP article fretting about how Amazon might use its customer database. The search-engines-and-privacy story seems to cycle endlessly through the press, pretty much every time a search engine adds a new feature that uses personal data. (I won’t even revisit the mind-numbing press about Gmail from last year.)

I offer three propositions about search engines and privacy:

1) Search engine databases can be accessed by government agencies through legal processes. In rare cases, other private parties could use a legal process to access information in these databases too. Search engines are not alone in this regard; any business that has personal information about its customers is susceptible to these legal processes as well. It’s true that search engines have particularly interesting/rich data, but plenty of other vendors have interesting data too.

So search engines aren’t the problem; the problem is government snooping. As a result, perhaps new legislation would be appropriate to raise the bar on when the government can tap into search engine databases (a little like the “Bork bill” that raised the bar for accessing video rental histories).

2) Search engine databases are a tempting target for hackers. This is true, but once again, search engines are not unique in this regard. Every business that maintains personal data about its customers is a hacker’s target. As a result, we need businesses to take prudent actions to prevent hacking, and we need government enforcement against illegal hacks. Nothing new here on any front.

3) Search engines will need to obtain and use personal data to reach the next rung of delivering relevant results. Right now, the biggest limitation on search engines is that they use a “one-size-fits-all” relevancy algorithm, designed to satisfy majority interests rather than personalized to each person’s interests. Google has done a remarkable job with relevancy using a one-size-fits-all algorithm, but it (and its competitors) will make quantum improvements in relevancy when they personalize searches. To personalize searches and really give searchers what they want, search engines will need to collect and use rich personalized datasets. This is a good thing for searchers.

Thus, from my perspective, social welfare will improve as search engines personalize their results. I can’t wait for Google and other search engines to start reading my mind (as opposed to making guesses about majority interests). Let’s hope that the constant whining/scaremongering about search engines and privacy doesn’t delay us in getting there.

Prior blog post on this topic.

UPDATE: Google has blacklisted reporters for one year because of the story linked to above.