Facebook’s Anti-Spam Filter Blocks Legitimate Conversations about Power.com

July 26, 2010 · by Eric Goldman · in Content Regulation, Domain Names, Privacy/Security, Spam


On Friday, Venkat and I posted about the latest ruling in Facebook v. Power.com. After Venkat or I make a blog post, I typically post the blog headline and URL to Twitter. I have enabled the app that makes my Twitter posts into my Facebook status reports as well, so the headline and URL on Twitter should automatically propagate to Facebook. On Friday, I tweeted the following:

“Blog Post: Important ruling on California’s anti-computer trespass statute–Facebook v. Power.com http://bit.ly/bM7hQT”

However, I noticed that the Twitter-to-Facebook app didn’t work properly and the headline didn’t appear. So I tried to manually enter the headline and URL and got this message from Facebook:

“This message contains blocked content that has previously been flagged as abusive or spammy. Let us know if you think this is an error.”

I do think that’s an error, and I reported the problem through Facebook’s automated reporting tool on Friday. Not surprisingly, I still haven’t gotten a response. But I was baffled as to how my headline and URL could have been “flagged as abusive or spammy.” Who flagged it? Why?

After a little more experimentation, I discovered that every instance of the character string “power.com” is blocked in Facebook. Every time I put “power.com” into my status reports or in comments to those status reports–even if it’s the only content in the post/comment–I get the “blocked content” message. However, the block is easily avoided; I can post “power . com” (notice the spaces before and after the period) just fine. Basically, Facebook is using a very dumb word filter.
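To make the observed behavior concrete, here is a minimal sketch of a filter that would behave this way. It is only an illustration: the list `BLOCKED_STRINGS` and the function `is_blocked` are hypothetical names, the case-insensitive match is an assumption, and nothing here is Facebook’s actual code.

```python
# Hypothetical blacklist; the single entry is the string discussed in this post.
BLOCKED_STRINGS = ["power.com"]


def is_blocked(message: str) -> bool:
    """Return True if any blacklisted string appears anywhere in the message.

    Assumption: matching is a plain, case-insensitive substring test, so the
    filter cannot tell a link from a plaintext mention of a company's name.
    """
    lowered = message.lower()
    return any(blocked in lowered for blocked in BLOCKED_STRINGS)


if __name__ == "__main__":
    # A plaintext mention of the company name trips the filter.
    print(is_blocked("Important ruling: Facebook v. Power.com"))
    # Spaces around the period break the substring match.
    print(is_blocked("power . com"))
```

A filter like this blocks the headline and lets the spaced-out version through, which matches what I saw.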

I emailed my PR contacts at Facebook about this. They pointed to their anti-spam filter and this blog post from June. The blog post explains that “we’ve been working to improve our warnings and make them more clear” and that “people misunderstand one of these systems. They incorrectly believe that Facebook is restricting speech because we’ve blocked them from posting a specific link.”

So this is where things have gone wrong. Facebook told me it has blocked Power.com because “we found that Power was spreading links to its pages in a way that violated our Statement of Rights and Responsibilities. For example, when a Power user accessed Facebook, Power would automatically create an event on Facebook (typically called ‘Power.com Party’ or something similar) without the person’s knowledge or permission. It would then send invitations to all of the user’s friends.” Fair enough, and I’m glad Facebook is trying to keep its system safe for users.

However, Facebook’s dumb word filter means that every reference to “power.com,” even if it’s in plaintext and not linkable, is treated as a link and therefore blocked as well. The error message then disparages the plaintext reference as “blocked content that has previously been flagged as abusive or spammy” when, in fact, what was flagged was a link to the URL, not the plaintext reference I made. So much for clearer error messages.

I pointed out to Facebook’s spokespeople the difference between a plaintext reference to a company’s name (“Power.com”) and a spammy URL/link. Their response? “Spammers turning their malicious urls into plain text is the oldest trick in the book. Not blocking all of the variations of a bad URL leaves a gaping hole.”

There is a kernel of truth to this, of course. A plaintext URL is not materially different from an active hypertext link–if the user chooses to cut-and-paste the link into the browser (or right-clicks on it, or whatever). However, Facebook’s method of blocking spammy links by blacklisting every instance of the character string actually has the effect of blocking *every* discussion of a blacklisted company with the name [noun].[tld]. Because the main word in the name is a common noun (e.g., “Power”), referencing the name without the TLD can lead to semantic ambiguity. However, the system prevents me from using the complete name (Power.com) because it can’t distinguish between a link and a plaintext reference to a company name that doubles as a URL. I received a private email reporting that another Facebook user encountered a similar block with the string seppukoo.com, the Facebook suicide tool.

In my case, the net consequence is that Facebook automatically blocks any conversations involving the string “power.com”–including the headline of my blog post–and provides an error message telling me that I am posting spammy/abusive content when I try to post it, which makes me feel like I did something wrong. With all of the bright engineers at Facebook, I bet they could figure out a way to more precisely tune the filter so that a plaintext reference to [noun].[tld] gets through while active links to that URL, or fuller plaintext URLs, remain blocked.
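As a rough sketch of what such tuning might look like, the rule below lets a bare [noun].[tld] mention through but still blocks the domain when it carries link-like markers: a scheme, a “www.” prefix, or a path. The names `BLOCKED_DOMAINS`, `build_pattern`, and `is_spammy_link` are hypothetical, the choice of markers is my assumption, and this is one possible heuristic rather than anything Facebook has described.

```python
import re

# Hypothetical blacklist of domains, using the two examples from this post.
BLOCKED_DOMAINS = ["power.com", "seppukoo.com"]


def build_pattern(domain: str) -> re.Pattern:
    """Build a regex that matches link-like uses of a domain only."""
    d = re.escape(domain)
    return re.compile(
        rf"https?://(?:[\w-]+\.)*{d}\b"  # scheme-prefixed, optional subdomains
        rf"|\bwww\.{d}\b"                # "www." prefix without a scheme
        rf"|\b{d}/\S+",                  # bare domain followed by a path
        re.IGNORECASE,
    )


PATTERNS = [build_pattern(domain) for domain in BLOCKED_DOMAINS]


def is_spammy_link(message: str) -> bool:
    """Return True only when a blacklisted domain appears in link-like form."""
    return any(pattern.search(message) for pattern in PATTERNS)


if __name__ == "__main__":
    for text in [
        "Important ruling: Facebook v. Power.com",  # bare name
        "http://www.power.com/party",               # full URL
        "Join the fun at www.power.com today",      # www-prefixed
        "power.com/party",                          # domain plus path
    ]:
        print(f"{text!r}: blocked={is_spammy_link(text)}")
```

The trade-off is the one Facebook’s spokespeople raised: a spammer could still post the bare domain and hope users type it into a browser. The point is only that a filter can draw the line somewhere short of blocking every mention of a company’s name.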

That is, assuming Facebook actually wants to enable Facebook users to talk about Power.com or Seppukoo.com or other enterprises that threaten the Facebook franchise. Frankly, I haven’t seen much evidence of Facebook’s interest in those conversations. In light of Power.com’s antitrust challenges against Facebook, the fact that Facebook’s system suppresses legitimate conversations about Power.com (whether it had a censorious intent or not) struck me as particularly noteworthy.
