Blog Content Aggregation, RSS Feeds and Copyright Law

At Search Engine Strategies, we discussed the problem of aggregators and spammers taking blog content and using it to build aggregation pages (with AdSense links, naturally) that compete with the source blog for traffic. In some cases, these aggregation pages present tightly focused content about high-value keywords.

In a perfect world of search, the search engines would be able to determine that these aggregation sites are spammers and block them accordingly. However, because the aggregation sites use legitimate content, the search engines can’t recognize them for what they are.

Indeed, a smart SEO can find a way to get these aggregation pages ranked pretty highly, even if they have no original content of their own. (At SES, there were some anecdotal stories that these aggregation sites were ranking higher than the source blog, definitely “diverting” traffic and presumably costing the source blog some money.) While these aggregation sites may have limited inbound links, and thus lower PageRank, PageRank is not the sole determinant of search engine placement. The highly focused aggregation sites overcome the lower PageRank for certain keywords by having the keyword in the title, good keyword density, smartly controlled anchor text (through a website controlled by the aggregator), etc.

With the search engines fairly powerless to automatically detect and accurately grade these aggregation sites, bloggers need to do their own enforcement work. Many bloggers never check whether their content is being recycled/stolen, so many aggregators get away with it without any consequence.
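For bloggers who do want to check, one rough way to see whether a suspect page has lifted a post is to compare overlapping word sequences ("shingles") between the two texts. The following is only a sketch under stated assumptions: the function names `shingles` and `copied_fraction` and the five-word window are illustrative choices, and fetching the suspect page's text is left to the reader.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts produce a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(post_text, suspect_text, size=5):
    """Fraction of the post's shingles that also appear in the suspect page."""
    post = shingles(post_text, size)
    if not post:
        return 0.0
    return len(post & shingles(suspect_text, size)) / len(post)
```

A value near 1.0 suggests the suspect page reproduces most of the post; a value near 0.0 suggests little verbatim overlap. What threshold counts as "recycled" is a judgment call, not something this sketch decides.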

If the bloggers do choose to pursue the aggregators, their principal tool will be copyright law. There’s no question that bloggers have copyright protection for most of their posts, and copyright is a powerful tool. For example, bloggers should be able to get the aggregators kicked out of the search engines or taken down by web hosts by sending DMCA takedown notices under 17 U.S.C. § 512(c)(3).

However, there’s an underlying problem. Many bloggers make feeds of their blog content available for all comers through RSS (a/k/a “Really Simple Stealing,” a phrase generally attributed to Jason Calacanis). This raises a question that has become the buzz among lawyers and search engine marketers: if a blogger makes a feed of his/her blog available, what can others do with that content?

In my mind, there’s no question that a blogger grants an implied license to the content in an RSS feed. However, because it’s implied, I’m just not sure of the license terms. So, in theory, it could be an implied license to permit aggregators to do whatever they want.

This may not be as ridiculous as it sounds. For example, I have no problem with Bloglines aggregating my feeds. Indeed, I think I have several dozen regular readers through Bloglines, so if I cut Bloglines off, I would lose a non-trivial number of my readers.

But if making an RSS feed available gives Bloglines an implied license to repurpose my blog content, how do I distinguish other aggregators? In other words, in aggregating my content into a web page, the spammer/aggregator is doing about the same thing as Bloglines. The only difference I can see is that Bloglines generally doesn’t make the copy of my blog content on its servers indexable by the search engines, although it might find other ways to make money off my content, such as serving ads.

The ads-on-third-party-blog-content issue was raised when Marty Schwimmer complained to Bloglines about having ads displayed on his content, and that led Bloglines to (I believe) block him and then take down the ads. (See a recap here). Fortunately, Marty’s blog is back in Bloglines, and I trust everyone reading my blog also reads his regularly.

It is trivial to destroy an implied license, so bloggers can overcome any aggregator use simply by saying so. I’m not sure WHERE the blogger would need to say this (by the “syndicate” link? in the xml feed itself?). Perhaps any disclosure in any reasonable place would be sufficient to destroy the implied license. I don’t think many bloggers are trying to destroy those implied licenses today, but that may be coming.
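On the "in the xml feed itself" option: RSS 2.0 defines an optional channel-level `<copyright>` element, which is one place a restriction could be stated. The sketch below adds or updates that element using Python's standard library. The function name `add_copyright_notice` and the notice wording are illustrative assumptions, and whether a notice placed there would actually be enough to destroy an implied license is exactly the open question discussed above.

```python
import xml.etree.ElementTree as ET


def add_copyright_notice(rss_xml, notice):
    """Return the RSS 2.0 document with its channel <copyright> set to notice."""
    root = ET.fromstring(rss_xml)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("not an RSS 2.0 document: no <channel> element")
    element = channel.find("copyright")
    if element is None:
        # No existing notice, so append one to the channel.
        element = ET.SubElement(channel, "copyright")
    element.text = notice
    return ET.tostring(root, encoding="unicode")
```

For example, a blogger might pass a notice such as "Feed provided for personal, non-commercial reading only; republication requires permission." Most blogging tools generate the feed automatically, so in practice the notice would more likely be set in the tool's feed template than by post-processing the XML.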

Meanwhile, if there truly is an implied license for aggregators, then technically sending DMCA takedown notices to the search engines would not be proper–at least, the blogger would need to notify the aggregator to destroy the implied license before a DMCA takedown notice would be proper.

The bloggers’ other main recourse would be copyright litigation. However, this is generally not a great option for bloggers for two reasons. First, it’s a hassle. Second, bloggers rarely if ever register their copyrights, or at least not early enough to obtain statutory damages and attorneys’ fees. (FWIW, I’ve never registered a copyright in my blog.)

This led to an interesting suggestion from Jeffrey Rohrs. It would be a pain for very active bloggers to constantly register the copyrights in their blog posts, so they would benefit if registration were easier. Fortunately, there’s a technological solution: bloggers already put out an RSS feed with all of the blog’s content. Maybe the Copyright Office could allow submissions of blog content for registration by reading an RSS feed.

In other words, the blogger would register the feed, and the Copyright Office could automatically pick it up regularly. This would be a cheap solution for the Copyright Office and would allow bloggers to regularly register their content. With regular registration, bloggers would be eligible for statutory damages and attorneys’ fees and would get a huge club to go after the aggregators.
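To make the idea concrete, here is a purely hypothetical sketch of what the "pick it up regularly" step might look like: each time the feed is read, any item not seen before becomes a deposit record with a content hash. Nothing here reflects an actual Copyright Office system; the function name `new_deposits`, the record fields, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import xml.etree.ElementTree as ET


def new_deposits(rss_xml, seen_ids):
    """Return deposit records for feed items whose ids are not in seen_ids.

    seen_ids is a set that is updated in place, so polling the same feed
    again yields only items published since the previous poll.
    """
    deposits = []
    for item in ET.fromstring(rss_xml).iter("item"):
        # Prefer the item's <guid>; fall back to its <link>.
        item_id = item.findtext("guid") or item.findtext("link")
        if not item_id or item_id in seen_ids:
            continue
        body = (item.findtext("title") or "") + "\n" + (item.findtext("description") or "")
        deposits.append({
            "id": item_id,
            "sha256": hashlib.sha256(body.encode("utf-8")).hexdigest(),
        })
        seen_ids.add(item_id)
    return deposits
```

One limitation worth noting: many feeds carry only excerpts or only the most recent posts, so a scheme like this would capture whatever the feed exposes, which may be less than the full blog.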

I think the Schwimmer/Bloglines fracas was a very minor skirmish in a much bigger battle against aggregators that has yet to play out. It was minor for two reasons: first, Bloglines backed down, and second, Bloglines is not as pernicious as the spammer/aggregators.

I’m pretty sure we’re going to see some big/high-stakes battles over the scope of any implied licenses in RSS feeds, and I honestly can’t predict where the lines will be drawn. In the interim, I suggest that bloggers who care put some restrictions near their “syndicate” link, add some restrictions to the RSS feed, register their copyrights and use the DMCA notice-and-takedown procedures when appropriate.