One of the growing complaints in the blogosphere is the rising threat of blog spam. Blog spam normally appears in two forms: comment spam and content spam. Comment spam is much like email spam: it comes from outside sources and tries to trick people into clicking through.
However, the other type of blog spam, which seems to be growing more prevalent, consists of blogs filled with content spam designed to game search engines for ranking benefits. These blogs are normally user-created, although automated content scraping seems to be the favored method of populating them. Some feel that blog spam is becoming such a problem (see Mark Cuban) that companies owning or offering blog services should take steps to prevent its spread.
To see just how widespread blog spam is becoming, Philipp Lenssen of Google Blogoscoped tested Google-owned Blogger.com to gather a cross-section of data showing how many Blogspot-hosted blogs are actually spam blogs. To conduct this test, Philipp used the following URL:
www.blogger.com/redirect/next_blog.pyra?navBar=true
Whenever this URL is entered into the browser, the user is taken to a random Blogspot blog, which changes each time the URL is entered. This method provided Philipp with enough random sites to form an observation about Google’s blog service. To get a fair estimate, Philipp selected 50 random blogs (using the URL method) and annotated their contents. Of the 50 blogs tested, Philipp found that 60% (30 out of 50) were spam blogs. This certainly helps support Cuban’s thoughts about Google’s service being the leading provider of blog spam, an observation that undoubtedly pleases Google to the core.
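For readers who want to picture the sampling step, here is a minimal Python sketch of how such a sample might be gathered. It is not Philipp’s actual procedure, which was done by hand in a browser. The `http://` scheme, the function names `resolve_random_blog` and `collect_sample`, and the skipping of repeated blogs are all assumptions made for illustration, and the redirect URL may no longer behave as described.

```python
from urllib.request import urlopen

# URL quoted in the article; the "http://" scheme is an assumption.
NEXT_BLOG_URL = "http://www.blogger.com/redirect/next_blog.pyra?navBar=true"


def resolve_random_blog(url=NEXT_BLOG_URL, timeout=10):
    """Follow the redirect and return the address of the page it lands on."""
    with urlopen(url, timeout=timeout) as response:
        return response.geturl()


def collect_sample(size, resolve=resolve_random_blog, max_attempts=None):
    """Collect up to `size` distinct blog addresses by calling `resolve`.

    Skipping duplicates is an assumption; the article does not say how
    repeated blogs were handled.
    """
    if max_attempts is None:
        max_attempts = size * 10
    seen = []
    for _ in range(max_attempts):
        if len(seen) == size:
            break
        address = resolve()
        if address not in seen:
            seen.append(address)
    return seen
```

Calling `collect_sample(50)` would attempt to gather 50 addresses for manual review. Deciding whether each blog is spam remains a human judgment, as it was in the original test.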
It is important to note that a sample of 50 blogs, 30 of them spam, is small, so these numbers are not necessarily a true reflection of all Blogger content. If Philipp’s observations were extrapolated across all of the Blogger-hosted pages, roughly 4.5 million pages out of 7.5 million could be categorized as blog spam. Whatever the numbers, Philipp’s test shows that Google’s adoption of CAPTCHAs as a precaution comes at a crucial juncture.
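To make the caveat about sample size concrete, the sketch below works through the arithmetic in Python. It takes the article’s figures (30 spam blogs out of 50, and 7.5 million pages) and adds a 95% Wilson score interval. The interval is not part of Philipp’s test, and it assumes the 50 blogs were a simple random sample. The function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Return the Wilson score interval for a proportion (z=1.96 is ~95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Figures taken from the article.
spam_blogs = 30
sample_size = 50
total_pages = 7_500_000

share = spam_blogs / sample_size
low, high = wilson_interval(spam_blogs, sample_size)

print(f"Observed spam share: {share:.0%}")
print(f"Point estimate: {share * total_pages:,.0f} pages")
print(f"Plausible range: {low * total_pages:,.0f} to {high * total_pages:,.0f} pages")
```

Under these assumptions the point estimate is 60% of 7.5 million, or 4.5 million pages, while the interval runs from roughly 46% to 72%. A sample of 50 leaves the true figure quite uncertain.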
Read more about Philipp Lenssen’s blog spam test.
Chris Richardson is a search engine writer and editor for Murdok.