An Interesting Post
Apropos of the previous post (on Darwinian adaptation among malware), the article itself attracted one of those keyword-matching comments from an apparent spamblog (somewhat different from straightforward splogs). I had not heard of these until I began operating this blog in other than stealth mode, so here’s how I infer they function just by observation:
- A new post is scanned, either via its feed or one of the aggregator services like Technorati, looking for certain keywords.
- A corresponding post is created on the spamblog with a generic blurb like “[author] had an interesting post about [keyword]” and a short 1- or 2-line excerpt centering on the keyword match.
- A comment is submitted to the originating blog, linking back to the spamblog.
- The spamblog post can then attract traffic either through clickthroughs from the comment thread or through a higher Google PageRank, since the spamblog gradually grows its network of keyword-linked sites.
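The first two steps above can be sketched in a few lines of Python. This is only a guess at the mechanics, inferred from the same outside observation as the list; the function names, the excerpt radius, and the any-order word matching are all assumptions made for illustration, not anything taken from an actual spamblog.

```python
import re

def matches_keyword(text, keyword):
    """True if every word of the keyword phrase occurs somewhere in the text.

    Assumption: the words may appear in any order and need not be adjacent.
    """
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return all(term in words for term in keyword.lower().split())

def excerpt_around(text, term, radius=40):
    """Return a short excerpt centered on the first occurrence of term, or None."""
    pos = text.lower().find(term.lower())
    if pos == -1:
        return None
    start = max(0, pos - radius)
    end = min(len(text), pos + len(term) + radius)
    return text[start:end].strip()

def make_blurb(author, keyword, text):
    """Build the generic blurb for a matching post, or None when there is no match."""
    if not matches_keyword(text, keyword):
        return None
    # Hypothetical choice: center the excerpt on the first word of the phrase.
    excerpt = excerpt_around(text, keyword.split()[0])
    return f"{author} had an interesting post about {keyword}: ...{excerpt}..."
```

Calling `make_blurb("Alice", "elevator operator", post)` on a post that mentions an operator in one place and an elevator in another would still produce a blurb, which is one plausible explanation for how odd keyword pairs end up matched to unrelated posts.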
The ultimate purpose is still simply to gain visitors, who in turn generate ad revenue through a combination of Google text ads, banner ads, and other pay-to-host content. The spamblog itself is often a default template, e.g. the Kubrick WordPress theme, consisting only of these short linked posts. For blogs that either don’t moderate comments or don’t scrutinize excerpting sites individually, the spamblog’s growth is mostly automatic. The adaptation is that these comments propagate links without the earlier telltale markers of comment spam, such as overt sales messages in the actual comment text.
So far I’ve seen these keyword comments triggered by an unusual set of terms: ‘elevator operator,’ ‘turquoise jewelry,’ a ‘sequel to 5 People You Meet in Heaven,’ ‘Apple,’ ‘zebrafish,’ and ‘plumbing license.’ As an exercise for the reader, I leave it to you to guess which original posts generated each of those matches (hint: keywords don’t have to be sequential). I’m also curious whether, having now listed them all together, I will get a repeat of all the prior spam comment attempts.
This brings to mind what I am sure has already been codified into the equivalent of Sturgeon’s Law, which would go something like: “Any sufficiently popular mainstream communications system will generate spam” or perhaps the more prescriptive, “A communication system can be considered mainstream once it attracts spam.” Spam is generally considered to have originated with electronic systems like e-mail and Usenet forums, but extending the definition backwards, one could potentially designate parallels like telemarketing and robocalls for telephones and junk mail for postal service as examples. Did telegraph operators ever suffer from unsolicited commercial Morse Code transmissions? Certainly spam has gained tremendous genetic diversity in jumping to every emerging communication form—chat spam (first IRC then IM), forum spam (first newsgroups then web), mobile phone spam via SMS, online games, search engines (aka spamdexing), blog spam, and even video-sharing sites like YouTube. Twitter? Check.
Part of the original blame can be placed on the idealism of academic groups like the IETF who established standards for communication protocols like SMTP and NNTP without incorporating more robust authentication and authorization to deter spoofing and other common tactics. Except, of course, that those standards were created long before the very notion of a commercial Internet had been considered, and the online community was small enough to police itself by etiquette alone. Certainly we could assert that newer protocols should learn the lessons of the past and instill greater protection against potential abuses, right? Except, instead, the rapid evolution of spam in response to antispam efforts has created ‘superbugs’ and an extensive evolutionary toolbox of techniques that can thwart almost any systemic precaution. Just as our immune systems and pharmacology have developed to deal with ever more sophisticated organic threats, inspiring ever stronger viruses and bacteria, so the race continues between platform developers and those who would distribute spam over their platforms. It is now almost impossible to create a communications system that is actually usable, capable of reaching mainstream acceptance, and totally immune to spam-like behavior. Instead, as with the common cold, we now aim for a détente in which we can take steps to prevent infection and minimize symptoms, but no longer envision a ‘cure for spam.’