FRAMINGHAM (09/25/2003) - After worms, spam is easily the bane of existence for an increasing number of users - corporate and home alike. It is more than a full-time job to keep track of all the new spammer tricks.
Devising tests to probe anti-spam behavior is a daunting task - but that doesn't mean we shouldn't try.
At first blush, it seems like a hopeless endeavor. After all, with spammers expending significant effort to find ways around whatever barriers are put in place, by the time one test is developed there are sure to be new methods of spamming out "in the wild." Some say that because testing can never hope to mirror reality, no useful tests can be developed. I disagree.
We can apply concepts developed to test less-amorphous things - LAN switches, for example - to provide a basis for useful evaluation of anti-spam products. For starters, one could dismiss the last decade or so of switch/router testing for the same "it-does-not-reflect-reality" reason. Indeed, we usually test these devices with streams of same-size packets, and nobody contends that this is how networks work in the wild.
We do it for two fundamental reasons. First, such testing provides direct points of comparison across products. One could argue that comparing the results of tests that are not real-world is itself illogical, but that is not the case. At its simplest, a test can show the raw processing capabilities of multiple devices - an important element. Second, and more important, the results can give key insights into the underlying architecture of the product and thus how it is likely to behave in the wild.
And this is where we can benefit from testing anti-spam solutions. With countless companies, underlying architectures and implementations, it is difficult to judge products on their own merits. By devising structured tests and applying them across products and services, key elements can be exposed.
We've done just this. For several reports that will be released soon, we devised a set of tests that exercised anti-spam products/services - and produced some interesting results.
Given the variety of spam that any product will have to handle, coupled with the various addressing attributes used by spammers, we knew that we'd be looking at using thousands of test messages. While logistically challenging, that volume delivers thousands of data points and thus more reliable results.
We started by collecting several thousand examples of actual spam, loading them into a database and classifying them into categories based on content. Among these we built a category of "benign" messages: real messages that should get through and not get snagged by the anti-spam product. Then we used a custom "spam-generator" program that we built to send these messages to the system under test, using both original and "faked" FROM addresses and variations on the TO, CC and BCC fields.
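To make the addressing variations concrete, here is a minimal sketch of how one stored message might be expanded into its test variants. It is not the generator we used; the function name build_variants, the forged and decoy addresses, and the choice of exactly six combinations are all assumptions made for illustration.

```python
from email.message import EmailMessage
from itertools import product

FAKED_FROM = "forged.sender@example.com"  # hypothetical forged sender
DECOY_TO = "undisclosed@example.com"      # hypothetical visible recipient for CC/BCC cases


def build_variants(subject, body, original_from, target):
    """Return (label, envelope_recipient, message) tuples,
    one per combination of FROM kind and recipient field."""
    variants = []
    for from_kind, field in product(("original", "faked"), ("To", "Cc", "Bcc")):
        msg = EmailMessage()
        msg["From"] = original_from if from_kind == "original" else FAKED_FROM
        msg["Subject"] = subject
        if field == "To":
            msg["To"] = target
        else:
            msg["To"] = DECOY_TO
            if field == "Cc":
                msg["Cc"] = target
            # Bcc case: the target is named only in the SMTP envelope,
            # never in the message headers.
        msg.set_content(body)
        variants.append((f"{from_kind}-{field.lower()}", target, msg))
    return variants
```

Each tuple carries the envelope recipient separately from the headers, so a sender could pass it as to_addrs to smtplib's SMTP.send_message and the BCC case would behave as a real blind copy does.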
With this fundamental test, we saw significant differences in behavior. We even saw anti-spam offerings that, inexplicably, didn't filter out messages that had "ADV" (advertising flag) in the subject line. As expected, we gained insights into the underlying product architectures - and performance - and proved to our satisfaction that testing anti-spam products is not a futile endeavor.
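For readers who want to tabulate similar results, the sketch below shows one plausible way to score a run: a check for the "ADV" label and a per-category tally of how often messages were blocked. The function names and category labels are hypothetical, and the assumption that the label sits at the start of the subject line is ours for this example. A blocked "benign" message counts as a false positive.

```python
import re
from collections import defaultdict

# Assumption: the advertising label appears at the start of the subject.
ADV_PATTERN = re.compile(r"\s*ADV\b", re.IGNORECASE)


def has_adv_flag(subject):
    """True if the subject begins with a standalone 'ADV' label."""
    return ADV_PATTERN.match(subject) is not None


def blocked_rates(results):
    """results: iterable of (category, was_blocked) pairs.
    Returns {category: fraction of messages blocked}."""
    totals = defaultdict(int)
    blocked = defaultdict(int)
    for category, was_blocked in results:
        totals[category] += 1
        if was_blocked:
            blocked[category] += 1
    return {category: blocked[category] / totals[category] for category in totals}
```

Run per product, the same tally gives the direct points of comparison argued for earlier: the blocked rate for each spam category, set against the blocked rate for benign mail.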
Tolly is president of The Tolly Group, a strategic consulting and independent testing company in Manasquan, N.J. He can be reached at firstname.lastname@example.org.