Monday, May 15, 2006

There's one born every minute: spam and phishing

It's been a little while since I launched an experiment in which people perform a spam filtering task and have their results compared against best-of-breed spam filters. I set out to check that the spam filters were doing a good job, on the assumption that people would be able to spot errors that the filters were making.

Bad assumption. It turns out, based on preliminary data, that people suck at spam filtering. Here are some initial figures: people agree with 89.1% of the classifications that they've examined. Now that could mean that the original spam filter sucked, but guess again!

Ignoring all the emails that have only been voted on once, and looking only at the emails that have been seen by multiple people who all agree on whether the message is ham or spam, there are some really surprising results.
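Before the examples, here is a minimal sketch of how that selection might be computed. It is not the code behind the experiment: the record layout `(email_id, filter_label, human_label)`, the function names, and the sample votes are all invented for illustration, and "multiple people who agree" is assumed to mean at least two votes, all with the same label.

```python
from collections import defaultdict

# Hypothetical vote records: (email_id, filter_label, human_label).
# The layout and the sample data are assumptions made for this sketch.
votes = [
    ("m1", "ham", "spam"),
    ("m1", "ham", "spam"),
    ("m2", "spam", "spam"),
    ("m2", "spam", "ham"),
    ("m3", "spam", "spam"),
    ("m4", "spam", "ham"),
    ("m4", "spam", "ham"),
    ("m4", "spam", "ham"),
]


def agreement_rate(votes):
    """Fraction of individual votes that match the filter's label."""
    if not votes:
        return 0.0
    agreed = sum(1 for _, filt, human in votes if filt == human)
    return agreed / len(votes)


def unanimous_multi_vote(votes):
    """Map email_id -> (filter_label, human_label) for emails seen by
    at least two people who all chose the same label."""
    human_labels = defaultdict(list)
    filter_label = {}
    for email_id, filt, human in votes:
        human_labels[email_id].append(human)
        filter_label[email_id] = filt
    result = {}
    for email_id, labels in human_labels.items():
        # Skip single-vote emails and emails where voters disagree.
        if len(labels) >= 2 and len(set(labels)) == 1:
            result[email_id] = (filter_label[email_id], labels[0])
    return result


def disputed(votes):
    """Emails where the unanimous human label contradicts the filter."""
    return {
        email_id: pair
        for email_id, pair in unanimous_multi_vote(votes).items()
        if pair[0] != pair[1]
    }
```

Calling `disputed(votes)` on the sample data keeps `m1` and `m4` and drops `m2` (voters disagree) and `m3` (only one vote). A disputed email is only a candidate: either the people or the filter could be the one making the mistake, which is why each one still needs a look by eye.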

Here's one that people think is a spam:

and this one too:

and many people think this US Airways message is spam:

Now for the prize-winning classification. To the people who thought the following phish was a genuine message: could you please forward your bank account details and PIN to me so that I can deposit your prize in your account:

Happily, people are finding genuine errors that the spam filter made. For example, this really is a genuine message from Travelocity and not a spam:



