It's been a little while since I launched SpamOrHam.org where people perform a spam filtering task and their results are compared against best of breed spam filters. I set out to make sure that the spam filters were doing a good job on the assumption that people would be able to spot errors that the filter was making.
Bad assumption. It turns out, based on preliminary data, that people suck at spam filtering. Here's some initial figures: people agree with 89.1% of the classifications that they've examined. Now that could mean that the original spam filter sucked, but guess again!
Ignoring all the emails that have only been voted on once, and looking at the emails that have been seen by multiple people (who've agreed that they believe that the message is a ham or a spam), there are some really surprising results:
Here's one that people think is a spam:
and this one too:
and many people think this US Airways message is spam:
Now for the prize winning classification. The people who thought the following phish was a genuine message, could you please forward your bank account details and PIN to me so that I can deposit your prize in your account:
Happily, people are finding genuine errors that the spam filter made. For example, this really is a genuine message from Travelocity and not a spam: