Friday, October 20, 2006

Why OCRing spam images is useless

Nick FitzGerald forwards me another animated GIF spam that takes the animation plus transparency trick I outlined in the blog post A spam image that slowly builds to reveal its message to a new level. And it shows why spammers will work around OCR as fast as they can.

Here's what you see in the spam image:



Looks simple enough until you take a look at the GIF file that actually generated what you see. It's animated and it has three frames:





The first image is the GIF's background and is displayed for 10ms then the second image is layered on top with a transparent background so that the two images merge together and the image the spammer wants you to see appears. That image remains on screen for 100,000 ms (or 1 minute 40 seconds). After that the image is completely blanked out by the third frame.

My favourite touch is that it's not the entire image that's transparent, not even the white background, but just those pixels necessary to make the black pixels underneath show through. If you look carefully above you can see that some of pixels appear yellow (which is the background color of this site) indicating where the transparency is.

That is darn clever.

Monday, October 16, 2006

A spam image that slowly builds to reveal its message

Nick FitzGerald sent me a stunning example of lateral thinking on the part of a spammer. The spammer has taken a standard stock pump-and-dump spam image and split it horizontally into strips.



Each of the 17 horizontal strips cuts fairly randomly through the text making OCR on each strip not very useful. The spammer has then mounted each strip in its correct position on a transparent background and put each strip into an animated GIF. Here, for example, are a couple of strips:




The end result is that only once the entire image animation has completed is the complete spam visible making this a challenge for spam filters. And the spammer has thrown in a couple of frames at the end of the image, that get displayed after such a long delay (8 minutes) that they essentially never get shown. But those final frames are there just to throw off a spam filter trying to find the actual image.

Here's what gets displayed:



and here's the final image in the animation:



Very clever! (I'm calling this 'Strip Mining')

Monday, October 02, 2006

Ye Olde OCR Buster

Regular spam-correspondent Nick FitzGerald writes with an example of a spam that he believes is trying to get around both hash busting and OCR in an image.



The image has random dots in the bottom left hand corner to mess up hashing of the GIF itself, and the fonts used are badly rendered unusual fonts.