Tuesday, June 26, 2007

Pretty Darn Fancy: Even More Fancy Spam

Looks like the PDF wave of spam has taken a totally different tack. Sorin Mustaca sent me a PDF file that was attached to a pump-and-dump spam. Unlike the previous incarnation of PDF spam, this one looks a lot like a 'classic' image spam.

It's some text (which has been misaligned to fool OCR systems), but there's a little twist. Zooming in on a portion of the spam shows that the letters are actually made up of many different colors (and a look inside the PDF shows that it's actually an image).
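For readers who want to take their own look inside such a PDF, here is a minimal Python sketch of pulling out embedded image data. It rests on assumptions that may not hold for this particular spam: that the image is stored as a JPEG (a stream whose dictionary names the `/DCTDecode` filter), that the stream is not wrapped in further filters, and that the raw bytes never contain the word `endstream`. The function name `extract_jpeg_streams` is mine, and a real PDF parser would use the `/Length` entry instead of a regular expression.

```python
import re

# Matches "obj << dictionary >> stream <data> endstream" without letting the
# dictionary part run across an "endobj" into the next object.
STREAM_RE = re.compile(
    rb"obj\s*<<((?:(?!endobj).)*?)>>\s*stream\r?\n(.*?)\r?\nendstream",
    re.DOTALL,
)


def extract_jpeg_streams(pdf_bytes):
    """Return the raw bytes of every stream whose dictionary mentions /DCTDecode."""
    images = []
    for match in STREAM_RE.finditer(pdf_bytes):
        dictionary, data = match.group(1), match.group(2)
        if b"/DCTDecode" in dictionary:
            images.append(data)
    return images


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python extract.py Warning.pdf
    with open(sys.argv[1], "rb") as handle:
        found = extract_jpeg_streams(handle.read())
    for index, data in enumerate(found):
        with open("image%d.jpg" % index, "wb") as out:
            out.write(data)
    print("extracted %d image(s)" % len(found))
```

If the image were stored with a different filter (for example `/FlateDecode`), this sketch would find nothing and the stream would have to be decompressed first.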

I assume that the colors and the font misalignment are all there to make it difficult to OCR the text (and wrapping it in a PDF will slow down some filters).
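To illustrate why multicolored letters might trouble an OCR engine, and one preprocessing step a filter could try in response, here is a small sketch that collapses every pixel to pure black or white before recognition. It is an illustration only, not something I ran on this spam: it assumes the letters are noticeably darker than a light background, and the `threshold` value of 200 and the function names are my own choices.

```python
def to_grayscale(pixel):
    """Approximate luminance of an (r, g, b) pixel using integer weights."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000


def binarize(pixels, threshold=200):
    """Map each pixel to 0 (black) if darker than threshold, else 255 (white).

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    """
    return [
        [0 if to_grayscale(pixel) < threshold else 255 for pixel in row]
        for row in pixels
    ]


if __name__ == "__main__":
    # Two differently colored "letter" pixels on a near-white background.
    sample = [
        [(255, 255, 255), (200, 30, 30)],
        [(20, 90, 200), (250, 250, 250)],
    ]
    print(binarize(sample))
```

After this step the red and blue pixels both become black, so the color variation no longer matters; the misalignment of the letters is a separate problem that thresholding does nothing about.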

Extracting the image and passing it through gocr gave the following result:

_ Ti_ I_ F_ _ Clih! %

%_ % _c. (sR%)
t0.42 N %x

g0 _j_ __ h_ climb mis _k
d% % %g g nle_ Frj%.
_irgs__Dw_s _ rei_ � a
f%%d __. n_ia une jg gtill
oo_i_. Ib _ _ _ _ _ _ 90I

And a run through tesseract resulted in:

9{Eg Takes I:-west.tY`S Far Gawd Climb! UP
fatued Sta=:khlauih. This we 1: still

Close, but no cigar.


JoeChongq said...

Now I am getting this type of PDF spam too for the same SERA stock, but the text, layout, and size of the image are all different. And the last line now says I should be doing this on Wednesday. The attachment filename and subject were Warning.pdf.

r said...

First PDF spam hit my box a couple of days back. So far 4 mails have come with the same SERA content. The filename, the numbers in the text, and the day in the last line keep changing between each. Though Trend Micro InterScan Messaging Security passed them through, Gmail is tagging them as spam!