Friday, January 20, 2006

Deconstructing Sundance with POPFile

The brave folks at Unspam have taken POPFile and years of data surrounding the Sundance Film Festival in attempt to predict the outcome of the 2006 edition. Specifically, they want to get invited to parties, and they've chosen to geek out on data, bend POPFile to their use (boy, that must have been hard work) and try to predict the hits from the festival.

They claim an 81% accuracy over the past festivals and have a nice web site giving their predictions for this year. The predictions also include a breakdown of how the decisions are made indicating the most important (and worst) words to appear in a review of a movie.

There's also movie metadata like the type of film it was shot on, or who reviewed it. That's a very interesting use of POPFile and if they get it right and are invited to all the cool parties I hope they fulfill my wish and put a good word in for me with Neve Campbell.

Tuesday, January 17, 2006

The WMF SetAbortProc problem is not a backdoor

Microsoft recently fixed a problem wherein a Windows MetaFile (WMF) could contain arbitrary code that would be executed when the WMF was "played". The problem was particularly bad because a WMF could be played automatically in Internet Explorer by referencing it in an IFRAME if the Windows Picture and Fax Viewer was installed and registered (which by default on recent versions of Windows it was), because that program would automatically handle WMFs and play them to display them.

Steve Gibson suggested in a podcast, which then became a big news story, that he was convinced that this functionality was in fact an intentional backdoor inserted by Microsoft for their own purposes (or at last for the purpose of a rogue engineer inside the company).

Microsoft responded to that accusation with a blog posting by Stephen Toulouse in which he gave some more details of the problem and essentially said that Gibson was wrong (without naming him).

I've looked carefully at this and am convinced that Gibson is wrong. This is nothing more than a bug caused by the reimplementation of a legacy API. The same bug appears in WINE and was recently fixed.

Gibson based his backdoor claim on two things (both of which he now admits were incorrect): that a special incorrect value was required in one of the metafile fields to get the backdoor to activate and that the code ran in its own specially created thread. In fact, this bug occurs with correct or incorrect values in the field and uses the same thread.

Secondly, it's pretty clear from the code that this occurs because metafiles can contains a special ESCAPE record that allowed them to call into the Windows API called Escape which has a function to perform SetAbortProc. If you take a look at the WINE code you can see how this exploit works there and I bet (because it's the same API) that the same or very similar thing is happening in Windows:

First the file dlls/gdi/metafile.c contains a function called PlayMetaFileRecord with the following signature:

BOOL WINAPI PlayMetaFileRecord( HDC hdc, HANDLETABLE *ht,
METARECORD *mr, UINT handles )

Which is simply WINE's implementation of the same Win32 API (which is documented here: /library/en-us/gdi/metafile_1yec.asp)

The third parameter (mr) is a METARECORD pointer (a METARECORD is just an entry in the metafile and is detailed here: /library/en-us/gdi/metafile_8j1u.asp) and is the all important header with the following definition:

typedef struct tagMETARECORD {
DWORD rdSize;
WORD rdFunction;
WORD rdParm[1];

With the rdSize being the size of the record in words, the rdFunction being the function and the rdParm the data (which in the case of an exploit would be executable code). PlayMetaFileRecord handles META_ESCAPE like this:

Escape( hdc, mr->rdParm[0], mr->rdParm[1],
(LPCSTR)&mr->rdParm[2], NULL);

You'll note that parameter 3 is a pointer into the metafile parameter block, i.e. if executed parameter 3 would execute code in the metafile. Now Escape has implemented like this (dlls/gdi/driver.c):

INT WINAPI Escape( HDC hdc, INT escape, INT in_count,
LPCSTR in_data, LPVOID out_data )

and the SETABORTPROC is handled with the following code:

return SetAbortProc( hdc, (ABORTPROC)in_data );

So if you have an ESCAPE/SETABORTPROC record in a metafile then under WINE the AbortProc is set to point into the metafile (since in_data is corresponds to &mr->rdParm[2]).

So it's quite clear from the WINE implementation that this is a way to set a pointer into the metafile for execution. All it would take is that the metafile's AbortProc is called and arbitrary code could be executed.

In WINE at least this looks nothing like an intentional backdoor. It looks more like a bug caused by the fact that Escape is rather powerful and can set a pointer to code.

Now it's possible in WINE (I believe) to force the AbortProc to execute with another ESCAPE record that has NEWFRAME as the function. Again looking at the Escape code you'll see that NEWFRAME has handled like this:

return EndPage( hdc );

EndPage is a standard GDI function (see here for documentation: /library/en-us/gdi/prntspol_0d6b.asp). If you take a look at the implementation in WINE you see the following code (dlls/gdi/printdrv.c):

ABORTPROC abort_proc;
INT ret = 0;
DC *dc = DC_GetDCPtr( hdc );
if(!dc) return SP_ERROR;

if (dc->funcs->pEndPage)
ret = dc->funcs->pEndPage( dc->physDev );
abort_proc = dc->pAbortProc;
GDI_ReleaseObj( hdc );
if (abort_proc && !abort_proc( hdc, 0 ))
EndDoc( hdc );
ret = 0;
return ret;

Note that this function always called the AbortProc of the DC. So I think a metafile with an ESCAPE/SETABORTPROC followed by ESCAPE/NEWFRAME would in WINE causes arbitrary code execution.

Now if you read this article from MSDN: you learn the following about the AbortProc
The SetAbortProc function (and the SETABORTPROC escape) sets up what is known as the AbortProc. This AbortProc function resides in the application; GDI calls it during a print job to inform the application of spooler errors and to allow the application to abort the job when desired. GDI calls the AbortProc function with information about why it is being called; this value is either an error code from the spooler or zero, which indicates that the function is being called simply to allow an abort.

The AbortProc function is called routinely during several steps of the printing process:

* After every write to the printer port when printing directly to the printer (no spooling)
* After every write to a file when printing directly to a file (no spooling)
* After every write to the spooler file when spooling
* Periodically when out of disk space for spooling as a result of other spool jobs
* Before playing every metafile record when GDI is simulating banding
* Occasionally from some older printer drivers

When GDI calls the AbortProc function, the application can continue the print job by returning a nonzero value or abort the print job by returning zero.

In there it mentions that the AbortProc will get called when playing a metafile record. Stephen Toulouse all but said that when he said: "The way this functionality works is by registering the callback to be called after the next metafile record is played.".

So my take is that Gibson is wrong, very wrong. This is no backdoor. It's just a side effect of the ESCAPE/SETABORTPROC handling in a metafile and the fact that the metafile processing is calling the AbortProc for you.

I've used Steve Gibson's WMF_dbg.exe and WinDBG to step through the implementation of PlayMetaFileRecord and especially how it handles the Escape function and it appears to be implemented in exactly the same fashion as the equivalent WINE function. Here's my commented disassembly:

77f493fd 8d4b0a lea ecx,[ebx+0xa]

; ecx now contains &mr->Param[2]
; When entering this structure edi is a pointer to the out parameters
; for the Escape and seems always to be null. So the next instruction
; pushes the last parameter of Escape (LPVOID lpvOutData) as NULL.

77f49400 57 push edi

; Pushes the LPCSTR lpvInData parameter from ecx which is pointing to
; &mr->Param[2] which is inside the metafile the is being played

77f49401 51 push ecx

; Now load ecx and eax with the size of the input structure and the
; function number for the escape (int cbInput and int nEscape). In the
; case of a SETABORTPROC nEscape/eax is 9 and is taken from the
; mr->Param[0] and cbInput/ecx is from mr->Param[1]. Note that it
; doesn't matter if cbInput is correct of not for SETABORTPROC.

77f49402 0fb74b08 movzx ecx,word ptr [ebx+0x8]
77f49406 0fb7c0 movzx eax,ax
77f49409 51 push ecx
77f4940a 50 push eax

; The final parameter is the original DC passed to PlayMetaFileRecord

77f4940b ff7508 push dword ptr [ebp+0x8] ss:0023:0006ff0c=2621029e
77f4940e e8113c0000 call GDI32!Escape (77f4d024)
77f49413 e981060000 jmp GDI32!PlayMetaFileRecord+0xd19 (77f49a99)

By varying the metafile function and parameters I've verified that all that's happening in PlayMetaFileRecord is that when it encounters an ESCAPE it does the same thing as WINE and extracts cbInput, nEscape and lpvInData from the METARECORD and calls GDI Escape.

Sunday, January 15, 2006

Do spammers fear OCR?

Nick FitzGerald recently sent me two sample spams that seem to indicate that some spammers fear that using images in place of words isn't enough. They've started to obscure their messages to prevent optical character recognition.

The first spam appears to be a scan of a document that's been skewed slightly. Now this could be a simple and bad scan.

But the second is even more interesting. It appears to be perfectly normal:

Until you look at the fact that this was actually constructed using <DIV> tags for layout and the breaks between the lines are in the middle of words. Here Nick has kindly inserted borders showing that the words are broken horizontally and then put back in the right position:

But is anyone doing OCR, or are spam filters getting good enough that the spammers are being really paranoid about what they are sending?

The funny think about the second example is that the URL they include is not obfuscated, is clickable and appears in the SURBL :-) So despite the effort to obscure the content a simple check of the spamminess of the URL gets this email canned.

Monday, January 09, 2006

Python is the new Tcl; Ruby is the new Perl

Given how bad I am at predictions I'd like to give you the following (which you'd better take with a grain of salt): Python is the new Tcl and Ruby is the new Perl. My prediction for 2006 is that most Perl 5 programmers will decide that Ruby looks cool and learn the language and that Python will be a niche language that a die hard core continues to love and embellish.

I think Ruby's the new Perl because of the confluence of three things: Perl 6, RubyGems and Ruby's own Perliness. Firstly, given how hideous Perl is if you're a Perl person (like me) why learn Perl 6 when you could learn Ruby? Ruby's core language is clean, regular expressions are strong and there are many Perl influences (which Ruby is slowly removing to ensure that language doesn't become Perl). Secondly, RubyGems will do for Ruby what CPAN did for Perl.

Python's lack of good package management (if you ignore ActiveState's PyPPM and the recently started Cheese Shop) and idiosyncratic language constraints ("hey we liked signficant white space in Make so much we used it ourselves") make it look like Tcl to me. And where's the killer app? Arguably dynamic web pages were Perl's killer app, and perhaps Rails is Ruby's. Python has... Zope?

All of which leaves me wondering how to complete "PHP is the new..."

(Naturally LISP is the new LISP).

Making an old USB printer support Apple AirPrint using a Raspberry Pi

There are longer tutorials on how to connect a USB printer to a Raspberry Pi and make it accessible via AirPrint but here's the minimal ...