Skip to main content

The WMF SetAbortProc problem is not a backdoor

Microsoft recently fixed a problem wherein a Windows MetaFile (WMF) could contain arbitrary code that would be executed when the WMF was "played". The problem was particularly bad because a WMF could be played automatically in Internet Explorer by referencing it in an IFRAME if the Windows Picture and Fax Viewer was installed and registered (which by default on recent versions of Windows it was), because that program would automatically handle WMFs and play them to display them.

Steve Gibson suggested in a podcast, which then became a big news story, that he was convinced that this functionality was in fact an intentional backdoor inserted by Microsoft for their own purposes (or at last for the purpose of a rogue engineer inside the company).

Microsoft responded to that accusation with a blog posting by Stephen Toulouse in which he gave some more details of the problem and essentially said that Gibson was wrong (without naming him).

I've looked carefully at this and am convinced that Gibson is wrong. This is nothing more than a bug caused by the reimplementation of a legacy API. The same bug appears in WINE and was recently fixed.

Gibson based his backdoor claim on two things (both of which he now admits were incorrect): that a special incorrect value was required in one of the metafile fields to get the backdoor to activate and that the code ran in its own specially created thread. In fact, this bug occurs with correct or incorrect values in the field and uses the same thread.

Secondly, it's pretty clear from the code that this occurs because metafiles can contains a special ESCAPE record that allowed them to call into the Windows API called Escape which has a function to perform SetAbortProc. If you take a look at the WINE code you can see how this exploit works there and I bet (because it's the same API) that the same or very similar thing is happening in Windows:

First the file dlls/gdi/metafile.c contains a function called PlayMetaFileRecord with the following signature:

BOOL WINAPI PlayMetaFileRecord( HDC hdc, HANDLETABLE *ht,
METARECORD *mr, UINT handles )

Which is simply WINE's implementation of the same Win32 API (which is documented here: /library/en-us/gdi/metafile_1yec.asp)

The third parameter (mr) is a METARECORD pointer (a METARECORD is just an entry in the metafile and is detailed here: /library/en-us/gdi/metafile_8j1u.asp) and is the all important header with the following definition:

typedef struct tagMETARECORD {
DWORD rdSize;
WORD rdFunction;
WORD rdParm[1];

With the rdSize being the size of the record in words, the rdFunction being the function and the rdParm the data (which in the case of an exploit would be executable code). PlayMetaFileRecord handles META_ESCAPE like this:

Escape( hdc, mr->rdParm[0], mr->rdParm[1],
(LPCSTR)&mr->rdParm[2], NULL);

You'll note that parameter 3 is a pointer into the metafile parameter block, i.e. if executed parameter 3 would execute code in the metafile. Now Escape has implemented like this (dlls/gdi/driver.c):

INT WINAPI Escape( HDC hdc, INT escape, INT in_count,
LPCSTR in_data, LPVOID out_data )

and the SETABORTPROC is handled with the following code:

return SetAbortProc( hdc, (ABORTPROC)in_data );

So if you have an ESCAPE/SETABORTPROC record in a metafile then under WINE the AbortProc is set to point into the metafile (since in_data is corresponds to &mr->rdParm[2]).

So it's quite clear from the WINE implementation that this is a way to set a pointer into the metafile for execution. All it would take is that the metafile's AbortProc is called and arbitrary code could be executed.

In WINE at least this looks nothing like an intentional backdoor. It looks more like a bug caused by the fact that Escape is rather powerful and can set a pointer to code.

Now it's possible in WINE (I believe) to force the AbortProc to execute with another ESCAPE record that has NEWFRAME as the function. Again looking at the Escape code you'll see that NEWFRAME has handled like this:

return EndPage( hdc );

EndPage is a standard GDI function (see here for documentation: /library/en-us/gdi/prntspol_0d6b.asp). If you take a look at the implementation in WINE you see the following code (dlls/gdi/printdrv.c):

ABORTPROC abort_proc;
INT ret = 0;
DC *dc = DC_GetDCPtr( hdc );
if(!dc) return SP_ERROR;

if (dc->funcs->pEndPage)
ret = dc->funcs->pEndPage( dc->physDev );
abort_proc = dc->pAbortProc;
GDI_ReleaseObj( hdc );
if (abort_proc && !abort_proc( hdc, 0 ))
EndDoc( hdc );
ret = 0;
return ret;

Note that this function always called the AbortProc of the DC. So I think a metafile with an ESCAPE/SETABORTPROC followed by ESCAPE/NEWFRAME would in WINE causes arbitrary code execution.

Now if you read this article from MSDN: you learn the following about the AbortProc
The SetAbortProc function (and the SETABORTPROC escape) sets up what is known as the AbortProc. This AbortProc function resides in the application; GDI calls it during a print job to inform the application of spooler errors and to allow the application to abort the job when desired. GDI calls the AbortProc function with information about why it is being called; this value is either an error code from the spooler or zero, which indicates that the function is being called simply to allow an abort.

The AbortProc function is called routinely during several steps of the printing process:

* After every write to the printer port when printing directly to the printer (no spooling)
* After every write to a file when printing directly to a file (no spooling)
* After every write to the spooler file when spooling
* Periodically when out of disk space for spooling as a result of other spool jobs
* Before playing every metafile record when GDI is simulating banding
* Occasionally from some older printer drivers

When GDI calls the AbortProc function, the application can continue the print job by returning a nonzero value or abort the print job by returning zero.

In there it mentions that the AbortProc will get called when playing a metafile record. Stephen Toulouse all but said that when he said: "The way this functionality works is by registering the callback to be called after the next metafile record is played.".

So my take is that Gibson is wrong, very wrong. This is no backdoor. It's just a side effect of the ESCAPE/SETABORTPROC handling in a metafile and the fact that the metafile processing is calling the AbortProc for you.

I've used Steve Gibson's WMF_dbg.exe and WinDBG to step through the implementation of PlayMetaFileRecord and especially how it handles the Escape function and it appears to be implemented in exactly the same fashion as the equivalent WINE function. Here's my commented disassembly:

77f493fd 8d4b0a lea ecx,[ebx+0xa]

; ecx now contains &mr->Param[2]
; When entering this structure edi is a pointer to the out parameters
; for the Escape and seems always to be null. So the next instruction
; pushes the last parameter of Escape (LPVOID lpvOutData) as NULL.

77f49400 57 push edi

; Pushes the LPCSTR lpvInData parameter from ecx which is pointing to
; &mr->Param[2] which is inside the metafile the is being played

77f49401 51 push ecx

; Now load ecx and eax with the size of the input structure and the
; function number for the escape (int cbInput and int nEscape). In the
; case of a SETABORTPROC nEscape/eax is 9 and is taken from the
; mr->Param[0] and cbInput/ecx is from mr->Param[1]. Note that it
; doesn't matter if cbInput is correct of not for SETABORTPROC.

77f49402 0fb74b08 movzx ecx,word ptr [ebx+0x8]
77f49406 0fb7c0 movzx eax,ax
77f49409 51 push ecx
77f4940a 50 push eax

; The final parameter is the original DC passed to PlayMetaFileRecord

77f4940b ff7508 push dword ptr [ebp+0x8] ss:0023:0006ff0c=2621029e
77f4940e e8113c0000 call GDI32!Escape (77f4d024)
77f49413 e981060000 jmp GDI32!PlayMetaFileRecord+0xd19 (77f49a99)

By varying the metafile function and parameters I've verified that all that's happening in PlayMetaFileRecord is that when it encounters an ESCAPE it does the same thing as WINE and extracts cbInput, nEscape and lpvInData from the METARECORD and calls GDI Escape.


Anonymous said…
Gibson was doing OK as long as he stuck with doing technical analysis on this issue, even though his analysis was incomplete/wrong. But it was when he decided to ascribe motives with insufficient evidence (no evidence at all, really) that he lost his way. The Security Now #22 podcast has Gibson saying, "But the only conclusion I can draw is that there has been code from at least Windows 2000 on, and in all current versions, and even, you know, future versions, until it was discovered, which was deliberately put in there by some group, we don't know at what level or how large in Microsoft, that gave them the ability that they who knew how to get their Windows systems to silently and secretly run code contained in an image, those people would be able to do that on remotely located Windows machines..." which is sensationalism at its best. When carefully parsed, the statement may be literally true, but it suggests that Microsoft added this code with the intent of remotely accessing PCs without authorization, and Gibson knew very well that he was suggesting just that.

In the newsgroup, Gibson has backpedaled, with suggestions that the code may have been put in for *benign* purposes like remotely assisting a user that has turned off things the more obvious avenues of remote assistance (e.g. Windows Update). Huh? This makes no sense. In other posts to newsgroup Gibson says, (paraphrasing) "the code was put in for reasons that we'll never know", which again suggests a nefarious plot of some kind. Gibson seems to be unfamiliar with Occam's Razor.

Lastly, Gibson accuses Microsoft of "dancing around the issue". What issue? There is no issue other than Gibson's own half-baked accusations.

Gibson should just admit that he was premature in making his accusations and be done with it.
Erik K said…
You are probably right that this is not an intentional backdoor. I'm not a security expert, but one thing is bothering me: If it's just due to general handling of the ESCAPE record, wouldn't you get an access violation error when executing the code. To be able to execute data you need to call VirtualProtect(..., PAGE_EXECUTE). Why would anyone do this on a WMF file? The only explanation that makes sense is Mark Russinovich's, that they made an explicit choice to support AbortProcs (which strikes me as extreme stupidity, different applications that render the same WMF could need different AbortProcs).
Anonymous said…
The reason you don't have to mark the WMF executable is because the x86 doesn't support the PAGE_EXECUTE flag. Any page that is readable is executable by default.

Some newer x86's (new Pentium 4 chips and some x64 CPUs running in 32-bit compatibility mode) have no such limitation. On those systems, if DEP (the software enforcement of PAGE_EXECUTE) is turned on, they will NOT be vulnerable to attack via an AbortProc whose code is actually in the WMF. On those systems, you will get an AV, exactly as you speculated.

The second point is that WMFs were primarily intended (and escaping in particular) for printing functionality, rather than being used directly in files. As a result, an application could easily specify its own AbortProc as part of a WMF "stream" being spooled to a printer.

WMF is a holdover from 16-bit Windows, where data was considered a trusted product of an application, rather than a potentially hostile source of infection.

Gibson simply isn't familiar with the law of "Never attribute to malice what can be explained by stupidity."

Or, in this case... ignorance. Windows 3.1 was when the WMF format was developed, and that system was developed with a design paradigm that suited the needs of the times but is woefully poor in handling today's security threats.

Popular posts from this blog

How to write a successful blog post

First, a quick clarification of 'successful'. In this instance, I mean a blog post that receives a large number of page views. For my, little blog the most successful post ever got almost 57,000 page views. Not a lot by some other standards, but I was pretty happy about it. Looking at the top 10 blog posts (by page views) on my site, I've tried to distill some wisdom about what made them successful. Your blog posting mileage may vary. 1. Avoid using the passive voice The Microsoft Word grammar checker has probably been telling you this for years, but the passive voice excludes the people involved in your blog post. And that includes you, the author, and the reader. By using personal pronouns like I, you and we, you will include the reader in your blog post. When I first started this blog I avoid using "I" because I thought I was being narcissistic. But we all like to read about other people, people help anchor a story in reality. Without people your bl

Your last name contains invalid characters

My last name is "Graham-Cumming". But here's a typical form response when I enter it: Does the web site have any idea how rude it is to claim that my last name contains invalid characters? Clearly not. What they actually meant is: our web site will not accept that hyphen in your last name. But do they say that? No, of course not. They decide to shove in my face the claim that there's something wrong with my name. There's nothing wrong with my name, just as there's nothing wrong with someone whose first name is Jean-Marie, or someone whose last name is O'Reilly. What is wrong is that way this is being handled. If the system can't cope with non-letters and spaces it needs to say that. How about the following error message: Our system is unable to process last names that contain non-letters, please replace them with spaces. Don't blame me for having a last name that your system doesn't like, whose fault is that? Saying "Your

The Elevator Button Problem

User interface design is hard. It's hard because people perceive apparently simple things very differently. For example, take a look at this interface to an elevator: From flickr Now imagine the following situation. You are on the third floor of this building and you wish to go to the tenth. The elevator is on the fifth floor and there's an indicator that tells you where it is. Which button do you press? Most people probably say: "press up" since they want to go up. Not long ago I watched someone do the opposite and questioned them about their behavior. They said: "well the elevator is on the fifth floor and I am on the third, so I want it to come down to me". Much can be learnt about the design of user interfaces by considering this, apparently, simple interface. If you think about the elevator button problem you'll find that something so simple has hidden depths. How do people learn about elevator calling? What's the right amount of