Friday, June 26, 2009

Running the numbers on the BBC executives' expenses

So, another lovely data set appeared, the expenses of senior BBC executives. And the papers went a little wild highlighting the spending that they don't like.

Me, I just punched the numbers into a little program and took a look at how well they fit Benford's Law. I wanted to see if there were any interesting anomalies to look at. And there are. No smoking guns, though. Just some fun on a Number 22 bus with some numbers.

First, here are the chi-squared values for the fit of the first digits of each executive's 2007/2008 expenses to Benford's Law. The critical value (for p = 0.05, with 8 degrees of freedom) is 15.51, so most of the executives' expenses do not fit Benford's Law. The fun is in finding out why.

Ashley Highfield,3.06874517880956
John Smith,4.61752636949634
Mark Byford,15.5817014457229
Jana Bennett,15.6178982350545
Zarin Patel,17.5034731417114
Jennifer Abramsky,20.8803339214804
Mark Thompson,22.4588511455346
Caroline Thomson,37.666988157616
Timothy Davie,143.433388695613
Stephen Kelly,178.662639451409
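
For anyone who wants to try this at home, here is a minimal sketch of how such a statistic might be computed. It is not the program used for the numbers above; the function names are made up for illustration, and it assumes a non-empty list of claims in pounds, rounded to the penny.

```python
import math
from collections import Counter

def first_digit(amount):
    """Leading non-zero digit of a claim, or None if it rounds to £0.00."""
    for ch in f"{abs(amount):.2f}":
        if ch in "123456789":
            return int(ch)
    return None

def benford_expected(digit):
    """Benford's Law: P(first digit = d) = log10(1 + 1/d)."""
    return math.log10(1 + 1 / digit)

def benford_chi_squared(amounts):
    """Pearson chi-squared statistic of first digits against Benford's Law."""
    digits = [d for d in map(first_digit, amounts) if d is not None]
    counts = Counter(digits)
    n = len(digits)  # assumed to be greater than zero
    return sum(
        (counts[d] - n * benford_expected(d)) ** 2 / (n * benford_expected(d))
        for d in range(1, 10)
    )
```

With nine possible first digits there are 8 degrees of freedom, so a result above 15.51 would reject the fit at p = 0.05.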

The best fit is Ashley Highfield's lovely curve:

We'll come back to Mr Highfield later, but let's go to the other end of the spectrum and look at the extremes that don't match the expected. The 'worst' offender is Stephen Kelly (he's not with the BBC anymore). Here's his curve.

Whoa. What happened with all those 8s? Delve into the data and you find lots of £8.00 claims for "Road/Bridge Tolls". My guess is that Mr Kelly passed through the London Congestion Charging Zone in his own car. That's enough to skew the data. And if you match up his £8.00 charges and his mileage claims it all makes sense.
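
To see which digit is driving a bad fit, one option is to compare the observed count of each first digit with the count Benford's Law predicts. The sketch below does that on made-up claims (these are not Mr Kelly's actual figures, and `digit_excess` is a hypothetical helper): ten assorted amounts plus twenty flat £8.00 tolls.

```python
import math
from collections import Counter

def digit_excess(amounts):
    """Map each first digit 1-9 to (observed count - Benford expected count)."""
    # Format to pence and drop leading zeros and the decimal point,
    # so 0.45 -> "45" and 8.00 -> "8.00"; the first character is the digit.
    digits = [int(f"{a:.2f}".lstrip("0.")[0]) for a in amounts if a >= 0.01]
    counts = Counter(digits)
    n = len(digits)
    return {d: counts[d] - n * math.log10(1 + 1 / d) for d in range(1, 10)}

# Made-up claims for illustration only: ten mixed amounts plus twenty £8.00 tolls.
claims = [12.50, 3.20, 45.00, 1.99, 27.30, 110.00, 6.40, 19.99, 2.50, 33.00]
claims += [8.00] * 20

excess = digit_excess(claims)
worst = max(excess, key=excess.get)  # digit with the largest surplus
print(worst, round(excess[worst], 1))
```

A repeated fixed charge like this piles up on a single digit, which is exactly the kind of spike that inflates the chi-squared value.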

Now to Timothy Davie and here's his curve:

So, he's like Stephen Kelly and sure enough there are lots of £8.00 charges for the same "Road/Bridge Tolls".

Next on the list comes a different pattern, created by Caroline Thomson. An excess of 1s:

She's got a ton of taxi trips in the £10 to £19 range. According to Transport for London you'd see those fares on a weekday when traveling around 4 miles in central London. Given the location of BBC Television Centre it's pretty easy to imagine the need for these trips. Also, she doesn't claim any mileage or congestion charge so she's not using her own car.

Next up is the Big Kahuna, Mark Thompson. His curve shows an excess of 6s, 7s and 8s.

Why is that? Well, if you look at his expense claims line by line (and if you do you're a total nerd) you'll see that Mr Thompson takes people out to lunch a lot and spends a lot of money on lunches under £100. You can imagine this being totally legitimate. He probably has to do that for his job; there could be a BBC guideline about how much to spend on lunch, or Mr Thompson could simply have a moral compass that says he shouldn't go totally wild on lunch costs.

He doesn't have lots of taxis or tolls, but then again there's a note in his expense report where he did take a taxi that says "Driver not available" so I'm guessing he has a chauffeur.

And so it goes on. You can carry on down the list and look for little anomalies, but there's nothing glaring.

So, how come Ashley Highfield has such a perfect curve? Well, he doesn't take a lot of cabs (so no excess of 1s), drives his own car (lots of little mileage claims) and doesn't seem to claim the congestion charge (no excess of 8s). Did he forget to claim the congestion charge, or does he drive an electric car?

He was, after all, the BBC's Director of Future Media and Technology.


Simon Zerafa said...

Hi John,

Great blog entries; keep them coming. I found you via GRC.COM and the GRC Forums.

On Benford's Law: how does it cope with prices for a wide range of goods and services that often end with .95 or .99 (or .88 in Asia)?

How does adding VAT (either 15% or 17.5%) affect Benford's Law?

If I were filing expenses claims then I might expect to see more 8's and 9's above 10% of the time due to this pricing practice.

Kind Regards

Simon Zerafa

John Graham-Cumming said...


Benford's Law applies mostly to the first digit of numbers (although there are extensions to subsequent digits) so the endings shouldn't matter.

VAT doesn't make a difference. Benford's Law is scale invariant: multiplying every amount by a constant (e.g. adding tax at a fixed rate) doesn't change the first-digit distribution of data that follows the law.
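
A quick way to convince yourself about the VAT point is to try it on synthetic data. The sketch below assumes amounts spread evenly on a log scale from £1 up to £1000 (a stand-in that follows Benford's Law closely, not real expense data) and compares first-digit proportions before and after adding 17.5%.

```python
from collections import Counter

def first_digit_shares(values):
    """Proportion of values starting with each digit 1-9."""
    # Scientific notation puts the leading digit first: 117.5 -> '1.175000e+02'
    counts = Counter(int(f"{v:e}"[0]) for v in values)
    return [counts[d] / len(values) for d in range(1, 10)]

# Synthetic, Benford-like amounts: evenly spaced on a log scale from £1 up to £1000.
N = 9000
net = [10 ** (3 * k / N) for k in range(N)]
gross = [v * 1.175 for v in net]  # add 17.5% VAT

before = first_digit_shares(net)
after = first_digit_shares(gross)
largest_shift = max(abs(b - a) for b, a in zip(before, after))
print(f"largest change in any digit's share: {largest_shift:.4f}")
```

Multiplying by a constant only slides the amounts along the log scale, so the share of each first digit barely moves. Adding a fixed amount to every claim would be a different story.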

The biggest thing you'll see with expense claims is where employees hit a limit. Suppose I tell you that you can spend up to £50 on lunch. We're unlikely to see many 1s, because claims will tend to bunch up just under the limit and skew the data.

Recommended reading: The Effective Use of Benford's Law to Assist in Detecting Fraud in Accounting Data.


tjp said...

Quite impressive, you got to a stage where the conclusions are _almost_ meaningful :)