Last night I couldn't sleep because I was thinking about all the juicy data in the newly released MPs' allowances. Unfortunately, the government chose to release these as scanned PDFs, which makes them open but hard to work with.
Nevertheless, I figured it would be fun to grab a couple of MPs' allowances, type them into a spreadsheet and then run a little test with them. I chose two MPs: Alistair Darling (because he's the Chancellor and keeps tabs on everyone's money) and Harriet Harman (because she is listed as one of the cheapest MPs of all).
I obtained the PDF files for 2007-2008 here and here and entered into a spreadsheet the amounts listed on the expense claim forms line by line (excluding totals and not delving into receipts for additional breakdowns).
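Then I wrote a little program to do a Benford's Law analysis of the first digits of these line items. Benford's Law says that in many naturally occurring sets of numbers the leading digit d turns up with frequency log10(1 + 1/d), so about 30% of figures start with a 1 and fewer than 5% start with a 9. From that I was able to plot the expected frequency of each digit against the actual.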
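For anyone who wants to try this at home, here is a minimal sketch of that kind of program in Python. It is not the program I actually ran: the function names are made up for illustration and the `sample_claims` list is hypothetical data, not a transcription of anyone's expense forms.

```python
import math
from collections import Counter


def first_digit(amount):
    """Return the leading non-zero digit of an amount in pounds, e.g. 45.00 -> 4."""
    # Format to pounds and pence, then take the first character that is 1-9.
    for ch in f"{abs(amount):.2f}":
        if ch in "123456789":
            return int(ch)
    raise ValueError("amount has no non-zero digit")


def benford_probability(d):
    """Benford's Law probability that the leading digit is d (1-9)."""
    return math.log10(1 + 1 / d)


def benford_table(amounts):
    """Return (digit, expected count, actual count) for each digit 1-9."""
    counts = Counter(first_digit(a) for a in amounts)
    n = len(amounts)
    return [(d, n * benford_probability(d), counts.get(d, 0)) for d in range(1, 10)]


# Hypothetical line items, for illustration only.
sample_claims = [300.00, 45.00, 123.45, 17.99, 1050.00, 86.20, 300.00, 45.00, 2.50, 19.00]

for digit, expected, actual in benford_table(sample_claims):
    print(f"{digit}: expected {expected:5.2f}, actual {actual}")
```

Feed it the line items from a claim form and the two columns it prints are what goes into the charts.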
Here's the data for Harriet Harman:
You can see that the actual figures and the predicted follow along together rather nicely. If you do a chi-squared goodness-of-fit test to compare the two you get a figure of 6.78. With nine digits there are eight degrees of freedom, and the 5% critical value is about 15.51, so 6.78 means it's not possible to reject the notion that "the figures in Harman's expense claims follow the Benford's Law distribution".
So, then I turned to the Chancellor and got a rather different chart:
And the chi-squared test comes out at 18.19, which is above the 5% critical value of about 15.51 for eight degrees of freedom, indicating that his expenses aren't following Benford's Law.
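The chi-squared calculation itself is short enough to sketch in Python. Again this is an illustration under my own assumptions rather than the code I ran: the 5% significance level is a conventional choice, the function names are invented, and the `hypothetical_counts` are made up. Only the two statistics at the bottom come from the charts above.

```python
import math

# Benford's Law probability for each leading digit 1-9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Chi-squared critical value for 8 degrees of freedom at the 5% level
# (assumed significance level; taken from standard tables).
CRITICAL_5PCT_DF8 = 15.507


def chi_squared(observed):
    """Chi-squared statistic for observed first-digit counts against Benford's Law.

    `observed` maps each digit 1-9 to how many line items start with it.
    """
    n = sum(observed.values())
    return sum(
        (observed.get(d, 0) - n * p) ** 2 / (n * p) for d, p in BENFORD.items()
    )


def consistent_with_benford(statistic, critical=CRITICAL_5PCT_DF8):
    """True if the statistic gives no grounds to reject Benford's Law."""
    return statistic <= critical


# Hypothetical digit counts, for illustration only.
hypothetical_counts = {1: 30, 2: 18, 3: 12, 4: 10, 5: 8, 6: 7, 7: 6, 8: 5, 9: 4}
print(f"hypothetical statistic: {chi_squared(hypothetical_counts):.2f}")

# The two statistics reported above.
for name, statistic in [("Harman", 6.78), ("Darling", 18.19)]:
    verdict = "consistent with" if consistent_with_benford(statistic) else "departs from"
    print(f"{name}: {statistic} {verdict} Benford's Law at the 5% level")
```

Bear in mind that a statistic under the threshold doesn't prove the figures follow Benford's Law; it only means the test found no reason to say they don't.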
So then you have to ask yourself: why? And in particular, why does the Chancellor have "too many" 3s and 4s? The answer may lie in the expense claims themselves.
If you pop into the Chancellor's Additional Cost Allowance PDF you'll see that he claimed exactly £300 for food in May, July, September, October, November and December 2007.
The figures are exact and there are no receipts because they are not required for food bills. So, it's probably the case that these six £300 claims account for the fact that he's got six more 3s than expected.
Now what about the 4s? There are eleven more of those than expected. Note that the actual numbers don't have to match the expected numbers exactly. The idea is just that they follow Benford's Law fairly closely.
So what accounts for the extra 4s? Perhaps, in part, the six occurrences of a telephone bill of exactly £45 that he's claimed for.
Now, I'm not trying to claim that Mr Darling's expense claim is fraudulent, but Benford's Law is interesting because it can tell you where to go and look for oddities. And the oddest thing I see is that the parliamentary rules allow an MP to make expense claims without receipts. I know that I couldn't get away with an expense claim without receipts in my company.
There are precisely three receipts in the Chancellor's Additional Cost Allowance claim; contrast that with Gordon Brown's which is a veritable receipt mountain.
And now, I know, you are dying to see the PM's chart. Well, here it is. It's a bit like Harriet Harman's. The figures are close to what is expected.
And Brown's chi-squared is 9.37 (meaning his figures are consistent with the Benford's Law distribution).