As I've worked through "Uncertainty estimates in regional and global observed temperature changes: a new dataset from 1850" to reproduce the work done by the Met Office, I've come up against something I don't understand. I've written to the Met Office about it, but until I get a reply this blog post is to ask for opinions from any of my dear readers.
In section 6.1 Brohan et al. talk about the problem of coverage bias. If you read this blog post you'll see that in the 1800s there weren't many temperature stations operating and so only a small fraction of the Earth's surface was being observed. There was a very big jump in the number of stations operating in the 1950s.
That means that when using data to estimate the global (or hemispheric) temperature anomaly you need to take into account some error based on how well a small number of stations act as a proxy for the actual temperature over the whole globe. I'm calling this the coverage bias.
To estimate that error, Brohan et al. use the NCEP/NCAR 40-Year Reanalysis Project data to get an estimate of the error for the group of stations operating in any given year. Using that data it's possible, on a year-by-year basis, to calculate the mean error caused by limited coverage and its standard deviation (assuming a normal distribution).
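In case it helps anyone checking my numbers, here's roughly what that sub-sampling amounts to. This is a minimal sketch, not my actual code: `reanalysis`, `station_mask` and `weights` are hypothetical names for the gridded reanalysis anomaly fields, the grid cells that contained stations, and the area weights.

```python
import numpy as np

def coverage_bias(reanalysis, station_mask, weights):
    """For each reanalysis year, compare the full-coverage weighted mean
    anomaly with the weighted mean over only the grid cells that had
    stations. Returns the mean and sample standard deviation of the
    per-year errors (treated as roughly normal)."""
    errors = []
    for field in reanalysis:  # one 2-D anomaly field per year
        full = np.average(field, weights=weights)
        sampled = np.average(field[station_mask],
                             weights=weights[station_mask])
        errors.append(sampled - full)
    errors = np.asarray(errors)
    return errors.mean(), errors.std(ddof=1)
```

The key point is that the error statistics come from sub-sampling the reanalysis fields with each year's actual station coverage, not from the station data itself.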
I've now done the same analysis and I have two problems:
1. I get a much wider error range for the 1800s than is seen in the paper.
2. I don't understand why the mean error isn't taken into account.
Note that in the rest of this entry I am using smoothed data as described by the Met Office here. I am applying the same 21 point filter to the data to smooth it. My data starts at 1860 because the first 10 years are being used to 'prime' the filter. I extend the data as described on that page.
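For reference, a 21-point binomial filter (which I believe is what the Met Office page describes, with coefficients C(20, k)/2^20, though the exact coefficients may differ) can be written as:

```python
import numpy as np
from math import comb

def binomial_filter21(series):
    """Smooth an annual series with a normalized 21-point binomial
    filter (coefficients C(20, k) / 2**20). Only 'valid' points are
    returned, so the first and last 10 years are consumed unless the
    series is extended at both ends first."""
    coeffs = np.array([comb(20, k) for k in range(21)], dtype=float)
    coeffs /= coeffs.sum()
    return np.convolve(series, coeffs, mode='valid')
```

This is why my charts start in 1860: the first 10 years of data go into priming the filter.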
First here's the smooth trend line for the northern hemisphere temperature anomaly derived from the Met Office data as I have done in other blog posts and without taking into account the coverage bias.
And here's the chart showing the number of stations reporting temperatures by year (again this is smoothed using the same process).
Just looking at that chart you can see that there were very few stations reporting temperature in the mid-1800s and so you'd expect a large error when trying to extrapolate to the entire northern hemisphere.
This chart shows the number of stations by year (the green line, as in the previous chart) and the mean error caused by the coverage bias (the red line). For example, in 1860 the coverage bias error is just under 0.4C (meaning that if you use the 1860 stations to estimate the northern hemisphere anomaly you'll be too hot by about 0.4C). You can see that as the number of stations increases and global coverage improves the error drops.
And more interesting still is the coverage bias error with error bars showing one standard deviation. As you might expect the error is much greater when there are fewer stations and settles down as the number increases. With lots of stations you get a mean error near 0 with very little variation: i.e. it's a good sample.
Now, to put all this together I take the mean coverage bias error for each year and use it to adjust the values from the Met Office data. This causes a small downward change which emphasizes that warming appears to have started around 1900. The adjusted data is the green line.
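The adjustment itself is trivial. Sketched with hypothetical inputs, it's just a subtraction, with a 95% band of ±1.96 standard deviations attached (assuming the errors are normally distributed):

```python
import numpy as np

def adjust_with_ci(anomalies, bias_mean, bias_sd):
    """Subtract the per-year mean coverage bias from the hemispheric
    anomalies and attach a 95% confidence interval (1.96 standard
    deviations, assuming normally distributed errors)."""
    adjusted = np.asarray(anomalies) - np.asarray(bias_mean)
    half_width = 1.96 * np.asarray(bias_sd)
    return adjusted, adjusted - half_width, adjusted + half_width
```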
Now if you plot just the adjusted data but put back in the error bars (and this time the error bars are 1.96 standard deviations since the published literature uses a 95% confidence interval) you get the following picture:
And now I'm worried because something's wrong, or at least something's different.
1. The published paper on HadCRUT3 doesn't show error bars anything like this for the 1800s. In fact the picture (below) shows almost no difference in the error range (green area) when the coverage is very, very small.
2. The paper doesn't talk about adjusting using the mean.
So I think there are two possibilities:
A. There's an error in the paper and I've managed to find it. I consider this a remote possibility and I'd be astonished if I'm actually right and the peer reviewed paper is wrong.
B. There's something wrong in my program in calculating the error range from the sub-sampling data.
If I am right and the paper is wrong there's a scary conclusion... take a look at the error bars for 1860 and scan your eyes right to the present day. The current temperature is within the error range for 1860 making it difficult to say that we know that it's hotter today than 150 years ago. The trend is clearly upwards but the limited coverage appears to say that we can't be sure.
So, dear readers, is there someone else out there who can double check my work? Go do the sub-sampling yourself and see if you can reproduce the published data. Read the paper and tell me the error of my ways.
UPDATE It suddenly occurred to me that the adjustment that they are probably using isn't the standard deviation but the standard error. I'll need to rerun the numbers to see what the shape looks like, but it should reduce the error bounds a lot.
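If that's right, the swap is just dividing by the square root of the number of reanalysis years being sub-sampled (roughly 50 for NCEP/NCAR, though I'm guessing at the exact count):

```python
import numpy as np

def standard_error(errors):
    """Standard error of the mean coverage bias: the sample standard
    deviation of the per-year sub-sampling errors divided by sqrt(n).
    With ~50 reanalysis years this shrinks the error bars by a factor
    of about 7."""
    errors = np.asarray(errors, dtype=float)
    return errors.std(ddof=1) / np.sqrt(errors.size)
```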
UPDATE Here's what the last graph looks like if I swap out the standard deviation for the standard error.
That's more like it. I'm going to guess that this is what Brohan et al. are doing (without saying it explicitly). But that doesn't explain why their error seems to remain constant. Can anyone help with that?
UPDATE The Met Office has replied to my email with an explanation of what's going on with the mean and standard deviation and I'll post it shortly.
UPDATE Please read this post which shows that my code contained an error in interpreting longitude which results in a chart that looks like the one from the Met Office.