Wednesday, January 26, 2011

A small bug in the ICCER C++ Code

I'm kicking myself for not spotting this (which happened because I didn't examine the code in detail to check the outputs), but there's an anomaly in the code released by the The Independent Climate Change Email Review.

It has been pointed out by Nick Barnes in a comment where he says: A colleague points out that the code doesn't appear to weight cells by area. And, he (or the colleague) is correct.

Here's a snippet from the Met Office code for making the time series from their data.
# Global average of field
sub area_average {

    my $Field = shift;

    # Calculate the area average
    my $W_tot_NH = 0;
    my $A_tot_NH = 0;
    my $W_tot_SH = 0;
    my $A_tot_SH = 0;
    for ( my $i = 0 ; $i < $Field->{rows} ; $i++ ) {

        #Grid-box area is proportional to cosine of latitude
        my $Lat = 90 - ( $i + 0.5 ) * 180 / $Field->{rows};
        if ( $Lat > 90 || $Lat < -90 ) { next; }
        my $Weight = cos( $Lat * 3.141592 / 180 );
        for ( my $j = 0 ; $j < $Field->{columns} ; $j++ ) {
            if ( $Field->{data}->[$i][$j] == $Field->{missing} ) { next; }
            if ( $Lat > 0 ) {
                $W_tot_NH += $Weight;
                $A_tot_NH += $Field->{data}->[$i][$j] * $Weight;
            }
            else {
                $W_tot_SH += $Weight;
                $A_tot_SH += $Field->{data}->[$i][$j] * $Weight;
            }

        }
    }

#if northern hemisphere or southern hemisphere average is undefined return nothing
    unless ( $W_tot_NH > 0 && $W_tot_SH > 0 ) { return; }

    #otherwise return the mean of the northern and southern hemisphere averages
    return ( $A_tot_NH / $W_tot_NH + $A_tot_SH / $W_tot_SH ) / 2;
}
And here's a snippet from my code to do the same thing:
# ----------------------------------------------------------------------------
# 4. Use the gridded data to calculate northern and southern
# hemisphere average temperature anomaly.  This is done by averaging
# the anomalies over all the grid squares in the two hemispheres on a
# monthly basis to produce a single figure for each.  The averaging is
# weighted by the cosine of the latitude of the midpoint of the grid.
# ----------------------------------------------------------------------------

# Both these hashes have the same shape.  They contain a total and
# count of anomalies from all the grid squares in their hemisphere for
# the specific month and year.
#
# e.g. $north{1899}{0}{count}
#      $north{1899}{0}{total}

my %north;
my %south;

my $pi = 3.141592;

foreach my $year (sort keys %grid) {
  foreach my $month (sort { $a <=> $b } keys %{$grid{$year}}) {
    foreach my $long (sort keys %{$grid{$year}{$month}}) {
      foreach my $lat (sort keys %{$grid{$year}{$month}{$long}}) {
 my $weight = cos(($lat+$grid_size/2)*$pi/180); # Perl's cos() needs radians
 if ( $lat >= 0 ) {
   $north{$year}{$month}{count} += 1;
   $north{$year}{$month}{total} +=
     $weight * $grid{$year}{$month}{$long}{$lat}{total} /
       $grid{$year}{$month}{$long}{$lat}{count};
   $north{$year}{$month}{weight} += $weight;
 } else {
   $south{$year}{$month}{count} += 1;
   $south{$year}{$month}{total} +=
     $weight * $grid{$year}{$month}{$long}{$lat}{total} /
       $grid{$year}{$month}{$long}{$lat}{count};
   $south{$year}{$month}{weight} += $weight;
 }
      }
    }
  }
}
In both those bits of code the anomaly average is weighted based on the cosine of the latitude of the grid square (to take into account the different area of the grid boxes at different latitudes). If you don't do this then latitudes near the poles would have a greater effect on the overall result than they should. This averaging is explained in Hemispheric and Large-Scale Surface Air Temperature Variations: An Extensive Revision and an Update to 2001 (Jones and Moberg, 2003). Unfortunately, the ICCER C++ code does not do this, it takes a simple average, so the latitudes near the poles will be overemphasized.
//Now we have the grid, average over cells
  for( int iy=0; iy < nyears; iy++ )
  {
   anomaly[iy] = 0 ;
   int ncell=0 ;
   for( int ilat=0; ilat < nlat ; ilat++ )
   {
    for( int ilon=0; ilon < nlon; ilon++ )
    {
     if( grid[iy][ilat][ilon] > -9990 ) {
      anomaly[iy] += grid[iy][ilat][ilon];
      ncell++ ;
     }
     else {
     }
    }
   }
[snip]
  }
It's quite easy for this sort of thing to happen. When I read through Brohan et al. 2006 I had to really dig in and follow references to make sure that I was doing precisely what was being described. Probably no surprise that the ICCER guys made this error, you'd have to read Jones and Moberg 2003 to see it. And now you probably want to know what difference that makes. Well here's a chart showing the two trends.
The warming trend shape doesn't change, but the temperature anomaly does alter. In recent years the unweighed average is greater than the weighted. For example, grabbing 1998 to 2008 at random the differences ranges between 0.02C and 0.07C with an average of 0.06C. So the upshot of the ICCER bug is that it makes things seem slightly warming.

2 comments:

Nick Barnes said...

Yes, David Jones gets the credit for spotting this. We will probably post something at the Climate Code Foundation blog.

Nick Barnes said...

Finally wrote this up at the Foundation, in a post trying to make some broader points about releasing code.