Sunday, October 31, 2010

The ICCER C++ Code

Early in September I received a message from Sir Muir Russell indicating that he and the Independent Climate Change Email Review folks were going to look into my request to release the C++ code they used to analyze GHCN records as part of the review.

This morning I received another email from Sir Muir indicating that the code had been released and that it was available from the review web site. Sure enough it appears that the code was released last week.

Today I downloaded it and got it running without much difficulty. The code is written in C++ and compiles easily with g++. The review didn't release a Makefile or similar build instructions so I quickly hacked one together to make building easy:

# Trivial Makefile to build a program called 'ghcn' from the code
# released by the ICCER for gridding/trending GHCN data

# The name of the program that will be generated
PROG := ghcn

.PHONY: all clean
all: $(PROG)
clean: ; @rm -f $(PROG)

HDRS := $(wildcard inc/*.h)

$(PROG): Analysis.cxx $(HDRS) ; @g++ -Wall -o $@ $<

Doing a make creates a program called ghcn that reads GHCN files (v2) and calculates anomalies, performs gridding and calculates the global time series.

There's a helpful README file that comes with the code which contains an important caveat: "The exploratory nature of this code is such that it does not follow design rules or optimisation as would be appropriate for production code and there is no comprehensive exception handling."

And, sure enough, the code is a bit messy and doesn't follow some good C++ practices. The comments in the code tend to be poor; there are entire classes implemented inside .h files (which is mentioned with the cryptic comment "The code is purposely not factorised into normal .cxx and .h files"); and there are a few #defines where a const would have been better. There's also an almost random handling of the special -9999 code for missing data: the code variously checks for -9990, -9999 or -1000, seemingly on the whim of the coder.
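To make the point about the missing-data code concrete, here is a minimal sketch of what consistent handling could look like: one const for the sentinel and one predicate that every check goes through. This is not taken from the ICCER code. The names kMissing, is_missing, to_celsius and mean_present are hypothetical, and the sketch assumes raw values are integer tenths of a degree Celsius with -9999 marking a missing month.

```cpp
#include <cassert>
#include <vector>

// Assumption: raw GHCN v2 values are integer tenths of a degree C and
// -9999 marks a missing month. A const replaces scattered magic numbers.
const int kMissing = -9999;

// Hypothetical helper: the single place that decides what "missing" means.
inline bool is_missing(int raw) { return raw == kMissing; }

// Convert a raw value to degrees C. Callers must filter missing values first.
inline double to_celsius(int raw) {
    assert(!is_missing(raw));
    return raw / 10.0;
}

// Mean in degrees C of the values that are present.
// Returns false (leaving mean_out untouched) if every value is missing.
bool mean_present(const std::vector<int>& raw, double& mean_out) {
    double sum = 0.0;
    int count = 0;
    for (int r : raw) {
        if (is_missing(r)) continue;
        sum += to_celsius(r);
        ++count;
    }
    if (count == 0) return false;
    mean_out = sum / count;
    return true;
}
```

With this shape a value such as -9990 is treated as an (implausible) temperature rather than silently dropped, so any looser rule would have to be written down in is_missing instead of being spread through the code.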

But, having said that, from reading it I don't think it would be a big step to go from it to something quite robust. The first step would be to write tests for the code so that its operation could be verified at a functional level. Note that I've compiled with warnings enabled (-Wall) and none are produced.

The only thing I saw that was obviously anomalous was that the code uses fifteen values in the date range 1961 to 1990 for calculating the normal, whereas the ICCER report said they used ten values from 1955 to 1995. Not that that'd make much difference.
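For readers unfamiliar with normals, the difference between the two rules can be sketched as follows. This is an illustration under stated assumptions, not the ICCER implementation: the Baseline struct, compute_anomalies and the kNoData sentinel are hypothetical names, and the series is assumed to hold one value per year for a single calendar month.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sentinel for "no value" in this sketch.
const double kNoData = -9999.0;

// A baseline rule: the year range used for the normal and the minimum
// number of present values required inside that range.
struct Baseline {
    int start_year;
    int end_year;
    int min_values;
};

// series[i] is the value for year first_year + i (one calendar month).
// Computes the normal as the mean of present values in the baseline range,
// then writes series minus normal into anomalies. Returns false, leaving
// anomalies untouched, if too few baseline values are present.
bool compute_anomalies(const std::vector<double>& series, int first_year,
                       const Baseline& rule, std::vector<double>& anomalies) {
    assert(rule.start_year <= rule.end_year);
    double sum = 0.0;
    int count = 0;
    for (std::vector<double>::size_type i = 0; i < series.size(); ++i) {
        const int year = first_year + static_cast<int>(i);
        if (year < rule.start_year || year > rule.end_year) continue;
        if (series[i] == kNoData) continue;
        sum += series[i];
        ++count;
    }
    if (count < rule.min_values) return false;

    const double normal = sum / count;
    anomalies.assign(series.size(), kNoData);
    for (std::vector<double>::size_type i = 0; i < series.size(); ++i) {
        if (series[i] != kNoData) anomalies[i] = series[i] - normal;
    }
    return true;
}
```

Under these assumptions the rule in the code would be Baseline{1961, 1990, 15} and the rule described in the report would be Baseline{1955, 1995, 10}. A station with only ten values in the 1960s is rejected by the first and accepted by the second, which is the kind of station the discrepancy affects.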

The program fills an array with the yearly temperature trend. Here's what that looks like when charted:

5 comments:

sleepalot said...

So, what do you think of the results?

(And what does the graph purport to show?)

John Graham-Cumming said...

I think the code does what it is supposed to do.

The chart shows the temperature trend relative to a 1961-1990 average baseline, with temperatures increasing in recent years.

Nick Barnes said...

I must have been asleep in the week this came out. FWIW, my views of this code concur with yours.

Nick Barnes said...

Hmm. A colleague points out that the code doesn't appear to weight cells by area. Oops. Of course, you get much the same chart if you do.

John Graham-Cumming said...

Thanks, Nick. I wrote up the bug as a blog post.

http://blog.jgc.org/2011/01/small-bug-in-iccer-c-code.html