Sunday, 29 November 2009

CRU 'Climategate' Code Analysis

I'm sure you've all seen the news articles and blog posts about the 'hacked' CRU data, but no one's actually explained what these people have done. This is the first 'code in context' post, designed to thoroughly and accurately show exactly what manipulation the CRU 'scientists' (in the loosest sense of the word) have performed.

Also, unlike the CRU and NASA scientists, I will publish my source and methods so that you can recreate the results if you want to.

First off... all the files can be downloaded from the Ironic Surrealism blog here.

Secondly, if you're on a Linux/Mac system, you'll already have 'grep' available on the command line (if you don't know what this is, grep lets you search for certain strings inside multiple files). If you're on Windows, then download Cygwin or grep for Windows.

At the command line, go to the directory containing the CRU folder. For me this was

cd Downloads/FOIA

Then the command to use is

find . -type f -exec grep -iH 'decline' {} \;

This will search every file inside every folder in the FOIA folder, and print out any file path and text which contains 'decline'. You can change the text inside the quotation marks to search for whatever you want - my personal recommendations are 'fudge', 'artificial', 'correction', 'delete' and 'crap'.
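If you can't get find/grep working, the same search can be sketched in a few lines of Python. This is only a rough equivalent of the command above, not a replacement for it: the function name search_tree is my own invention, and it simply walks a folder and reports every line containing the keyword, ignoring case.

```python
import os
import re


def search_tree(root, keyword):
    """Yield (path, line) for every line under root that contains keyword.

    Matching ignores case, like grep -i. The function name is hypothetical.
    """
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps binary files from raising decode errors
                with open(path, "r", errors="replace") as handle:
                    for line in handle:
                        if pattern.search(line):
                            yield path, line.rstrip("\n")
            except OSError:
                continue  # unreadable file: skip it and carry on
```

Run from inside the FOIA folder, something like `for path, line in search_tree('.', 'fudge'): print(path, line)` should list the same kind of hits as the grep command.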

Let's start with 'correction' - one of the files that pops up is

./harris-tree/briffa_sep98_e.pro

with the example line

;****** APPLIES A VERY ARTIFICIAL CORRECTION FOR DECLINE*********

This is a good place to start - in the context of a research center for 'global warming', this comment requires a bit of clarification!...

Here are the first 12 lines from harris-tree/briffa_sep98_e.pro:

;
; PLOTS 'ALL' REGION MXD timeseries from age banded and from hugershoff
; standardised datasets.
; Reads Harry's regional timeseries and outputs the 1600-1992 portion
; with missing values set appropriately. Uses mxd, and just the
; "all band" timeseries
;****** APPLIES A VERY ARTIFICIAL CORRECTION FOR DECLINE*********
;
yrloc=[1400,findgen(19)*5.+1904]
valadj=[0.,0.,0.,0.,0.,-0.1,-0.25,-0.3,0.,-0.1,0.3,0.8,1.2,1.7,2.5,2.6,2.6,$
2.6,2.6,2.6]*0.75 ; fudge factor
if n_elements(yrloc) ne n_elements(valadj) then message,'Oooops!'


This could be worth following up! The variable valadj is assigned an array (or 'vector' in IDL language) of hard-coded 'corrections', which are then multiplied by 0.75, followed by the comment '; fudge factor'.
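To see which year each 'correction' belongs to, the two arrays can be rebuilt outside IDL. The sketch below is plain Python, not the CRU code. The only thing it relies on is that findgen(19) produces 0.0, 1.0, ..., 18.0, and the variable name raw is mine.

```python
# Rebuild yrloc and valadj from briffa_sep98_e.pro in plain Python.
# findgen(19) in IDL yields the floats 0.0, 1.0, ..., 18.0.
yrloc = [1400.0] + [i * 5.0 + 1904.0 for i in range(19)]

# 'raw' is a hypothetical name for the hard-coded list before scaling.
raw = [0., 0., 0., 0., 0., -0.1, -0.25, -0.3, 0., -0.1, 0.3, 0.8, 1.2,
       1.7, 2.5, 2.6, 2.6, 2.6, 2.6, 2.6]
valadj = [v * 0.75 for v in raw]  # the '; fudge factor' scaling

# Same sanity check as the IDL 'Oooops!' line.
assert len(yrloc) == len(valadj)

# Print one line per knot: year, then the scaled adjustment.
for year, adj in zip(yrloc, valadj):
    print(f"{year:6.0f}  {adj:+.4f}")
```

So the knots sit at 1400 and then every five years from 1904 to 1994. The adjustment is zero up to 1919, slightly negative around 1924-1944, and climbs to 2.6*0.75 = 1.95 from 1974 onwards.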

That's pretty damning evidence, but they haven't actually used it for anything here, so let's see where valadj is used in the code...

./harris-tree/briffa_sep98_e.pro, lines 54-58
;
; APPLY ARTIFICIAL CORRECTION
;
yearlyadj=interpol(valadj,yrloc,x)
densall=densall+yearlyadj


Here we go! Every time the program is run, the hard-coded 'corrections' in valadj are interpolated from the years in yrloc onto the points in x, and the result is added to the computed data. Fraud! This is unambiguous proof that CRU have been manually covering up the real temperature values to suit their own agenda! Exactly the same code can be found performing the same duties in lines 82-86 in the same file.
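Here's a rough Python sketch of what those two IDL lines do, so you can try it without IDL. Two things are assumptions on my part, because the snippet doesn't show them: that x is a yearly time axis (I use 1400-1994), and the contents of densall, which I fill with zeros purely as a placeholder. IDL's interpol does linear interpolation by default. My stand-in below does the same but refuses to extrapolate.

```python
def interpol(values, knots, xs):
    """Linear interpolation of (knots, values) at each x in xs.

    Stand-in for IDL's default interpol. Assumes knots are strictly
    increasing and raises if an x falls outside them.
    """
    out = []
    for x in xs:
        for k in range(len(knots) - 1):
            x0, x1 = knots[k], knots[k + 1]
            if x0 <= x <= x1:
                frac = (x - x0) / (x1 - x0)
                out.append(values[k] + frac * (values[k + 1] - values[k]))
                break
        else:
            raise ValueError(f"{x} is outside the knot range")
    return out


yrloc = [1400.0] + [i * 5.0 + 1904.0 for i in range(19)]
valadj = [v * 0.75 for v in [0., 0., 0., 0., 0., -0.1, -0.25, -0.3, 0., -0.1,
                             0.3, 0.8, 1.2, 1.7, 2.5, 2.6, 2.6, 2.6, 2.6, 2.6]]

x = [float(y) for y in range(1400, 1995)]  # ASSUMPTION: x is a yearly axis
densall = [0.0] * len(x)                   # ASSUMPTION: placeholder, not CRU data

yearlyadj = interpol(valadj, yrloc, x)                  # yearlyadj=interpol(valadj,yrloc,x)
densall = [d + a for d, a in zip(densall, yearlyadj)]   # densall=densall+yearlyadj
```

With the placeholder zeros, densall ends up holding just the yearly adjustment, which makes it easy to plot or print and see its shape.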

Also worth noting is that despite the code being fairly well documented, there are no comments saying 'this is legitimate' etc. This is undeniable fraud.

Next up....




Another file that crops up is ./osborn-tree6/summer_modes/pl_decline.pro, with the comment

Now fit a 2nd degree polynomial to the decline series, and then extend

This is not incriminating in itself, but is worth investigating further. Here's the full snippet, lines 199-209:

;
; Now fit a 2nd degree polynomial to the decline series, and then extend
; it at a constant level back to 1400. In fact we compute its mean over
; 1856-1930 and use this as the constant level from 1400 to 1930. The
; polynomial is fitted over 1930-1994, forced to have the constant value
; in 1930.
;
if matchvar eq 0 then ycon=1900 else ycon=1930 ; normally 1930
declinets=fltarr(mxdnyr)*!values.f_nan
allx=timey(ktem)
ally=difflow


Now we get to another fudge, but this one's hidden slightly better! Whoever wrote this is averaging the values over 1856-1930 (per the comment), and using that average as a constant level all the way back to 1400!

This programmer is setting all of the old temperature data, which scientists have spent years collecting from tree rings/ice cores, to a constant value!
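To make the comment in that snippet concrete, here's a small Python sketch of the recipe it describes: take the mean over 1856-1930, hold it constant back to 1400, and fit a 2nd degree polynomial from 1930 onwards that is forced through the constant value in 1930. This is my own reconstruction from the comment only. The function names, the way the polynomial is 'forced' (fitting cval + a*t + b*t^2 with t measured from 1930) and the input numbers are all assumptions, not the CRU code or data.

```python
def fit_forced_quadratic(years, values, ycon, cval):
    """Least-squares fit of cval + a*t + b*t**2, where t = year - ycon.

    Having no free constant term forces the curve to equal cval at ycon.
    ASSUMPTION: this is one plausible reading of 'forced', not CRU's method.
    """
    s11 = s12 = s22 = r1 = r2 = 0.0
    for yr, v in zip(years, values):
        t = yr - ycon
        if t < 0:
            continue  # only years from ycon onwards enter the fit
        d = v - cval
        s11 += t ** 2
        s12 += t ** 3
        s22 += t ** 4
        r1 += t * d
        r2 += t ** 2 * d
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det  # solve the 2x2 normal equations
    b = (s11 * r2 - s12 * r1) / det
    return a, b


def decline_series(years, a, b, ycon, cval):
    """Constant cval up to ycon, then the fitted quadratic."""
    return [cval if yr <= ycon else cval + a * (yr - ycon) + b * (yr - ycon) ** 2
            for yr in years]


# Made-up input: flat until 1930, then curving downwards. NOT real data.
years = list(range(1856, 1995))
values = [0.0 if yr <= 1930 else -0.001 * (yr - 1930) ** 2 for yr in years]

ycon = 1930
window = [v for yr, v in zip(years, values) if 1856 <= yr <= ycon]
cval = sum(window) / len(window)  # mean over 1856-1930

a, b = fit_forced_quadratic(years, values, ycon, cval)
declinets = decline_series(list(range(1400, 1995)), a, b, ycon, cval)
```

With this made-up input the fit recovers the curve it was fed, so declinets is flat from 1400 to 1930 and then falls away.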

This seems very strange, but let's go back a little bit. The CRU data was the main data source that the IPCC used in its reports over the last 10-15 years. Does anyone remember the 'hockey-stick curve'?

(Re-printed below for completeness - this was stolen from the illconsidered blog)



Notice anything suspect?
The 'linear trend' appears to go up to the year 1900, followed by an apparent massive increase in the average temperature.

Let's recap. The first part of this post showed that the modern data was being artificially increased by hard-coded values, and the second part showed that the long-term historical data was being 'flattened'.

So, add a little bit of random noise, and there's the hockey stick curve created, with complete disregard for scientific method.

But... we haven't talked about the code in the 'flattening' case (in file ./osborn-tree6/summer_modes/pl_decline.pro). The variable declinets is created in the code snippet above. fltarr() creates an array of zeros, and multiplying it by !values.f_nan turns every element into NaN, so declinets starts out as all 'missing' values. declinets then has a few operations performed on it (you can check them out for yourself), before being involved in this spectacular piece of code:

;
; Now apply a completely artificial adjustment for the decline
; (only where coefficient is positive!)
;
tfac=declinets-cval
fdcorrect=fdcalib
for iyr = 0 , mxdnyr-1 do begin
fdcorrect(*,*,iyr)=fdcorrect(*,*,iyr)-tfac(iyr)*(zcoeff(*,*) > 0.)
endfor


So in essence, take the decline series, subtract cval (cval=total(ally(kconst))/float(nconst) :: line 212), and call it tfac. Then, for each year, subtract tfac times the z co-efficients from the fdcorrect data, but (as the comment says) only where the co-efficient is positive! In IDL, '>' is the maximum operator, so (zcoeff(*,*) > 0.) replaces every negative co-efficient with zero.
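The loop is easier to follow with toy numbers. Below is a rough Python sketch of the same arithmetic using nested lists in place of IDL arrays. The function name and every number in it are made up for illustration; none of it is CRU data. The max(z, 0.0) call plays the role of IDL's '> 0.'.

```python
def apply_adjustment(fdcalib, declinets, cval, zcoeff):
    """Return fdcalib[i][j][iyr] - tfac[iyr] * max(zcoeff[i][j], 0).

    Hypothetical helper mirroring the IDL loop. Grid cells with a
    negative co-efficient are left unchanged.
    """
    tfac = [d - cval for d in declinets]                      # tfac=declinets-cval
    fdcorrect = [[list(series) for series in row] for row in fdcalib]  # copy
    for iyr, t in enumerate(tfac):
        for i, row in enumerate(zcoeff):
            for j, z in enumerate(row):
                fdcorrect[i][j][iyr] -= t * max(z, 0.0)
    return fdcorrect


# Toy input: a 1x2 grid with 3 years of data. All values are made up.
fdcalib = [[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]
declinets = [0.0, -0.5, -1.0]
cval = 0.0
zcoeff = [[2.0, -3.0]]  # first cell positive, second cell negative

fdcorrect = apply_adjustment(fdcalib, declinets, cval, zcoeff)
print(fdcorrect)
```

In this toy run the cell with the positive co-efficient goes from 1.0, 1.0, 1.0 to 1.0, 2.0, 3.0, while the cell with the negative co-efficient is left untouched.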

It's difficult to go further, as we don't have any of the real values that were input here. However, since this is the 'adjustment for the decline', we can assume that tfac is negative in the decline years (the decline series sitting below its constant level cval). Subtracting a negative tfac times a positive co-efficient then pushes the fdcorrect values up.

It's amazing that they do this despite being trained scientists! It's worth pointing out that the first code snippets were from 'harris-tree', and the second from 'osborn-tree'. After a quick search of the UEA website, we have the candidates Tim Osborn of the CRU, and Mr. Ian (Harry) Harris of the CRU, who even quotes 'data manipulation' as one of his research areas!



I could go on today, but there's enough detail here for you to make your own investigations. Stay tuned, I'll try and figure some more stuff out in between Call of Duty 2 sessions!

Cheers
Clive E