James' Empty Blog: ipcc

Thursday, October 11, 2018

That IPCC thing

Hello faithful blog readers. After a long absence I'm going to try to post some things again. Got a small backlog of ideas to write about and a little bit of free time to work at them.

I'll start with the tedious 1.5C nonsense since I can sense that you are all (both?) desperate for my opinions. The simple and efficient way for the IPCC to have responded to the stupid proposal from politicians that, having made no progress towards limiting global warming to 2C, they would instead really really try really really hard to limit it to 1.5C honest guv we really really mean it this time, would have been to say "No you won't, you are just pretending. Now go away and do something constructive rather than delaying and passing the buck". That would have saved a lot of money and CO2 emissions. But instead, we get lots of meetings, papers, scientists pontificating on the radio and ridiculous misrepresentations(*) like the Grauniad saying we have 12 years to save the planet. Back in 2008 we had less than a decade, so that's a step in the right direction.

What. A. Waste. Of. Time. And. Effort.

Perhaps Douglas Adams put it best when he said

"I love deadlines. I love the whooshing noise they make as they go by."

(*) misrepresentation of the reality. I really don't much care whether it's the IPCC's fault for inviting this sort of interpretation, or the Grauniad for being clueless, or someone in the middle causing confusion. It's a gift to denialists either way.

Saturday, March 08, 2014

"Pause" blah

I've got lots of bits and pieces to write about, so this will probably turn out to be a fairly incoherent blog post as I don't have time to write a concise and structured one. We are still struggling to obtain a proper broadband connection, though the end (one way or the other) might be in sight with a BT engineer visit planned for next week.

We had a brief visit to Bristol last week - sorry to those we had promised to contact, but it was a chance to see two different people on the same rather busy day, so we jumped at it. We are now both officially "visiting collaborators" of some sort there which is nice, the main practical benefit is library access (inc remotely), and perhaps also the right to a cup of coffee in the staff common room? So hopefully we'll be back on an occasional basis in the future. This is all as a follow-up to Paul Valdes' sabbatical visit to Japan last year, from which joint work is still ongoing.

The "pause" discussion continues (see RC for a summary of recent coverage), which seems a bit silly to me, because it isn't really a "pause" at all, just a continued anthropogenically-forced warming with some other (anthropogenic and natural) forcings and internal variability added on, such that the trend is a little lower than most expected. Of course idiots will continue to play the "down the up escalator" game indefinitely, but I don't feel the need to play with them. I'm usually happy to let the "communicators" duke it out on the most politically correct way to present the science, but perhaps they could start by not using a term that's factually wrong.

There are many possible causes for the model-data discrepancy: the forcings might have been more negative than anticipated, or perhaps natural variability has been a bit more negative recently, and just possibly the forced response is a little lower than (most) models predicted. I'm a little surprised to see people like Gavin apparently nailing his colours to the mast of the models being right. For one thing, his calculations (which may be mildly optimistic) only explain "most" of the model-data discrepancy, and it is worth noting that since the natural forcings and internal variability are relatively transient and short-term in nature, this view implies a substantial future near-term acceleration in order for the world to catch up with where the models say we should be. All model simulations show only a very gradual rise in underlying trend, and it is worth mentioning (to those who think that climate scientists have been slow to discuss this) that back in about 2006 I was pointing out to the authors of the IPCC AR4 drafts that the model trends were already starting to look a little high relative to recent observations. This acceleration has been promised "just around the corner" for a long time now. I'm happy to give people like Hansen a bit of a pass on his 1984 work because it was so groundbreaking (and substantially correct), but it's now starting to feel like people are scrabbling around trying to find excuses. I even think I saw some wag in a recent paper (sorry I forgot where) arguing that there were so many excuses for a lack of warming, that the logical conclusion from the model-data discrepancy was that sensitivity was actually higher than the models say!

By the way, one point that is sometimes ignored in these recent energy balance type of calculations, is that some of the analyses (ie, those based on D&A techniques) aim to specifically separate out the different forcings though the different warming patterns they generate. So it is not enough to claim that there are additional negative forcings, but these forcings actually have to generate a spatial warming pattern that negates the well-known pattern of GHG response. Or else, the model patterns of response to the different forcings have to be wrong in a way that leads to a large systematic underestimation of the GHG impact. It's not impossible, but at some point Occam's razor has to kick in.

Oh yes, the GWPF thing has also been published. Disclaimer: those who actually read the thing (which seems to be a small minority, judging from my in-box) will see that I'm acknowledged, which is due to having acted as a reviewer. I haven't carefully checked the final version, but on the whole I saw it as a slightly optimistic but basically defensible interpretation of the evidence. There have certainly been worse papers published on climate sensitivity! I haven't seen any very convincing rebuttals, but am of course open to further discussion on that score. I'm sort of assuming it's been on the blogosphere but haven't had much time (or internet) to look. Some have found it notable that the GWPF is explicitly acknowledging a significant future warming (albeit at the low end of IPCC projections) thanks to ongoing emissions. I'm not sure how much of a narrowing of disagreement that represents, especially when others are doubling down on a high sensitivity.

BTW, for those who accuse me of being in the pay of the fossil fuel industry - yes, the trip to Bristol was funded by a major oil company :-)

Having strayed into tl;dr territory (some time ago!) I'd better stop here.

Wednesday, September 25, 2013

First!

For various reasons mostly related to the IPCC rumour-mill, the "hiatus" (seems to be the politically correct term these days) in global temperature is in the news again. Which brings to mind this manuscript which was rejected by GRL a few years ago (and which I just put on the arXiv a few days ago):

Our results indicate cause for concern regarding the consistency between climate model projections and observed climate behavior under conditions of increasing anthropogenic greenhouse-gas emissions.

The analysis was extensively discussed back at the time, and the paper submitted to (and rejected by) GRL at about the same time. From memory, it got quite an involved treatment from the reviewers. Rejection from GRL isn't something I can get too worked up over. I'm confident that the paper was fundamentally correct, worth publishing, and that it would have had plenty of impact. However, the peer review filter is pretty noisy at journals like GRL with high rejection rates, and decisions can't be parsed too finely. I did subsequently encourage submission to other journals, but for various reasons that didn't happen. Of course it's easy for a minor author to encourage other people to do the work for a new submission :-) In case it isn't already clear, my listing as last author is not an indication that I'm the Machiavellian brains masterminding this nefarious plot to discredit climate models, but instead just a fair reflection on the minor magnitude of my contribution.

3 years later, it seems reasonable to conclude that our main error was merely in being several years ahead of the rest of the field.

Monday, May 20, 2013

More on that recent sensitivity paper

Now I'm embarrassed at my naivety...it is all as clear as day. The story goes as follows:

Way back in the mists of time (well, about 2011 or so) the IPCC authors agreed that the "likely" value for the equilibrium climate sensitivity was 2-4.5C. They then wrote the first draft to match, which was easy enough as they seemed to be unaware of most of the recent literature on the matter, and could easily brush off the few papers they did know about (like ours) as outliers.

Inconveniently for them, the observations of the planetary energy balance are actually incompatible with their preferred choice, and as well as some reviewers telling them about the papers that had already appeared, more papers continued to be written - too many to be just ignored this time. So that left them with a bit of a credibility gap.

The brilliant solution they have come up with is to write a paper on the planetary energy balance, which in numerical terms of course basically confirms what all the recent papers have said, but describe this with the phrasing that their result "is in agreement with earlier estimates, within the limits of uncertainty." (Here, "earlier" clearly refers to papers which do not use the last decade of data, ie those up to around AR4 time). Thus, this paper can be cited as support for the 2-4.5C "likely" range! They've even got one of their loudest critics, Nic Lewis, to agree with this!

Whoever came up with that wording certainly deserves a Nobel prize...for chutzpah. I suspect that Nic may regret putting his name to it, although he could argue - with some justification - that the numerical results should outweigh the verbal gymnastics.

Note by the way that it's not just the recent decade of data that points to a more moderate sensitivity estimate. For example, back in 2000, Forest et al generated a 90% range of 1.3-4.2C, when they used an expert prior - but at that time, the IPCC experts had all decided that a uniform prior was the correct approach.

Sunday, May 19, 2013

A chink of light at the end of the tunnel?

At last the great and the good have spoken. There's a new article in Nature Geoscience (here, but the bulk of the details are in the SI which seems to not be paywalled) concerning the energy balance of the climate system, which basically confirms what had already been presented in the slew of recent papers pointing to a lowering of climate sensitivity estimates.

The analysis itself is not particularly novel or exciting: what makes it newsworthy in my view is the list of authors, which includes some who had previously been trying to talk down these recent estimates (e.g. Knutti: "my personal view is that the overall assessment hasn’t changed much"). Even though this paper is too late for the IPCC AR5, I hope it reflects a change in thinking from the IPCC authors involved. (Notable also that Nic Lewis is involved.)

The results are described in rather strange terms, considering what they have actually presented. They argue that the new result for sensitivity "is in agreement with earlier estimates, within the limits of uncertainty". But of course none of the published estimates are inconsistent with each other in the sense of having non-overlapping uncertainty ranges - no-one credible has excluded a value of about 2.5C, that I am aware of. The contrasting claim that the analysis of transient response gives a qualitatively different outcome (being somewhat lower than both the previous IPCC assessment, and the range obtained from GCMs) is just weird, since both their ECS and TCR results are markedly lower than the IPCC and GCM ranges.

This looks like a pretty unreasonable attempt to spin the result as nothing new for sensitivity, when it is clearly something very new indeed from these authors, and implies a marked lowering of the IPCC "likely" range. Although the paper does not explicitly mention it, the "likely" range for equilibrium climate sensitivity using the full 40y of data seems to be about 1.3-3C (reading off the graph by eye, the lower end may be off a bit due to the nonlinear scale). So although the analysis does depend on a few approximations and simplifications, it's hard to see how they could continue to defend the 2-4.5C range.

Update: post by Nic Lewis here, also coverage in NewScientist.

Friday, February 01, 2013

A sensitive matter

So, sensitivity has been in the climate blogosphere a bit recently. Just a few days ago, that odd Norwegian press release got some people excited, but it's not clear what it really means. There is an Aldrin et al paper, published some time ago - which gave a decent constraint on climate sensitivity, though nothing particularly surprising or interesting IMO. We thought we had sorted out the sensitivity kerfuffle several years ago, but it seems that the rest of the world still hasn't yet caught up. As I said to Andy Revkin (and he published on his blog), the additional decade of temperature data from 2000 onwards (even the AR4 estimates typically ignored the post-2000 years) can only work to reduce estimates of sensitivity, and that's before we even consider the reduction in estimates of negative aerosol forcing, and additional forcing from black carbon (the latter being very new, is not included in any calculations AIUI). It's increasingly difficult to reconcile a high climate sensitivity (say over 4C) with the observational evidence for the planetary energy balance over the industrial era. But the Norwegian press release seems to refer to as yet unpublished research, and some of the claims seem a bit hard to credit. So we will have to wait for more details before drawing any more solid conclusions.

Before then, there was the minor blogstorm (at least in some quarters) surrounding Nic Lewis' criticism of the IPCC's stubborn adherence to their old estimate of climate sensitivity. This, of course, being despite the additional evidence which I've just mentioned above.

When I looked at the IPCC drafts, I didn't actually notice the substantial change in estimated aerosol uncertainty that Nic focussed on. With limited time and energy to wade through several hundred pages of draft material, I mostly looked for how and where they had (or had not, but perhaps should have) referred to my work, to make sure it was fairly and accurately represented. I was pretty unimpressed with some parts of the first draft, actually, and made a number of suggestions. Of course in line with the IPCC conditions, I'm not going to say what was or was not in any draft. According to IPCC policy, my comments will all be available in the fullness of time, but I have also criticised this delayed release so in the spirit of openness here is one comment I made about their discussion of sensitivity in Chapter 12 (p55 in the first order draft):

It seems very odd to portray our work as an outlier here. Sokolov et al 2009, Urban and Keller 2010, Olson et al (in press JGR) have also recently presented similar results (and there may be more as yet unpublished, eg Aldrin at the INI meeting back in 2010). Such "observationally constrained pdfs" were all the rage a few years ago and featured heavily in the last IPCC report, there is no clear explanation for your sudden dismissal of them in favour of what seems to be a small private opinion poll. A more balanced presentation could be: "Annan and Hargreaves (2011a) criticize the use of uniform priors and argue that sensitivities above 4.5°C are extremely unlikely (less than 5%). Similar results have been obtained by a number of other researchers [add citations from the above]."

Note for the avoidance of any doubt I am not quoting directly from the unquotable IPCC draft, but only repeating my own comment on it. However, those who have read the second draft of Chapter 12 will realise why I previously said I thought the report was improved :-) Of course there is no guarantee as to what will remain in the final report, which for all the talk of extensive reviews, is not even seen by the proletariat, let alone opened to their comments, prior to its final publication. The paper I refer to as a "small private opinion poll" is of course the Zickfeld et al PNAS paper. The pollees in the Zickfeld paper are largely the self-same people responsible for the largely bogus analyses that I've criticised over recent years, and which even if they were valid then, are certainly outdated now. Interestingly, one of them stated quite openly in a meeting I attended a few years ago that he deliberately lied in this sort of elicitation exercise (i.e. exaggerating the probability of high sensitivity) in order to help motivate political action. Of course, there may be others who lie in the other direction, which is why it seems bizarre that the IPCC appeared to rely so heavily on this paper to justify their choice, rather than relying on published quantitative analyses of observational data. Since the IPCC can no longer defend their old analyses in any meaningful manner, it seems they have to resort to an unsupported "this is what we think, because we asked our pals". It's essentially the Lindzen strategy in reverse: having firmly wedded themselves to their politically convenient long tail of high values, their response to new evidence is little more than sticking their fingers in their ears and singing "la la la I can't hear you".

Of course, this still leaves open the question of what the new evidence actually does mean for climate sensitivity. I have mentioned above several analyses that are fairly up to date. I have some doubts about Nic Lewis' analysis, as I think some of his choices are dubious and will have acted to underestimate the true sensitivity somewhat. For example, his choice of ocean heat uptake is based on taking a short term trend over a period in which the observed warming is markedly lower than the longer-term multidecadal value. I don't think this is necessarily a deliberate cherry-pick, any more than previous analyses running up to the year 2000 were (the last decade is a natural enough choice to have made) but it does have unfortunate consequences. Irrespective of what one thinks about aerosol forcing, it would be hard to argue that the rate of net forcing increase and/or over-all radiative imbalance has actually dropped markedly in recent years, so any change in net heat uptake can only be reasonably attributed to a bit of natural variability or observational uncertainty. Lewis has also adjusted the aerosol forcing according to his opinion of which values are preferred - coincidentally, he comes down on the side of an answer that gives a lower sensitivity. His results might be more reasonable if he had at least explored the sensitivity of his result to the assumptions made. Using the last 30y of ocean heat data and simply adopting the official IPCC forcing values rather than his modified versions (since after all, his main point is to criticise the lack of coherence in the IPCC report itself) would add credibility to his analysis. A still better approach would be to use a model capable of representing the transient change, and fitting it to the entire time series of the various relevant observations. Which is what people like Aldrin et al have done, of course, and which is why I think their results are superior.
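To see why the choices of heat uptake and aerosol forcing matter so much, here is a minimal sketch of the standard single-equation energy balance estimate, S = F_2x * dT / (dF - dQ). It is not Lewis' actual method, and every number in it is an illustrative round value I have made up for the purpose (including the conventional 3.7 W/m^2 per doubling), not a figure from any paper or IPCC draft.

```python
# Toy energy balance estimate of equilibrium sensitivity.
# All input numbers are ILLUSTRATIVE ASSUMPTIONS, not published values.

F_2X = 3.7  # forcing per CO2 doubling in W/m^2 (conventional round value)


def energy_balance_sensitivity(d_temp, d_forcing, d_heat_uptake, f_2x=F_2X):
    """Return S = F_2x * dT / (dF - dQ).

    d_temp        : change in global mean temperature (K)
    d_forcing     : change in net radiative forcing (W/m^2)
    d_heat_uptake : change in planetary heat uptake, mostly ocean (W/m^2)
    """
    net = d_forcing - d_heat_uptake
    if net <= 0:
        raise ValueError("forcing must exceed heat uptake for a finite estimate")
    return f_2x * d_temp / net


# Hypothetical baseline inputs.
baseline = energy_balance_sensitivity(d_temp=0.8, d_forcing=2.0, d_heat_uptake=0.6)

# Same inputs, but heat uptake taken from a period of weaker uptake.
low_uptake = energy_balance_sensitivity(d_temp=0.8, d_forcing=2.0, d_heat_uptake=0.4)

# Same inputs, but a more negative aerosol forcing (so a smaller net forcing).
strong_aerosol = energy_balance_sensitivity(d_temp=0.8, d_forcing=1.6, d_heat_uptake=0.6)

for name, value in [("baseline", baseline),
                    ("low heat uptake", low_uptake),
                    ("stronger aerosol cooling", strong_aerosol)]:
    print(f"{name}: S = {value:.2f} C")
```

A lower heat uptake pulls the estimate down and a more negative aerosol forcing pushes it up, which is why it seems reasonable to ask for the result to be reported across a range of such choices rather than for one preferred combination.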

But the point stands, that the IPCC's sensitivity estimate cannot readily be reconciled with forcing estimates and observational data. All the recent literature that approaches the question from this angle comes up with similar answers, including the papers I mentioned above. By failing to meet this problem head-on, the IPCC authors now find themselves in a bit of a pickle. I expect them to brazen it out, on the grounds that they are the experts and are quite capable of squaring the circle before breakfast if need be. But in doing so, they risk being seen as not so much summarising scientific progress, but obstructing it.

There's a nice example of this in Reto Knutti's comment featured by Revkin. While he starts out by agreeing that estimates based on the energy balance have to be coming down, he then goes on to argue that now (after a decade or more of generating and using them) he doesn't trust the calculations because these Bayesian estimates are all too sensitive to the prior choices. That seems to me to be precisely contradicted by all the available literature, which demonstrates that so long as absurd priors are avoided, the results are actually remarkably robust. Our own Climatic Change paper, Salvador Pueyo, Aldrin and the other papers above all use a wide range of different priors based on a range of different arguments but still arrive at very similar answers (at least, similar enough in the context of the hypothetical "long tail" for the pdf of climate sensitivity)! It looks rather like the IPCC authors have invented this meme as some sort of talismanic mantra to defend themselves against having to actually deal with the recent literature.
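For anyone who wants to poke at the prior question themselves, here is a toy grid calculation. It is a sketch under stated assumptions and not a reproduction of any of the papers mentioned: the likelihood is a made-up Gaussian constraint on the feedback parameter (F_2x / S), and the two priors are a uniform prior with an arbitrary upper limit and a hypothetical heavy-tailed alternative.

```python
from math import exp

# ASSUMPTIONS (all hypothetical): the observations constrain the feedback
# parameter lambda = F_2X / S with a Gaussian error, which leaves a slowly
# decaying tail in the likelihood for large S.
F_2X = 3.7                          # W/m^2 per doubling (conventional value)
LAMBDA_OBS, LAMBDA_SD = 1.23, 0.35  # made-up "observed" feedback and its error


def likelihood(s):
    z = (F_2X / s - LAMBDA_OBS) / LAMBDA_SD
    return exp(-0.5 * z * z)


def percentile_95(prior, upper, step=0.01):
    """95th percentile of the posterior on a regular grid over (0, upper]."""
    grid = [step * i for i in range(1, int(round(upper / step)) + 1)]
    weights = [prior(s) * likelihood(s) for s in grid]
    total = sum(weights)
    running = 0.0
    for s, w in zip(grid, weights):
        running += w
        if running >= 0.95 * total:
            return s
    return grid[-1]


def uniform(s):
    # Flat prior; all the information is in where it gets cut off.
    return 1.0


def heavy_tailed(s):
    # Hypothetical Cauchy-shaped prior, centred at 2.5C with scale 3C.
    return 1.0 / (1.0 + ((s - 2.5) / 3.0) ** 2)


p95_u10 = percentile_95(uniform, upper=10.0)
p95_u20 = percentile_95(uniform, upper=20.0)
p95_ht = percentile_95(heavy_tailed, upper=20.0)

print("uniform on (0, 10]:", p95_u10)
print("uniform on (0, 20]:", p95_u20)
print("heavy-tailed prior:", p95_ht)
```

In this toy setup the upper bound from the uniform prior moves a long way when the arbitrary cut-off is changed, while the heavy-tailed prior gives a lower bound than the wide uniform one. That is the sense in which the uniform prior is the odd one out here, rather than Bayesian estimates in general being hopelessly prior-dependent.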

Thursday, February 23, 2012

A(nother) climate sensitivity estimate using Bayesian fusion of instrumental observations and an Earth System model

The mode of the climate sensitivity estimate is 2.8C, with the corresponding 95% credible interval ranging from 1.8 to 4.9C.

Note that this is a 95% interval, meaning that the upper bound is at the 97.5% level rather than the commonly-quoted 95th percentile (for a 90% "very likely" range).
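The difference is easy to see numerically. The sketch below uses a lognormal stand-in that I picked only because its mode and 95% interval land close to the figures quoted above; the shape and both parameters are my assumption, and nothing here says what the paper's posterior actually looks like.

```python
from math import exp, log
from statistics import NormalDist

# ASSUMPTION: a lognormal stand-in with median 3C and log-scale spread 0.25.
# These parameters are illustrative, not taken from the paper.
MU, SIGMA = log(3.0), 0.25


def quantile(p):
    """Quantile of the lognormal stand-in at cumulative probability p."""
    return exp(MU + SIGMA * NormalDist().inv_cdf(p))


mode = exp(MU - SIGMA ** 2)

# 95% interval: bounds at the 2.5% and 97.5% levels.
lo95, hi95 = quantile(0.025), quantile(0.975)

# 90% interval: bounds at the 5% and 95% levels.
lo90, hi90 = quantile(0.05), quantile(0.95)

print(f"mode          {mode:.1f}")
print(f"95% interval  {lo95:.1f} to {hi95:.1f}")
print(f"90% interval  {lo90:.1f} to {hi90:.1f}")
```

For this stand-in the 95th percentile comes out around 4.5C rather than 4.9C, so comparing a 95% interval directly against a 90% range makes the estimate look wider than it is.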

As I said previously, it will be interesting to see what approach the IPCC takes to this increasing number of "moderate" estimates appearing in the literature (bear in mind that a sensitivity of 3C still means a fair bit of climate change, but not as much as a sensitivity of 6C or 11C would...).

Thursday, August 05, 2010

"IPCC Experts" New Clothes

You may recall not so long ago I blogged about our paper in which we argued that the (standard outside climate science) paradigm of a statistically indistinguishable ensemble - where reality is just another sample from the distribution - is a much more natural and plausible interpretation of the AR4 multi-model ensemble, than the alternative "truth-centred" paradigm - where the models are assumed to be scattered around with reality lying exactly at the centre of their sampling distribution. The latter has no theoretical basis or practical support as far as I can tell, it appears to have been plucked out of thin air by a process of wishful thinking, and is strongly refuted by an analysis of the ensemble. But this post isn't really about that.

Immediately after that paper was published, the IPCC held a closed meeting which we were of course not permitted to attend. The purpose of the meeting was to generate a "best practice guidance paper" for the use of the multi-model ensemble. Jules predicted that our work would get misinterpreted somehow, but I thought our paper was fairly straightforward and hard to misunderstand. Well, I hadn't reckoned on the unique skills of the "IPCC Experts". Eventually this meeting report and summary appeared on their web site.

Regarding the interpretation of the multi-model ensemble, they say:

Alternatively, a method may assume:

b. that each of the members is considered to be ‘exchangeable’ with the other members and with the real system (e.g., Murphy et al., 2007; Perkins et al., 2007; Jackson et al., 2008; Annan and Hargreaves, 2010). In this case, observations are viewed as a single random draw from an imagined distribution of the space of all possible but equally credible climate models and all possible outcomes of Earth’s chaotic processes.

What? What is "the space of all possible but equally credible climate models" and what does this have to do with anything? Of the papers they cite, only ours actually mentions exchangeability and statistical indistinguishability, and what we wrote is that this means that "the truth is drawn from the same distribution as the ensemble members, and thus no statistical test can reliably distinguish one from the other". We also cited Toth et al 2003 (good book by famous NWP people) who wrote equivalently "the ensemble members and the verifying observation are mutually independent realizations of the same probability distribution".

Note that there is no reference to the "space of all possible models". All that matters is that the sampling distributions of models and truth are the same.

This may appear at first to be a rather pedantic and minor complaint. However, it doesn't take long to realise that the "space of all possible models" is a "colourless green idea", that is, a syntactically valid but completely meaningless phrase. This isn't just my assertion, it is agreed by all the previous authors who have used this terminology! (If you wish to disagree, feel free to explain in the comments what a "possible model" is, and how it can be distinguished from an impossible one. Good luck with that.)

In fact as far as we can tell this phrase has only ever been used to denigrate the use of the multi-model ensemble. The argument goes, that in order to understand how to use this ensemble, we have to first understand the "space of all possible models" from which they are sampled. This phrase is meaningless, therefore the use of the ensemble is theoretically ill-founded. Supporting quotes are appended below - quotes which many attendees of the meeting were well aware of, because they wrote them. Well, we don't mind people writing gibberish in their own papers, but we object strongly to them linking such nonsense to our work. Our analysis does not depend in any way on this meaningless concept, and to claim that it does (with the corollary that our analysis is philosophically ill-founded) is a flat-out lie.

In fact the multi-model ensemble can be very naturally interpreted as sampling our collective uncertainties about how best to represent the climate system. The question of reliability of the ensemble then simply amounts to asking whether these uncertainties are well-calibrated or not - which as we have shown, is an eminently testable hypothesis (at least in respect of current and historical data) and does not require anyone to "imagine" such bizarre and spurious constructions as the "space of all possible models".
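One standard way of testing this kind of calibration in ensemble forecasting is the rank histogram: count how many ensemble members fall below the observation, and see whether all ranks turn up equally often. The toy below is a sketch with synthetic Gaussian "models" and "observations", not the analysis from our paper, and the ensemble size and number of cases are arbitrary.

```python
import random


def rank_counts(truth_sampler, n_members=10, n_cases=5000, seed=0):
    """Histogram of the rank of the truth within a synthetic ensemble.

    truth_sampler(rng) returns the "observation" for one case; the ensemble
    members are always drawn from a standard normal (a toy assumption).
    """
    rng = random.Random(seed)
    counts = [0] * (n_members + 1)
    for _ in range(n_cases):
        members = [rng.gauss(0.0, 1.0) for _ in range(n_members)]
        truth = truth_sampler(rng)
        rank = sum(m < truth for m in members)  # members below the truth
        counts[rank] += 1
    return counts


# Statistically indistinguishable: truth is one more draw from the same
# distribution as the members.
exchangeable = rank_counts(lambda rng: rng.gauss(0.0, 1.0))

# Truth-centred: truth sits exactly at the centre of the sampling distribution.
truth_centred = rank_counts(lambda rng: 0.0)

print("indistinguishable:", exchangeable)
print("truth-centred:    ", truth_centred)
```

In the indistinguishable case the histogram is flat to within sampling noise. In the truth-centred case the observation almost never falls outside the ensemble, so the histogram is domed. Neither case needs any notion of a "space of all possible models", only the sampling distributions of members and truth.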

We complained to the authors of this piece of nonsense, and they replied with the remarkable claim that despite being listed as the authors, they were not in fact responsible for the accuracy of anything they wrote, as they were merely reporting "the definition as determined and agreed by the attendees", and would not countenance any correction of this mistake. Yes, they really used those words I have placed in quotes. Apparently it didn't occur to any of these "experts" present that this concept of statistical indistinguishability was an established term of art that already had a perfectly adequate definition, and that this existing definition is the only one that has ever been presented in the context of climate science. Their decision to reinvent the definition of statistical indistinguishability apparently has the full support of the IPCC hierarchy. I'm utterly gobsmacked that they place their duty to defend this "consensus" of a private clique above their duty to ensure that this "consensus" is honest, accurate, and useful to potential readers, let alone providing a fair representation of the work of those who are prohibited from participation in this process. It's as if the WG2 authors had simply proclaimed that 2035 was the date the experts had agreed that all Himalayan glaciers would vanish, and that was the end of the matter.

We have various manuscripts at different stages of writing and review, and can probably correct this mistake somehow (assuming that reviewers allow us to dissent from the newly-established "consensus"), but it's unlikely that what we write will ever have the circulation and influence that the IPCC bully pulpit affords. And of course, it is pretty hard to proof our work against spurious criticism when these "experts" are prepared to simply pluck arbitrary nonsense out of thin air. It's a shame that no-one there actually stood up and said "But these words have no meaning, how can they be used in a definition?"

Some references to the "space of all possible models", which make the nonsensical nature of this phrase clear, and how it has been used to argue against the use of the multi-model ensemble:

Allen et al 2002:

"the distribution of all possible models is undefined"

Collins 2007:

"Is the collection of the world’s climate models an adequate sample of the space of all possible models (and, indeed, is it even possible to define such a space)?"

Murphy et al 2007:

"Specifically, it is not clear how to define a space of possible model configurations of which the MME members are a sample. This creates the need to make substantial assumptions in order to obtain probabilistic predictions from their results"

Stainforth et al 2007:

"The lack of any ability to produce useful model weights, and to even define the space of possible models, rules out the possibility of producing meaningful PDFs for future climate based simply on combining the results from multi-model or perturbed physics ensembles; or emulators thereof."

Friday, January 15, 2010

Reliability of the IPCC AR4 (CMIP3) ensemble

So, our paper has now been accepted, and should be published in a week or two [update: here]. We think it poses a strong challenge to the "consensus" that has emerged in recent years.

If you are thinking this sounds like deja vu all over again, you'd be right. But the subject is a little different this time. Rather than estimates of climate sensitivity, this time we are talking about the interpretation of the "ensemble of opportunity" provided by the IPCC AR4 (formally CMIP3, but here I will use the popular name). Those who have been closely following this somewhat esoteric subject may have seen numerous assertions that the ensemble is likely biased, too narrow, doesn't cover an appropriate range of uncertainty etc etc. Thus, we should all be worried that there is a large probability that climate change may be even worse than the models imply.

Fortunately, it's all based on some analysis methods that are fundamentally flawed.

Although we'd been vaguely aware of this field for some time, the story really starts a couple of years ago at a workshop, when my attention was piqued by a slide which has subsequently been written up as part of a multi-author review paper and forms the motivation for Figure 1 in our paper. The slide presented an analysis of the multi-model ensemble, the main claim being that the multi-model ensemble mean did not converge sufficiently rapidly to the truth, as more models were added to it. Thus, the argument went, the models are not independent, their mean is biased, and we need to take some steps to correct for these problems when we try to interpret the ensemble.
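What "converging sufficiently rapidly" would mean depends entirely on what you assume about the ensemble. The toy simulation below is a sketch with synthetic Gaussian "models", not the analysis from the slide or from our paper. It compares the error of the ensemble mean when the truth is pinned at the centre of the model distribution with the error when the truth is just one more draw from that distribution.

```python
import random
from math import sqrt


def rmse_of_ensemble_mean(n_models, truth_centred, n_trials=2000, seed=1):
    """RMS error of the n-model mean in a toy unit-variance Gaussian setup."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # Either the truth is fixed at the centre, or it is another sample.
        truth = 0.0 if truth_centred else rng.gauss(0.0, 1.0)
        mean = sum(rng.gauss(0.0, 1.0) for _ in range(n_models)) / n_models
        total += (mean - truth) ** 2
    return sqrt(total / n_trials)


sizes = [1, 4, 16, 64]
centred = {n: rmse_of_ensemble_mean(n, truth_centred=True) for n in sizes}
exchangeable = {n: rmse_of_ensemble_mean(n, truth_centred=False) for n in sizes}

for n in sizes:
    print(f"n={n:3d}  truth-centred {centred[n]:.2f}  "
          f"truth as another sample {exchangeable[n]:.2f}")
```

With the truth at the centre, the error shrinks like 1/sqrt(n). With the truth as another sample, theory gives sqrt(1 + 1/n), so the error levels off near the ensemble spread however many models are added. A mean that stops converging is therefore not, on its own, evidence of bias or of dependence between the models.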

The basic paradigm under which much of the ensemble analysis work in recent years has operated is based on the following superficially appealing logic: (1) all model builders are trying to simulate reality, (2) a priori, we don't know if their errors are positive or negative (with respect to any observables), (3) if we assume that the modellers are "independent", then the models should be scattered around in space with the truth lying at the ensemble mean. Like so:

where the truth is the red star and the models are the green dots.

However, this paradigm is completely implausible for a number of reasons. First, since we don't know the truth (in the widest sense) we have no possible way of generating models that scatter evenly about it. Second, this paradigm leads to absurd conclusions like a 90% "very likely" confidence interval for climate sensitivity of 2.7C - 3.4C, based on the sensitivities reported by the AR4 models (this comes from a simple combinatorial argument based on the number of models you expect to be higher and lower than the truth, if they lie independently and equiprobably on either side). Third, it implies that all we would need to do to get essentially perfect predictions is to build enough models and take the average, without any new theoretical insights or observations regarding the climate system.

Lastly, it is robustly refuted by simple analyses of the ensemble itself, as observations (of anything) are routinely found to lie some way from the ensemble mean, as has been demonstrated in several papers including the multi-author review paper mentioned above.
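For anyone who wants to see how the combinatorial argument behind that absurdly narrow interval works, here is a minimal sketch. It is my reading of the argument rather than the calculation from the paper: if each of n models independently lies above or below the truth with probability 1/2, then the number of models below the truth is Binomial(n, 1/2), and the truth is bracketed by a pair of sorted ensemble members with a known probability. The function names and the ensemble size of 20 are assumptions for illustration only.

```python
from math import comb


def coverage(n, lo):
    """P(lo <= K <= n - lo) for K ~ Binomial(n, 1/2).

    K is the number of models below the truth. The truth lies between the
    lo-th smallest and lo-th largest member exactly when K is in this range.
    """
    return sum(comb(n, k) for k in range(lo, n - lo + 1)) / 2 ** n


def median_interval_ranks(n, level=0.9):
    """Narrowest symmetric pair of ranks (1-indexed) whose coverage >= level."""
    best = None
    for lo in range(1, n // 2 + 1):
        if coverage(n, lo) >= level:
            best = lo  # still wide enough, try a narrower interval
        else:
            break  # coverage only shrinks as lo grows
    if best is None:
        raise ValueError("ensemble too small for the requested level")
    return best, n + 1 - best


# Hypothetical ensemble size, chosen only for illustration.
lo_rank, hi_rank = median_interval_ranks(20, 0.9)
print(lo_rank, hi_rank, coverage(20, lo_rank))
```

The point to notice is that the interval is bounded by members well inside the ensemble range, and it keeps shrinking towards the ensemble median as more models are added, which is exactly why the truth-centred reading gives such implausibly tight answers.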

So you might think this paradigm should have been still-born and never caught on. However, people have persevered with it over a number of years, trying to fix it with various additional "bias" terms or ensemble inflation methods, and generally worrying that the ensemble isn't as good as they had hoped.

So along we came to have a look. Actually, although this issue had been sitting uneasily at the back of my mind for some time, we were finally prompted into looking into it properly earlier this year when Jules was asked to write something else concerning model evaluation.

It didn't take long to work out what was going on. As explained above, the truth-centred paradigm is theoretically implausible and observationally refuted. However, there is a much more widely-used (indeed all-but ubiquitous) way of interpreting ensembles, in which the ensemble members are assumed to be exchangeable with the truth, or statistically indistinguishable from it. So in contrast to the picture above, we might expect to see something like this:

Here the red isolines describe the distribution defined by the models. Note that the truth (red star) is not at the ensemble mean, but just some "typical" place in the ensemble range.

In contrast to the truth-centred paradigm, it is easy to understand how such an ensemble might arise - all we need to do is make a range of decisions when building models, that reflect our honestly-held (but uncertain) beliefs about how the climate system operates. So long as our uncertainty is commensurate with the actual errors of our models, there is no particular need to assume that our beliefs are unbiased in their mean, and indeed they will not be.

I can't emphasise too strongly that this is the basic paradigm under which pretty well all ensemble methods have always operated, apart from one small little corner of climate science. It underpins the standard probabilistic interpretation, that if a proportion p% of the ensemble has property X, we say the probability of X is p%. A corollary is that if we apply this interpretation to the climate sensitivity estimates, we find a "very likely" confidence interval of 2.1C - 4.4C. Now I'm sure some would argue that this interval is too narrow, but I would say it is pretty reasonable, though this is somewhat fortuitous as with such a small sample the endpoints are determined entirely by the outliers. The implied 70% confidence interval of 2.3C - 4.3C is more robust, and would be hard to criticise. What is certainly clear is that these ranges are not completely horrible in the way that the one provided by the truth-centred interpretation was.
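The "proportion of the ensemble" interpretation is simple enough to write down in a few lines. This is only a sketch: the function names are mine, and the numbers are made-up values, not the AR4 model sensitivities.

```python
def ensemble_probability(members, has_property):
    """Fraction of ensemble members for which has_property(member) is true."""
    return sum(1 for m in members if has_property(m)) / len(members)


def central_range(members, drop_each_side):
    """Range left after discarding that many of the lowest and highest members."""
    ordered = sorted(members)
    return ordered[drop_each_side], ordered[-1 - drop_each_side]


# Hypothetical values for illustration only; NOT the AR4 sensitivities.
toy = [2.0, 2.4, 2.7, 2.9, 3.0, 3.2, 3.3, 3.6, 4.0, 4.5]

print(ensemble_probability(toy, lambda s: s > 3.0))
print(central_range(toy, 0))  # full range: set entirely by the two outliers
print(central_range(toy, 1))  # drop one member each side: less outlier-driven
```

With a sample this small, the widest interval is just the two most extreme members, which is why the narrower interval is the more robust one.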

With this statistically interchangeable paradigm being central to all sorts of ensemble methods, notably including numerical weather prediction, it is no surprise that there is a veritable cornucopia of analysis tools already available to investigate and validate such ensembles. The most basic property that most people are interested in is "reliability", which means that an event occurs on p% of the occasions that it has been predicted to occur with probability p%. This is the meaning of "reliability" used in the subject line of this post and title of our paper. A standard test of reliability is that the rank histogram of the observations in the ensemble is uniform. So this is what we tested, using basically the same observations that others had used to show that the ensemble was inadequate.
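For readers who haven't met rank histograms before, the idea can be sketched with synthetic data. This is a toy illustration under assumed Gaussian distributions, not the analysis from the paper: the function names, the ensemble size and the spreads are all my own choices. For each case, count how many ensemble members fall below the observation, then tally those ranks over many cases.

```python
import random


def rank_of_observation(obs, members):
    """Number of ensemble members below the observation (0..len(members))."""
    return sum(m < obs for m in members)


def rank_histogram(cases, n_members):
    """Tally the observation's rank over many (obs, members) cases."""
    counts = [0] * (n_members + 1)
    for obs, members in cases:
        counts[rank_of_observation(obs, members)] += 1
    return counts


def synthetic_cases(n_cases, n_members, spread, rng):
    """Toy data: truth ~ N(0, 1), members ~ N(0, spread) (assumed, not real)."""
    return [
        (rng.gauss(0, 1), [rng.gauss(0, spread) for _ in range(n_members)])
        for _ in range(n_cases)
    ]


rng = random.Random(0)
for label, spread in [("reliable", 1.0), ("too broad", 2.0), ("too narrow", 0.5)]:
    cases = synthetic_cases(2000, 9, spread, rng)
    print(label, rank_histogram(cases, 9))
```

When the members are drawn from the same distribution as the truth, the histogram comes out roughly flat. An ensemble that is too broad gives a domed histogram, and one that is too narrow gives a U shape, with the observation falling outside the ensemble far too often.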

And what we found is....

...the rank histograms (of surface temperature, precipitation and sea level pressure from top to bottom) aren't quite uniform, but they are pretty good. The non-uniformity is statistically significant (click on the pic for bigger, and the numbers are explained in the paper), but the magnitude of the errors in mean and bias are actually rather small. What's more, the ensemble spread is if anything too broad (as indicated by the domed histograms), rather than too narrow as has been frequently argued.

So our conclusion is that all this worry about the spread of the ensemble being too small is actually a mirage caused by a misinterpretation of how ensembles normally behave. Of course, we haven't actually shown that the future predictions are good, merely that the available evidence gives us no particular cause for concern. Quite the converse, in fact - the models sample a wide range of physical behaviours and the truth is, as far as we can tell, towards the centre of their spread. This supports the simple "one member one vote" analysis as a pretty reasonable starting point, but also allows for further developments such as skill-based weighting.

This paper seems particularly timely with the IPCC having an "Expert Meeting on Assessing and Combining Multi-Model Climate Projections" in a couple of weeks. In fact it was partly hearing about that meeting that prompted us to finish off the paper quickly last November, although we had, as I mentioned, been thinking about it for some time before then. I should give due praise to GRL, since I've grumbled about them in the past. This time, the paper raced through the system taking about 3 weeks from submission to acceptance - it might have been even quicker but the GRL web-site was borked for part of that. It is nice when things happen according to theory :-) Not forgetting the helpful part played by the reviewers too, who made some minor suggestions and were very enthusiastic overall.

Unfortunately, hoi polloi like Jules and myself are not allowed to appear in such rarefied company as the IPCC Expert Meeting - I did ask, with the backing of the Japanese Support Unit for the IPCC, but was refused. So we will just have to wait with bated breath to see what, if anything, the "IPCC Experts" make of it. While the list of invitees is very worthy, it is disappointing to see that so many of them are members of the same old cliques, with no fewer than 4 participants from the Hadley Centre, and three each from NCAR, CSIRO and PCMDI, and vast numbers of multiply co-authored papers linking many of the attendees together. Those 4 institutes alone provide almost a quarter of the scientists invited. Coincidentally (or not), staff from these institutes also filled 5 of the 7 places on the organising committee... Shame they couldn't find space for even one person from Japan's premier climate science institute.

Sunday, December 13, 2009

Statement from the UK science community

I'm a little surprised to have not seen more mention of this in either the mainstream media or even on blogs:

"We, members of the UK science community, have the utmost confidence in the observational evidence for global warming and the scientific basis for concluding that it is due primarily to human activities. The evidence and the science are deep and extensive. They come from decades of painstaking and meticulous research, by many thousands of scientists across the world who adhere to the highest levels of professional integrity. That research has been subject to peer review and publication, providing traceability of the evidence and support for the scientific method.

The science of climate change draws on fundamental research from an increasing number of disciplines, many of which are represented here. As professional scientists, from students to senior professors, we uphold the findings of the IPCC Fourth Assessment Report, which concludes that ‘Warming of the climate system is unequivocal’ and that ‘Most of the observed increase in global average temperatures since the mid-20th century is very likely due to the observed increase in anthropogenic greenhouse gas concentrations’."

What is perhaps most impressive about this is that the signatures were collected in under a week, and the 1700+ signatories (from UK institutes alone) hugely outnumber the total authorship of the IPCC WG1 report of 619 people (even that figure is dominated by the "contributing authors" such as myself who had no direct input into the writing process).

Saturday, June 16, 2007

IPCC enters the 21st century

According to the latest missive from Martin Manning, the IPCC TSUs have agreed that in future comments and responses will be available in pdf format (see here and here for previous). I'm not 100% sure from his wording if this strictly applies only to future reports (ie starting with the next assessment), or includes the AR4. But in any case, he's specifically stated that I'll get pdfs (for the chapters I asked about) personally in a few days at latest. Which, needless to say, I am very pleased to hear.

Update Tuesday 19th

On Monday, the comments arrived...in dead tree form. Assuming a minor cock-up rather than obstruction, I sat on my hands, and overnight the pdfs appeared in my inbox (in fact I also got the comments on the same chapters of the 1st draft which I don't remember specifically asking for). I'd like to publicly thank the IPCC Secretariat for responding in a sensible way and in a reasonable time frame.

Friday, June 08, 2007

Comments coming

According to Martin Manning, the comments are on their way (well, they will be sent shortly). This is apparently a special short-term offer to "expert reviewers" only. The long-term "open archive" will be at Harvard as previously described.

No doubt they will arrive in sequestration-ready dead tree form, which is hardly convenient or sensible, but there's no point banging that drum any more given that I'm getting the information that I asked for. The IPCC secretariat are obviously desperate to avoid widespread dissemination of the comments: they have even explicitly asserted that these copies are "not for redistribution to others".

Some links