This week it was a workshop in Bristol organised by the NERC scoping study on risk and uncertainty in natural hazards (SAPPUR) led by Jonty Rougier of the BRisk Centre at Bristol University. The study is due to report at the end of November, with a summary of the state of the art in different areas of natural hazards and suggestions for a programme of research and training to be funded by NERC. This will have relevance for all three focus areas in CCN, including the specification of training needs.
It will not be surprising that many of the issues overlap with those that arose in the sessions at Hyderabad (see last entry). The discussions touched on the definition of risk, the assessment of model adequacy, the quantification of hazard and risk, and techniques for the visualisation and communication of uncertainties. There were interesting presentations from David Spiegelhalter on methods used in the medical sciences and from Roger Cooke on methods used in the elicitation of expert opinions.
John Rees, the NERC Theme Leader for Natural Hazards, raised the following questions that he felt were important for this scoping study to address:
- If model uncertainty is needed to better inform policy decisions, how is it best quantified?
- How should alternative conceptual models and contradictory evidence be used in policy and decision making?
- Is the mean value the appropriate safety metric to inform decisions?
- What is the best way to represent scientific consensus?
- What are useful mechanisms for integrating risk and uncertainty science into policy development?
- What should be addressed by the research councils (there is a provisional budget of £1.5m available to support the research programme)?
There was a general recognition amongst the participants, who covered a range of different natural hazards, that the proper evaluation of hazard and risk is often difficult: we often have only sparse data, or none at all, with which to try and quantify sources of uncertainty, and there may be many alternative predictive models of varying degrees of approximation. These are the epistemic uncertainties, but there was not much discussion about how they might be reflected in the quantification of risk. Many participants seemed to accept that the only way to attempt such a quantification was to use statistical methods. I am not so sure.
It is true that any assessment of uncertainty will be conditional on the implicit or explicit assumptions made in the assessment (which might include treating all sources of uncertainty as if they were statistical in nature). It is also true that those assumptions should be checked for validity in any study (though this is not always evident in publications). But if the fact that the uncertainties are epistemic means that the errors are likely to have strong structure and non-stationarity that will depend on a particular model implementation, then it is possible that alternative non-statistical methods of uncertainty estimation might be more appropriate.
I have been trying to think about this in the context of testing models as hypotheses given limited uncertain data (something that frequently arises in the focus areas of CCN). Hypothesis testing means considering both Type I and Type II errors (accepting a model that should be rejected, a false positive, and rejecting a model that should be accepted, a false negative). An important area for CCN is how to avoid both types of error in model hypothesis testing, so that in prediction we are more likely to be getting the right results for the right reasons. So an interesting question is what constitutes an adequate hypothesis test, adequate in the sense of being fit for purpose. This question was addressed, at least indirectly, by Britt Hill of the US Nuclear Regulatory Commission in a talk about the performance assessment process for the safety case for the Yucca Mountain repository site.
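As a toy illustration of that trade-off (not anything presented at the workshop, and not a CCN procedure), the sketch below applies a simple limits-of-acceptability style check: a model run is accepted as a hypothesis only if its predictions fall within assumed observational error bounds at most evaluation points. The data, the error bound and the tolerated fraction of failures are all hypothetical choices; loosening them makes it easier to accept a poor model, while tightening them makes it easier to reject a useful one.

```python
# Illustrative sketch only: testing a model as a hypothesis against limited,
# uncertain observations using a simple limits-of-acceptability check.
# All data, bounds and thresholds below are hypothetical.
import numpy as np

def model_is_acceptable(simulated, observed, obs_error, max_fail_fraction=0.1):
    """Accept the model hypothesis only if its predictions fall within the
    observational error bounds at all but a small fraction of points."""
    outside = (simulated < observed - obs_error) | (simulated > observed + obs_error)
    return outside.mean() <= max_fail_fraction

# Hypothetical observations (signal plus noise) and one candidate model run.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 200)
observed = np.sin(t) + rng.normal(0.0, 0.1, t.size)
simulated = 0.95 * np.sin(t)

# Tightening obs_error or max_fail_fraction makes rejection of a useful model
# (a false negative) more likely; loosening them makes acceptance of a poor
# model (a false positive) more likely.
print(model_is_acceptable(simulated, observed, obs_error=0.3))
```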
In that study, Monte Carlo simulation was used to explore a wide range of potential outcomes (in terms of the future dose of radioactivity to a local population over a period of the next 1 million years or so). A cascade of model components, from infiltration through to waste leaching, was involved in these calculations, each component depending on multiple (uncertain) inputs and parameters. The Monte Carlo experiments spanned a range of alternative conceptual models and possible model parameters. Decisions about which models to run appeared to have been reached by scientific consensus, something that Roger Cooke had earlier suggested was not necessarily the best way of extracting information from experts.
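To make the structure of such an experiment concrete, here is a minimal, purely hypothetical sketch of a Monte Carlo cascade in which alternative conceptual models and uncertain parameters are sampled at each step and the distribution of outcomes is summarised. The component functions and parameter ranges are invented stand-ins, not the models or values used in the Yucca Mountain performance assessment.

```python
# Purely hypothetical sketch of a Monte Carlo cascade over alternative
# conceptual models and uncertain parameters; the component functions and
# ranges are invented, not the Yucca Mountain performance assessment models.
import random

def infiltration_a(rain, k):          # conceptual model A: simple deficit
    return max(rain - k, 0.0)

def infiltration_b(rain, k):          # conceptual model B: threshold response
    return 0.5 * rain if rain > k else 0.0

def waste_leaching(flux, leach_rate):
    return flux * leach_rate

def dose(leached, transfer_coeff):
    return leached * transfer_coeff

random.seed(1)
outcomes = []
for _ in range(10_000):
    infiltration = random.choice([infiltration_a, infiltration_b])  # model uncertainty
    rain = random.uniform(0.0, 10.0)            # uncertain forcing
    k = random.uniform(0.5, 3.0)                # uncertain parameter
    leach_rate = random.uniform(0.001, 0.01)    # uncertain parameter
    transfer_coeff = random.uniform(0.1, 0.5)   # uncertain parameter
    outcomes.append(dose(waste_leaching(infiltration(rain, k), leach_rate), transfer_coeff))

outcomes.sort()
print("median dose:", outcomes[len(outcomes) // 2])
print("95th percentile dose:", outcomes[int(0.95 * len(outcomes))])
```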
There is no explicit hypothesis testing in this type of approach, only some qualitative assessment of whether performance is “reasonably supported” in terms of predictions in past studies, history matching, scientific credibility and so on. But it is sometimes the case that, for whatever reasons, even the best models do not provide acceptable predictions for all times and all places. This could be because of errors in the forcing data, because of model structural error, or because of errors in the data with which a model is evaluated. It remains effectively impossible to separate out these different sources of error, which means that it is difficult to do rigorous hypothesis testing for this type of environmental model.
This seems to be an area where further research is needed. It is surely important in developing guidance for model applications within each of the three CCN focus areas…