How to treat expert judgment? With certainty it contains uncertainty!

https://doi.org/10.1016/j.jlp.2020.104200

Highlights

  • In quantitative risk assessments reliable event/failure probability data are often lacking.

  • Expert judgement may be of help, even to supplement or evaluate historical data.

  • Expert opinions may differ or conflict, requiring rational/objective aggregation methods.

  • Multiple methods to cope with this have been developed but are rarely used in QRA.

  • This paper explains the foundations of the most-used methods and provides simple examples.

Abstract

To be acceptably safe one must identify the risks one is exposed to and decide what risk-reducing measures are required. Not only is it uncertain whether a threat will actually materialize, but determining the size and probability of the risk is itself full of uncertainty. When performing an analysis and preparing for decision making under uncertainty, failure rate data, information on consequence severity, a probability value, or even knowledge of whether an event can occur at all is quite frequently lacking. In those cases, a possible way, and sometimes the only way, to proceed is to revert to expert judgment. Even when historical data are available, an expert can be asked whether and to what extent such data still hold in the current situation.

In any case, expert elicitation comes with an uncertainty that depends on the expert's reliability, which becomes very visible when two or more experts give different or even conflicting answers. This is not a new problem, and very bright minds have thought about how to tackle it in a rational and objective way. So far, however, the topic has not been given much attention in daily process safety and risk assessment practice. This paper therefore has a review and applied character and will present various approaches with detailed explanations and examples.

Introduction

It seems so easy: once, in a risk analysis, the complexity of the causal structure has been resolved and, e.g., a bowtie has been drawn up, the next and final step is obtaining data or other information. For example, the probability of failure on demand of a critical component, e.g., a pressure relief valve, must be filled in. Other questions may also have to be answered, such as the likelihood of an explosion or a fatality if a safety device fails. In general, risk assessment, whether in the design stage or during process operations to prepare a decision on whether safety is adequate, suffers much from inaccuracy (see e.g., Granger Morgan and Henrion, 1990, or Pasman et al., 2009, 2017, and other articles in Safety Science, 2017, Vol. 99 on Risk Analysis Validation and Trust in Risk Management). Part of this is due to uncertainty in the inputs to models. For example, despite all measures, one may ask what the chance is of a reactor runaway or a fire in the next ten years, what the probability is that this equipment will run another year, or, given the configuration, what the chance is that the operator will fail to restore stable conditions when a particular upset event happens. Manufacturer data are known to be often too optimistic, while the well-founded OREDA (2015) equipment failure database may provide an answer that is not valid under the conditions in which a certain valve is applied in an actual process. In the best case, some historical data on similar valves are available in the plant, but those valves are of a different brand, creating yet another uncertainty. For human failure, data are very uncertain, and many factors may play a role. In all such cases, as a last resort, a few people familiar with the installation and employed at the plant for a considerable number of years may be asked to provide an estimate. This reverting to expert judgment can take different forms.
Easiest for the expert may be to produce a linguistic grading term for, e.g., a likelihood value using a 5-point Likert scale: ‘very high’, ‘high’, ‘medium’, ‘low’, ‘very low’, or just a single figure; but knowing there will be uncertainty, the interviewer may invite the experts to specify an interval with both a lower and an upper bound. Even more sophisticated is to ask for an estimate of a mean value and a confidence or credibility interval. Experts may also be asked to give an opinion on consequence events, damage to assets or other types of losses, dependencies in models, and the chance that an event can be prevented by installing certain equipment. The same types of question arise when considering cyber-attacks. For all those reasons, expert judgment as a possible input is also anchored in the first editions of the risk management standards ISO 31000:2009 and IEC/ISO 31010:2009. In view of the effort the expert elicitation process takes, it should be verified beforehand by sensitivity analysis whether the accuracy of the input sufficiently influences the final result of the assessment.
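As an illustration of the first elicitation form, a linguistic grade can be mapped to a probability interval. The sketch below is hypothetical: the interval boundaries are assumptions chosen for illustration, not values prescribed by any standard or by the paper.

```python
# Sketch: mapping linguistic likelihood grades to probability intervals.
# The interval values below are illustrative assumptions, not a standard.
LIKERT_INTERVALS = {
    "very low":  (0.00, 0.05),
    "low":       (0.05, 0.20),
    "medium":    (0.20, 0.50),
    "high":      (0.50, 0.80),
    "very high": (0.80, 1.00),
}

def grade_to_interval(grade: str) -> tuple[float, float]:
    """Return the (lower, upper) probability bound for a linguistic grade."""
    return LIKERT_INTERVALS[grade.lower()]

lo, hi = grade_to_interval("medium")
print(f"'medium' -> probability in [{lo}, {hi}]")
```

Such a mapping makes the uncertainty in a linguistic answer explicit: the grade becomes a bounded interval rather than a single figure.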

In the 1950s the RAND Corporation in Santa Monica, CA, developed the Delphi method, and recently an on-line version was developed, of which an application example is presented by Armstrong et al. (2019). The Delphi method has been in use for many years to solve all kinds of questions and is described with examples in extenso by Linstone and Turoff (1975). The technique is basically an iteration process toward consensus among a group of experts, based on survey questions about the topic developed by a small monitor team. Although in many cases participants have been satisfied with the answers to the questions posed, the weaknesses of the technique are rather numerous. Experience can be fallible, participants can influence each other, and cognitive biases are numerous (Montibeller and Von Winterfeldt, 2015). Although the method is largely used for qualitative forecasting, it may also be used to predict a parameter value.

In a wider context, questioning experts serves to support optimal decision making. Many methods have been developed for that purpose, for example Saaty's (1990, 2006) Analytic Hierarchy Process (AHP), developed in the 1970s, in which participants must make pairwise selections between alternatives on the basis of a number of criteria. More sophisticated is multi-attribute utility theory, which supports decisions based on value judgments of multiple, competing objectives. However, this paper will treat only the use of expert judgment in quantitative risk assessment, and more specifically in estimating parameter values and event frequencies. Baybutt (2017) acknowledges the use of engineering judgment for the purpose of hazard and risk analysis but describes 28 different relevant cognitive biases that analysis team members can suffer from, and he gives advice on how to attenuate their effects. It will be clear that expert opinion is shrouded in uncertainty. Because the aim is to make a reliable risk prediction, the concept of uncertainty must first be considered, and methods then selected that enable estimates of the uncertainties.

Uncertainty is usually categorized by two types, although distinction is not always acceptably clear, and a variable can contain both types of uncertainty at the same time:

  • Aleatory uncertainty, by which an outcome cannot be established accurately due to inherent random variability, e.g., from undefined conditions or, in general, from lack of accuracy/precision of the observational means, and,

  • Epistemic uncertainty, which is a consequence of lack of knowledge about the subject due to the amount and quality of the data. The greater the amount and quality of the data the lower the epistemic uncertainty.

In particular, the epistemic uncertainty will be addressed here.

Already in the 1900s, statisticians engaged in intense deliberations on how to deal with uncertainty. Oldest is the so-called frequentist idea of probability, in which a repeatable experiment is performed with an outcome that can take different forms or values. A classic case is an urn with red and white balls, where drawing random samples serves to predict the fraction of red balls without counting them all. The approach developed into a collection of mathematical distribution functions to which observation results could be fitted, with confidence limits depending on the variability of the draws and the size of the sample. It also led to methods with which it can be determined, based on observations, at what significance level a hypothesis is true or not. In the second half of the last century, though, the Bayesian approach gained strength, in which all previous information/knowledge is cast into a prior distribution, while new evidence is represented by a likelihood function. The result is an update of the prior to a posterior distribution based on the normalized co-occurrence of the prior and likelihood distributions. The Bayesian model provides many more possibilities for solving problems than frequentist statistics and has become the leading approach for evidence-based testing of a hypothesis or an event.
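The Bayesian prior-to-posterior update described above can be sketched with the standard conjugate Beta-Binomial pairing, often used for failure-on-demand probabilities. The prior parameters and observed counts below are illustrative assumptions, not data from the paper.

```python
def beta_update(a: float, b: float, failures: int, demands: int):
    """Update a Beta(a, b) prior with 'failures' observed in 'demands' trials.

    For a Binomial likelihood the Beta prior is conjugate, so the posterior
    is again a Beta with parameters (a + failures, b + demands - failures).
    Returns the posterior parameters and the posterior mean."""
    a_post = a + failures
    b_post = b + (demands - failures)
    mean = a_post / (a_post + b_post)
    return a_post, b_post, mean

# Weakly informative prior Beta(1, 19) (prior mean 0.05), then new evidence:
# 2 failures on demand observed in 100 proof tests.
a_post, b_post, mean = beta_update(1.0, 19.0, failures=2, demands=100)
print(f"posterior Beta({a_post}, {b_post}), mean = {mean:.4f}")
```

Note how the evidence pulls the posterior mean (0.025) below the prior mean (0.05) while the larger parameter sum narrows the distribution, i.e., lowers the uncertainty.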

By the end of the 1980s, Klir (1989), in the Cambridge Debate on Uncertainty, wrote a clear synthesizing paper against the claim that probability, as traditionally defined, is the only concept able to describe uncertainty, with the counterclaim that one must go beyond probability alone. Klir starts off by distinguishing two types of uncertainty:

  • Vagueness, encompassing: fuzziness, haziness, cloudiness, unclearness, indistinctiveness, sharplessness, and indefiniteness, and,

  • Ambiguity, comprising: non-specificity, variety, generality, diversity, divergence, equivocation, incongruity, discrepancy, dissonance, and disagreement.

He continues by mentioning that imprecision can relate to both vagueness and ambiguity, while within the latter non-specificity and disagreement are again different. After analyzing the matter in much detail, Klir (1989) concludes that probability conceptualizes “uncertainty strictly in terms of conflict among degrees of belief allocated to mutually exclusive alternatives”; in other words, a probability P that an event will occur (or that a quantity has a certain value) intrinsically holds the contrast that the probability that the event will not occur (or that the quantity will be different) is the complement 1 − P.

In many situations, though, when asked in human dialogue to make an estimate or a prediction of the probability of an event occurring, a person produces an answer based on intuition supported by experience. However, when subsequently asked how probable it is that the event will not occur, the two answers may not add up to unity. Such probability estimates are called subjective or imprecise.

In fact, in the 1970s the Dempster-Shafer theory of evidence addressed this kind of human imprecision. The theory was initiated by Dempster (1967) and further developed by Shafer (1976), introducing the concepts of belief and plausibility, to be explained in more detail later.

Even earlier, in the 1960s, Zadeh (1965, 1975) developed fuzzy sets and fuzzy logic, in which an interviewer obtains an unsure, fuzzy answer represented by a membership function. This function has value 1 at the most likely estimate and 0 at the extremes beyond which the estimated value is believed to be impossible, while in between the function can have any shape.
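A membership function of the simplest shape, a triangle, can be sketched as follows; the failure-frequency bounds used in the example are illustrative assumptions.

```python
def triangular_membership(x: float, low: float, peak: float, high: float) -> float:
    """Membership value of x in a triangular fuzzy number (low, peak, high):
    1 at the most likely estimate, 0 at and beyond the extreme bounds,
    linear in between."""
    if x <= low or x >= high:
        return 0.0
    if x <= peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)

# An expert believes a failure frequency lies between 1e-4 and 1e-2 per year,
# most likely 1e-3 (illustrative numbers, handled on a log10 scale):
for exponent in (-4.0, -3.0, -2.5, -2.0):
    mu = triangular_membership(exponent, -4.0, -3.0, -2.0)
    print(f"10^{exponent}: membership {mu:.2f}")
```

The triangle is only the simplest choice; as noted above, any shape between the bounds is admissible.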

Imprecision fits well with the Bayesian approach, which works with information of all uncertainty levels and propagates the uncertainties from the prior and likelihood to the posterior. With new evidence in the likelihood, the prior is updated to a posterior result based on more information and therefore with lower uncertainty than the prior.

More recently, Helton and Johnson (2011) summarized the alternative representations of epistemic uncertainty, in order of increasing structure and quantification, as follows:

  • Interval analysis, providing just a low and a high boundary with no information in between (a uniform distribution, all values considered equally likely);

  • Possibility theory, consisting of a set of possible elements, to each of which a likelihood value can be attached, together forming a possibility distribution; this is related to the fuzzy set approach;

  • Evidence theory (Dempster-Shafer), which specifies a limited number of focal elements, each given a measure of credibility (basic assignments or basic belief assignments summing to 1, confusingly also called basic probability assignments);

  • Probability theory, involving element probabilities in a fully developed structure embodied by a probability density function.

At the probability end of the spectrum, Cooke (1991) developed in the 1980s so-called structured expert judgment, asking experts to specify a mean and confidence limits. As we shall see in Section 4, processing of the data is rather intricate.

Whatever the method of interrogating experts and interpreting their answers, disagreement among experts will be common. So, in all approaches a method of aggregation is required to deal with independently obtained, differing replies.
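The simplest such aggregation scheme is a weighted linear opinion pool, a standard technique sketched below; the estimates and weights are illustrative assumptions (in practice weights might come from calibration or peer rating).

```python
def linear_opinion_pool(estimates, weights):
    """Weighted linear opinion pool: combine expert point estimates as a
    weighted average, with the weights normalized to sum to 1."""
    total = sum(weights)
    return sum(w / total * e for e, w in zip(estimates, weights))

# Three experts estimate a failure probability; the second expert is rated
# twice as reliable as the others (illustrative figures):
pooled = linear_opinion_pool([1e-3, 5e-3, 2e-3], [1.0, 2.0, 1.0])
print(f"pooled estimate: {pooled:.2e}")
```

The same pooling operation can be applied quantile by quantile when experts give intervals instead of single figures.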

In the remainder of the paper, we shall restrict ourselves to the more practical aspects of expert estimation. The objective of the paper is for the general risk assessor to become easily familiar with the methods, so some parts are explained in more detail than a specialist would need. In Section 2 we shall describe the Dempster-Shafer approach, and in Section 3 the fuzzy set and logic approach, both with some examples. In Section 4 Cooke's method will be outlined, and in Section 5 similarities and differences, also in required effort, will be noted, followed by a few brief summaries of newer additional methods; Section 6 presents the conclusions.

Section snippets

Dempster Shafer Theory (DST) of evidence

In various publications, Shafer (1976, 1990) explains the original idea of belief functions and evidential reasoning when a human makes a statement about an event, fact, or value. It encompasses belief, doubt, plausibility, disbelief, and ignorance, which are all associated with uncertainty. For example, if a person (here called an expert) asserts that a certain event took place or is going to take place, it does not mean that there is no space to believe it did not occur or is not going to
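The machinery of belief and plausibility can be illustrated with Dempster's rule of combination, the standard rule of DST for merging two experts' basic belief assignments. The masses below are illustrative assumptions, with the mass placed on the whole frame expressing each expert's ignorance.

```python
from itertools import product

def dempster_combine(m1: dict, m2: dict) -> dict:
    """Combine two basic belief assignments (masses keyed by frozensets of
    hypotheses) with Dempster's rule: multiply masses of intersecting focal
    elements and renormalize out the conflict mass."""
    combined = {}
    conflict = 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    k = 1.0 - conflict
    return {s: v / k for s, v in combined.items()}

FAIL, OK = frozenset({"fail"}), frozenset({"ok"})
THETA = FAIL | OK  # the whole frame: total ignorance

# Two experts on whether a valve will fail on demand (illustrative masses):
m1 = {FAIL: 0.6, OK: 0.1, THETA: 0.3}
m2 = {FAIL: 0.5, OK: 0.2, THETA: 0.3}
m12 = dempster_combine(m1, m2)
belief = m12[FAIL]              # Bel(fail): mass fully committed to 'fail'
plaus = m12[FAIL] + m12[THETA]  # Pl(fail): mass not contradicting 'fail'
print(f"Bel(fail) = {belief:.3f}, Pl(fail) = {plaus:.3f}")
```

The gap between belief and plausibility is the remaining ignorance; the renormalization by the conflict mass is also the step that has drawn criticism of the rule for highly conflicting opinions.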

Type-1 fuzzy sets

Zadeh's (1965) fuzzy sets and logic for dealing with uncertainty have become quite well known. In the mid-1990s Klir and Yuan (1995) showed their applicability. In the late 1990s and in this century, among others, Mendel (2017) broadened and deepened the concept. Application is relatively straightforward and widespread, and it is used in risk assessment to estimate parameter values, also as index values, or even to express linguistic grades of, e.g., consequence severity and event frequency
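One common (but not the only) way to aggregate triangular fuzzy estimates from several experts is to average the triangle parameters and then defuzzify with the centroid; a minimal sketch under those assumptions, with illustrative numbers:

```python
def aggregate_triangular(opinions):
    """Component-wise average of triangular fuzzy numbers (low, peak, high),
    one simple aggregation choice among several in the literature."""
    n = len(opinions)
    return tuple(sum(o[i] for o in opinions) / n for i in range(3))

def centroid(tri):
    """Centroid (center of gravity) defuzzification of a triangular fuzzy
    number: the mean of its three defining points."""
    return sum(tri) / 3.0

# Three experts' triangular estimates of a probability (illustrative):
experts = [(0.01, 0.05, 0.10), (0.02, 0.04, 0.08), (0.03, 0.06, 0.12)]
agg = aggregate_triangular(experts)
print(f"aggregated triangle: {agg}, crisp value: {centroid(agg):.4f}")
```

The defuzzified crisp value is what would typically be fed back into a quantitative risk model.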

Probabilistic approach of expert estimation

Expert elicitation applying a probabilistic approach has a long history. It started as early as the 1970s with attempts to solve problems in nuclear risk assessment (Rasmussen, 1975). Perhaps the latest application is in the field of climate change (Oppenheimer et al., 2016), claiming that structured expert judgement is applied “in order to facilitate characterization of uncertainty in a reproducible, consistent and transparent fashion.” That is, “experts quantify their uncertainty on
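The calibration idea behind structured expert judgment can be hinted at with a simple hit-rate check on 'seed' questions whose true answers are known. Cooke's classical model uses a more intricate likelihood-based score, so the sketch below is only an illustrative simplification with assumed numbers.

```python
def interval_hit_rate(assessments, realizations):
    """Fraction of known realizations falling within the [5%, 95%] quantile
    interval an expert stated for each seed question. For a well-calibrated
    expert this should approach 0.90 over many questions."""
    hits = sum(1 for (q05, _q50, q95), x in zip(assessments, realizations)
               if q05 <= x <= q95)
    return hits / len(realizations)

# One expert's (5%, 50%, 95%) quantiles on four seed questions, and the
# true values later observed (all figures illustrative):
quantiles = [(1.0, 3.0, 8.0), (0.1, 0.5, 2.0), (10.0, 20.0, 40.0), (0.01, 0.05, 0.2)]
truth = [5.0, 3.0, 25.0, 0.04]
print(f"90%-interval hit rate: {interval_hit_rate(quantiles, truth):.2f}")
```

In the full classical model such calibration scores, together with an information score, determine the weight each expert receives in the aggregation.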

Similarities and differences in methods, incl. additional ones

The probabilistic approach set out above shows again that if one wants to decrease uncertainty, the effort must increase strongly. One does not get more value cheaply! Attempts to ‘calibrate’ experts are afflicted with complexities. In the Dempster-Shafer theory of evidence or the fuzzy set approach, though, expert weighting before aggregation of opinions is also an option. In DST it is the analyst who, by assigning mass, gives experts a (subjective) weight, while in fuzzy set and logic expert

Conclusions

Often in quantitative risk assessments a need arises to call on experts to provide data. However, expert estimates are subjective and, due to uncertainties, imprecise. Methods are available to objectivize imprecise and subjective estimates of mostly binary variable values that are in principle observable but, because of long observation lead times or other reasons of inaccessibility, must be obtained by interviewing experts and eliciting their opinions.

Most forceful on the experts and at the

Credit author contribution statement

Both authors contributed equally.

Author statement

Hans Pasman and William Rogers selected and discussed the material and did the writing. Hans Pasman submitted the paper.

Declaration of competing interest

We have no conflicts of interest.

Acknowledgement

Comments of anonymous reviewers stimulated significant improvements to the paper.

References (64)

  • A.S. Markowski et al.

    Fuzzy logic for process safety analysis

    J. Loss Prev. Process. Ind.

    (2009)
  • A.S. Markowski et al.

    Uncertainty aspects in process safety analysis

    J. Loss Prev. Process. Ind.

    (2010)
  • M. Naderpour et al.

    An abnormal situation modeling method to assist operators in safety-critical systems

    Reliab. Eng. Syst. Saf.

    (2015)
  • H.J. Pasman et al.

    Is risk analysis a useful tool for improving process safety?

    J. Loss Prev. Process. Ind.

    (2009)
  • H.J. Pasman et al.

    Risk assessment: what is it worth? Shall we just do away with it, or can it do a better job?

    Saf. Sci.

    (2017)
  • J. Pearl

    Reasoning with belief functions: an analysis of compatibility

    Int. J. Approx. Reason.

    (1990)
  • T.L. Saaty

    How to make a decision: the analytic hierarchy process

    Eur. J. Oper. Res.

    (1990)
  • G. Shafer

    Perspectives on the theory and practice of belief functions

    Int. J. Approx. Reason.

    (1990)
  • M. Sugeno et al.

    Structure identification of fuzzy model

    Fuzzy Set Syst.

    (1988)
  • R.R. Yager

    On the Dempster-Shafer framework and new combination rules

    Inf. Sci.

    (1987)
  • M. Yazdi et al.

    A methodology for enhancing the reliability of expert system applications in probabilistic risk assessment

    J. Loss Prev. Process. Ind.

    (2019)
  • L.A. Zadeh

    Fuzzy sets

    Inf. Contr.

    (1965)
  • L.A. Zadeh

    The concept of a linguistic variable and its application to approximate reasoning

    Inf. Sci.

    (1975)
  • L.A. Zadeh

    Fuzzy sets as a basis for the theory of possibility

    Fuzzy Set Syst.

    (1978)
  • L.A. Zadeh

    A note on Z-numbers

    Inf. Sci.

    (2011)
  • C. Armstrong et al.

    Participant experiences with a new online modified-Delphi approach for engaging patients and caregivers in developing clinical guidelines

    Eur. J. for Person Centered Healthcare

    (2019)
  • P. Baybutt

    The validity of engineering judgment and expert opinion in hazard and risk analysis: the influence of cognitive biases

    Process Saf. Prog.

    (2017)
  • F. Bolger et al.

    The aggregation of expert judgment: do good things come to those who weight?

    Risk Anal.

    (2015)
  • O. Castillo et al.
    (2008)
  • R.M. Cooke

    Experts in Uncertainty, Opinion and Subjective Probability in Science

    (1991)
  • F.G. Cozman

    JavaBayes – User Manual

    (2000)
  • A.P. Dempster

    Upper and lower probabilities induced by a multivalued mapping

    Ann. Math. Stat.

    (1967)