Interfaces with Other Disciplines
Multi-factor dependence modelling with specified marginals and structured association in large-scale project risk assessment
Introduction
This paper tackles the problem of dependence modelling for large-scale project risk assessment. Dependence modelling constitutes an essential element of risk-adjusted project planning and predictive control, in particular for probabilistic cost estimates (GAO, 2020; Garvey, Book & Covert, 2016), stochastic network schedules (GAO, 2015; Trietsch, Mazmanyan, Gevorgyan & Baker, 2012; van Dorp, 2005), project-end outcome updates (Cho, 2009; Kim, 2015), and predictive performance tracking (Kim & Kwak, 2018). Inter-dependence between project tasks is also one of the driving factors of project complexity along with the project size and the variety of tasks (Baccarini, 1996; Tatikonda & Rosenthal, 2000). Consequently, accounting for the nature and the degree of dependence is a demanding challenge for proper management of modern projects with increasing complexity and structural uncertainty (Mo, Yin & Gao, 2008; Williams, 1999).
The need for quantitative risk assessment as a decision support tool has been well recognized since several seminal papers in capital investments (Hertz, 1964) and operations research (Malcolm, Roseboom, Clark & Fazar, 1959; Van Slyke, 1963). In practice, however, projects often behave in a way that clashes with what is expected from the best practices and standards for successful completion in time (Love, Wang, Sing & Tiong, 2013; Schonberger, 1981) and within the budget (Flyvbjerg, 2006; Love, Sing, Wang, Edwards & Odeyinka, 2013). Elementary statistics shows that project cost and time, as a risk-adjusted sum of random variables, tend to exceed the sum of their isolated marginal estimates when there exist positive inter-variable associations. Empirical data also suggest that (i) inter-variable correlations are commonly observed and (ii) ignoring correlation leads to systematic underestimation of the real risks (Chau, 1995; Newton, 1992; Skitmore & Ng, 2002). Moreover, the percent underestimation of the total cost (or time) increases drastically as the number of uncertain variables grows (Garvey et al., 2016, p.322). Consequently, a proper consideration of inter-variable dependence is widely emphasized as a crucial element of contingency settings and project risk assessment in general (GAO, 2015, p.115; GAO, 2020, p.155; NASA, 2013, pp.33–37).
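The effect of ignoring positive correlation on the spread of a total can be illustrated with a minimal sketch. It assumes, purely for illustration, that all variables share the same standard deviation and a common pairwise correlation rho; the function names are hypothetical and are not taken from the article or from Garvey et al. (2016).

```python
import math

def total_std(sigmas, rho):
    """Standard deviation of a sum of variables that share one pairwise correlation rho."""
    n = len(sigmas)
    var = sum(s * s for s in sigmas)  # sum of the individual variances
    for i in range(n):
        for j in range(i + 1, n):
            var += 2.0 * rho * sigmas[i] * sigmas[j]  # covariance contribution of pair (i, j)
    return math.sqrt(var)

def percent_underestimation(n, rho, sigma=1.0):
    """Percent by which the total standard deviation is understated when rho is ignored."""
    sigmas = [sigma] * n  # illustrative assumption: equal spreads
    return 100.0 * (1.0 - total_std(sigmas, 0.0) / total_std(sigmas, rho))

for n in (10, 100, 1000):
    print(n, round(percent_underestimation(n, 0.3), 1))
```

Under these assumptions the underestimation equals 100(1 − 1/√(1 + (n − 1)ρ)) percent, so it grows with both the number of variables and the strength of the correlation.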
In theory, dependence modelling can be straightforward. In a narrow sense, a vector of dependent random variables can be specified as a mixture of univariate marginals (X) and the corresponding correlation matrix (ΣX).
The correlation-driven dependent vector (XΣ) in Eq. (1) provides a mathematically rigorous representation. However, specifying a feasible correlation matrix is a data-intensive process. In practice, the burden of data collection for correlation specification can be unattainably challenging (Lurie & Goldberg, 1998). In particular, high-dimensional dependence modelling can be overly restrictive, mostly due to three well-known challenges, which can be collectively referred to as the curse of dimensionality. First of all, the number of pairwise correlations required to fully specify a correlation matrix increases quadratically. Although the general perception of large-scale projects changes over time, projects with thousands or more activities are becoming increasingly common in practice (GAO, 2015, pp.102–104; Safran, 2020). For example, a risk model with 1000 variables requires assessments of 1000C2 = 499,500 correlation coefficients. The burden of data collection in this scale, either from historical data or with expert judgment, would be practically unattainable, or simply not economical. Even more challenging, there are also situations where pair-wise correlations are restricted by the selection of marginal distributions (Demirtas & Hedeker, 2011; Lurie & Goldberg, 1998). A more detailed discussion on the curse of dimensionality will be presented in Section 2.
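The quadratic growth in elicitation burden can be checked with a short sketch. The function name is hypothetical; the count itself is simply the number of unordered pairs among d variables, d(d − 1)/2.

```python
from math import comb

def n_pairwise_correlations(d):
    """Number of distinct off-diagonal entries needed to fill a d x d correlation matrix."""
    return comb(d, 2)  # equals d * (d - 1) // 2

for d in (10, 100, 1000):
    print(d, n_pairwise_correlations(d))

# The 1000-variable case reproduces the figure quoted in the text.
assert n_pairwise_correlations(1000) == 499500
```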
A sensible way of dealing with the curse of dimensionality is to reconstruct the problem in a way that reduces the data collection and elicitation burden (Morgan, Henrion & Small, 1992). A decision maker may conveniently avert the dimensionality issues by adopting drastic simplification assumptions, while sacrificing the flexibility of representing various dependence combinations (Goh & Sim, 2011; Trietsch et al., 2012). In the project control literature, Bayesian networks were examined as an analytic framework for factor modelling and adaptive project time updating (Cho, 2009; van Dorp, 2020). Cho (2009) presented a single-factor Bayesian model in which all activities in a project are influenced by a single resource factor. van Dorp (2020) also proposed a single-factor dependence model that employs a new family of power distributions, the two-sided power distributions, to represent the mode of a PERT (program evaluation and review technique) distribution. As a robust solution for large-scale risk analysis, however, single-factor approaches can be overly restrictive in the way that all pairwise correlations in the analysis are calibrated by a single factor. In these regards, a dependence model can be considered more realistic when it offers the flexibility of accounting for multiple risk factors commonly observed in real project settings.
Methodologically, however, increasing the number of risk factors for dependence specification is also constrained by the quantity and quality of the data available for corresponding parameter estimation. Whenever available, empirical data from past projects or expert assessments should be used; this is feasible when a project is relatively predictable and a plethora of similar projects exists. At the same time, there are more challenging projects with unique scope, innovative methodologies, and increased complexities in terms of component interfaces and project scales. These projects are as a rule less predictable and can hardly be characterized with quantitative data collected from past projects. Consequently, the nature and degree of risks inevitable in such one-of-a-kind projects cannot be fully quantified using empirical data alone. Here we observe a dilemma, somewhat inevitable in project risk assessment: the fewer relevant empirical data exist from similar projects, the greater the need for a sensible risk assessment. As a viable alternative, subjective assessments of the pair-wise correlations can be employed. Yet, the efficacy of subjective correlation assessments rapidly diminishes as the number of random variables increases, mostly due to the mathematical consistency required for a feasible correlation matrix (see Section 2.1 for more details of this issue).
These observations indicate that the robustness of a solution to the dependence modelling problem for large-scale project risk analysis can be enhanced with three analytical features: multi-factor capability, applicability under limited empirical data, and dimensional scalability. Accordingly, the objective of this article is to present a multi-factor dependence modelling framework that provides the flexibility of addressing the limited data availability, while preserving the scalability to high-dimensional project risks. To achieve this goal, we investigate a dependent vector that can be fully specified with three input elements, b, r, and Ψ, where b = (b1,…,bd)T is a vector of observable random variables whose marginals (i = 1,…,d) are specified independently prior to accounting for possible dependence; r = (r1,…,rK)T is a vector of association factor (AF) variables that are elicited and specified as the proxy of the pairwise dependence between the base variables (b); and Ψ = [ψik] (i = 1,…,d; k = 1,…,K) is a d × K allocation matrix (AM) of binary elements, which defines the relationships between b and r.
Specifically, we present an analytic framework with two stages: the structured association (SA) and the multi-factor association model (MFAM). First, the SA establishes a hierarchical structure of all relevant AFs identified in a project, providing a qualitative solution to the multi-factor capability and the applicability to limited data for a robust dependence model. Then, the MFAM transforms the qualitative SA information into a quantitative, mathematically consistent correlation matrix (Σr) of a vector of specified marginals. Adopting analytic (non-simulation) approaches (i.e., the second-moment approach), the MFAM offers a computationally efficient algorithm, readily scalable to high dimensional risk modelling and analysis.
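The idea that a factor-driven specification yields a consistent correlation matrix can be sketched with a deliberately simple stand-in. The code below assumes an additive form X_i = b_i + Σ_k ψ_ik r_k with mutually independent b and r; this additive form is an assumption made here for illustration and is not the MFAM formulation, which this excerpt does not give. The function name and the numerical inputs are hypothetical.

```python
import math

def factor_correlation_matrix(var_b, var_r, psi):
    """Correlation matrix implied by X_i = b_i + sum_k psi[i][k] * r_k (assumed additive form).

    var_b: variances of the d base variables (all positive)
    var_r: variances of the K association factors
    psi:   d x K binary allocation matrix
    """
    d, K = len(var_b), len(var_r)
    cov = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            # Only factors allocated to both i and j contribute to their covariance.
            shared = sum(psi[i][k] * psi[j][k] * var_r[k] for k in range(K))
            cov[i][j] = shared + (var_b[i] if i == j else 0.0)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(d)]
            for i in range(d)]

# Hypothetical inputs: three performance units and two association factors.
psi = [[1, 0],
       [1, 1],
       [0, 1]]
corr = factor_correlation_matrix([1.0, 1.0, 1.0], [1.0, 0.5], psi)
for row in corr:
    print([round(c, 4) for c in row])
```

Because the matrix is the correlation matrix of an explicitly constructed random vector, it is positive semi-definite by construction, and the inputs grow with d × K rather than with the number of variable pairs.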
Note that inter-variable association may arise due to causal relationship between variables or a common factor that may affect two or more variables concurrently (Bolstad, 2007, p.3). The key premise underlying our approach is that by selecting the AFs wisely a decision maker is able to establish a balance between the data availability and the modelling flexibility, while effectively mitigating the curse of dimensionality. A selection of the AFs can be considered wise if the marginal distributions of the factors and the corresponding allocation matrix can be elicited and specified using all relevant information readily available in standard project settings. In this paper, we focus on establishing stochastic association between project performance units (i.e., itemized costs and activity times, hereinafter PUs) using all relevant information accessible in standard project environments, for example, the work breakdown structure (WBS), resource plans, and a risk register (PMI, 2013, p.163). It should be properly emphasized that project risk information from various sources is often available in unstructured forms (e.g., drawings, organizational plans, and resource plans) (Xing, Zhong, Luo, Li & Wu, 2019). In particular, a risk register is used to identify and track all relevant risks in a project and their attributes relevant to project outcomes (PMI, 2013). The information embedded in such project plans is conspicuously observable and thus objective. Yet project plans exist, as a rule, in qualitative formats. In this study, we adopt an ontological approach to transform any qualitative association information reflected in project plans into quantitative dependence information expressed as a correlation matrix. Ontology, as a branch of philosophy, offers a flexible perspective on the evolving nature of projects (Morris, 2013, p.236). 
In analytic settings, ontology provides a framework that represents the knowledge in a domain as a set of concepts and their relationships (Rodger, 2013) and has attracted growing attention in risk studies, for instance, in safety (Xing et al., 2019), supply chain (Palmer et al., 2018), and environment (Scheuer, Haase & Meyer, 2013).
The main contributions of this article can be highlighted in three aspects.
- The SA-MFAM approach enhances the realism of dependence modelling by offering the flexibility of accounting for multiple risk factors observed in individual projects based on all relevant information readily available in individual projects.
- The MFAM yields an analytic, closed-form solution to the correlation matrix, which can be further parameterized and calibrated meeting the limited data availability in individual projects.
- Adopting a factor-driven approach, the MFAM always generates a mathematically consistent matrix, preserving the scalability to high-dimensional risk modelling and analysis.
The rest of this article is organized as follows. The following section outlines the challenges in large-scale dependence modelling and presents the SA technique as a viable solution. Section 3 formulates the MFAM. In Section 4, we carry out a set of credibility tests and evaluate the performance of MFAM against Monte Carlo simulation. Section 5 demonstrates the implications of dependence (or ignoring dependence) in project decision making using three SA-MFAM applications. Conclusions and future research issues are summarized in Section 6.
Section snippets
Large-scale correlation assessment
Uncertainty models, in a generic setting, take the form of a joint distribution with three components: (i) a function of random variables in d dimensions, G(X), X = (X1,…,Xd)T, (ii) a set of univariate marginal distributions, and (iii) a set of dependence parameters, mostly in terms of a correlation matrix (i.e., a symmetric, positive semi-definite matrix with unit diagonal elements).
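The consistency requirement on a correlation matrix can be made concrete with a minimal feasibility check. The sketch below tests symmetry, unit diagonal, and strict positive definiteness through a Cholesky factorization; strict definiteness is a slightly stronger condition than the semi-definiteness stated above and is used here only to keep the code short. The function name, tolerance, and example matrices are hypothetical.

```python
import math

def is_feasible_correlation_matrix(m, tol=1e-12):
    """Check symmetry, unit diagonal, and positive definiteness (via Cholesky)."""
    n = len(m)
    for i in range(n):
        if len(m[i]) != n or abs(m[i][i] - 1.0) > tol:
            return False
        for j in range(i):
            if abs(m[i][j] - m[j][i]) > tol:
                return False
    # Cholesky factorization: fails exactly when a pivot is not positive.
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = m[i][i] - s
                if pivot <= tol:
                    return False
                low[i][j] = math.sqrt(pivot)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return True

# Each pairwise value below is admissible on its own, yet the second set is jointly infeasible.
consistent = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
inconsistent = [[1.0, 0.9, 0.9], [0.9, 1.0, -0.9], [0.9, -0.9, 1.0]]
print(is_feasible_correlation_matrix(consistent))
print(is_feasible_correlation_matrix(inconsistent))
```

The second matrix illustrates why independently elicited pairwise correlations can fail as a set: two variables that are both strongly positively correlated with a third cannot be strongly negatively correlated with each other.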
In large-scale project risk analyses, the efforts to develop a
Multi-factor association model
The SA-induced dependence between PUs involves, as a rule, multiple factors. This section presents a multi-factor association (MFA) model that transforms the relative association information embedded in a hierarchical set of multiple AFs into a mathematically consistent correlation matrix. Characteristic properties and practical implementation options of the MFA model are also highlighted.
Credibility test
A test project is analyzed to demonstrate the performance (i.e., accuracy, robustness, and computational efficiency) of the MFA analysis as compared against Monte Carlo simulation. The project parameters are designed in a way that challenges the primary premise underlying the MFAM (i.e., the MOM approximation). Specifically, the test settings account for three control factors: (i) asymmetricity of variables, (ii) effects of base distribution types (PERT-beta vs. triangular distribution), and
Applications
Empirical data from previous projects or general experiences provide valuable, although often incomplete, dependence information relevant to a new project (Ranasinghe, 2000; Touran & Wiser, 1992; Wang & Huang, 2000). It would be sensible then to make full use of all relevant information whenever possible. The MFA model's closed-form correlation matrix provides a robust solution to the problem of utilizing incomplete dependence information for coherent risk analysis. This section presents three
Conclusions
Proper dependence consideration is crucial for realistic project risk assessment and making informed decisions under uncertainty. We present an analytic framework that combines a systematic way of accounting for multiple risk factors (the SA) and a quantitative dependence model based on the second moment approach (the MFAM). The SA-MFAM offers an analytic, closed-form, and computationally tractable alternative to the correlation-driven approaches for dependence modelling. Specifically, the
References (56)
- et al. Integrating risk into estimations of project activities’ time and cost: A stratified approach. European Journal of Operational Research (2021)
- The concept of project complexity—A review. International Journal of Project Management (1996)
- A linear Bayesian stochastic approximation to update project duration estimates. European Journal of Operational Research (2009)
- et al. Hadamard powers and totally positive matrices. Linear Algebra and its Applications (2007)
- A fuzzy linguistic ontology payoff method for aerospace real options valuation. Expert Systems with Applications (2013)
- et al. Towards a flood risk assessment ontology–Knowledge integration into a multi-criteria risk assessment approach. Computers, Environment and Urban Systems (2013)
- et al. Using schedule risk analysis with resource constraints for project control. European Journal of Operational Research (2021)
- Hadamard products and multivariate statistical analysis. Linear Algebra and its Applications (1973)
- The effect of systemic errors on optimal project buffers. International Journal of Project Management (2005)
- et al. Modeling activity times by the Parkinson distribution with a lognormal core: Theory and validation. European Journal of Operational Research (2012)
- Statistical dependence through common risk factors: With applications in uncertainty analysis. European Journal of Operational Research
- A dependent project evaluation and review technique: A Bayesian network approach. European Journal of Operational Research
- A new approach to calculating project cost variance. International Journal of Project Management
- Expert judgement for dependence in probabilistic modelling: A systematic literature review and future research directions. European Journal of Operational Research
- The need for new paradigms for complex projects. International Journal of Project Management
- Ontology for safety risk identification in metro construction. Computers in Industry
- Vines: A new graphical model for dependent random variables. Annals of Statistics
- Introduction to Bayesian statistics
- Modeling and generating random vectors with arbitrary marginal distributions and correlation matrix
- Monte Carlo simulation of construction costs using subjective data. Construction Management & Economics
- Assessing dependence: Some experimental results. Management Science
- Statistical power analysis for the behavioral sciences
- A practical way for computing approximate lower and upper correlation bounds. The American Statistician
- From Nobel prize to project management: Getting risks right. Project Management Journal
- GAO Schedule assessment guide: Best practices for project schedules. Applied research and methods
- GAO Cost estimating and assessment guide: Best practices for developing and managing capital program costs. Applied research and methods
- Probability methods for cost uncertainty analysis: A systems engineering perspective
- Behavior of the NORTA method for correlated random vector generation as the dimension increases. ACM Transactions on Modeling and Computer Simulation (TOMACS)