The ensemble approach to forecasting: A review and synthesis

doi:10.1016/j.trc.2021.103357

Transportation Research Part C: Emerging Technologies

Volume 132, November 2021, 103357

https://doi.org/10.1016/j.trc.2021.103357

Highlights

• Review and synthesize methods of ensemble forecasting with a unifying framework.
• As decision support tools, ensemble models systematically account for uncertainties.
• Ensemble methods can include combining models, combining data, and ensembles of ensembles.
• Transport ensemble models have the potential for improving accuracy and reliability.

Abstract

Ensemble forecasting is a modeling approach that combines multiple data sources and models of different types, built on alternative assumptions and distinct pattern recognition methods. The aim is to use all available information in predictions, without the limiting and arbitrary choices and dependencies that result from a single statistical or machine learning approach, a single functional form, or a limited data source. Uncertainties are systematically accounted for. Outputs of ensemble models can be presented as a range of possibilities to indicate the amount of uncertainty in modeling. We review methods and applications of ensemble models both within and outside of transport research. The review finds that ensemble forecasting generally improves forecast accuracy and robustness in many fields, particularly in weather forecasting, where the method originated. We note that ensemble methods are highly siloed across disciplines, and that both the knowledge and the application of ensemble forecasting are lacking in transport. In this paper we review and synthesize methods of ensemble forecasting within a unifying framework, categorizing ensemble methods into two broad and not mutually exclusive categories, namely combining models and combining data; the framework further extends to ensembles of ensembles. We apply ensemble forecasting to transport related cases, which shows the potential of ensemble models to improve forecast accuracy and reliability. This paper sheds light on the apparatus of ensemble forecasting, which we hope contributes to a better understanding and wider adoption of ensemble models.

Introduction

The concept underlying ensemble forecasting has existed since time immemorial; most cultures have an expression for the ‘wisdom of the crowd’ in their language. In many disciplines and applications, however, modeling still relies on a single model. Most transport models are theory-driven ‘Newtonian’ physical models (Garrison and Levinson, 2014), which are used to represent what is theorized to be the true mechanism; models with lower performance are labeled simply as ‘incorrect’. The rationale for applying physical models in transport is the assumption that a ‘true’ model really exists and can be described using a single mathematical expression.

The current performance of transport models is unsatisfactory, and there have been many cases of erroneous transport forecasts (Boyce and Williams, 2015). Prediction accuracy in other fields, most notably in weather forecasting, has improved significantly over the years. This discrepancy in model performance between transport and weather forecasting suggests that there may be fundamental issues with the tools used by transport modelers. While travel demand models (Meyer and Miller, 2001) have seen no major changes in the more than half a century since their introduction, weather forecasting has benefited from an openness to new modeling methods and from daily opportunities to validate new models against observations (Blum, 2019). This resulted in the adoption of ensemble models as standard practice and in a significant improvement in forecast accuracy.

Ensemble forecasting serves two purposes: combining information, and pooling errors. The combination of information is achieved by considering different model assumptions and pattern recognition methods, so that the information extracted by different models is combined in an ensemble model. The pooling of error is achieved by consulting multiple sources. It is very difficult for complex systems, such as models and computer software, to be entirely free of errors; but since different models or sources are more likely to recognize the same feature than to repeat the same error, it is common in weather forecasting to obtain a value from several different models or software packages (Blum, 2019, Silver, 2012). The same idea of pooling error lies behind the 2014 Nobel Prize in Chemistry, which was awarded to scientists who used repeated imaging to obtain resolution beyond the diffraction limit (Möckl et al., 2014).
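The error-pooling argument can be illustrated with a small simulation. This is a minimal sketch, not a method from the paper: it assumes each model's error is independent, unbiased, and Gaussian, and the function name, the true value, and all parameter values are hypothetical.

```python
import random
import statistics

def simulate_pooling(n_models=5, n_trials=2000, noise_sd=1.0, seed=0):
    """Compare the RMSE of one noisy model with the RMSE of an equal-weight average."""
    rng = random.Random(seed)
    truth = 10.0  # hypothetical true value being forecast
    single_sq, pooled_sq = [], []
    for _ in range(n_trials):
        # Assumption: each model sees the truth plus its own independent error.
        forecasts = [truth + rng.gauss(0.0, noise_sd) for _ in range(n_models)]
        single_sq.append((forecasts[0] - truth) ** 2)
        pooled_sq.append((statistics.fmean(forecasts) - truth) ** 2)
    rmse_single = statistics.fmean(single_sq) ** 0.5
    rmse_pooled = statistics.fmean(pooled_sq) ** 0.5
    return rmse_single, rmse_pooled

rmse_single, rmse_pooled = simulate_pooling()
print(f"single model RMSE:  {rmse_single:.3f}")
print(f"ensemble mean RMSE: {rmse_pooled:.3f}")
```

Under the independence assumption the pooled error shrinks roughly with the square root of the number of models. When models share the same error source, the gain is smaller, which is why diversity among base models matters.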

This review and synthesis contributes to the literature with a wider scope than technical reviews of ensemble methods. Even in weather forecasting, where ensemble methods originated, the ensemble methods in use are not comprehensive. This review covers both ensemble models that make a single simultaneous prediction, and iterative models that use model outputs as new inputs, where forecast uncertainties resulting from initial conditions and accumulated error (i.e. chaos theory) tend to grow. These two types of ensemble models are generally discussed separately in different contexts because, historically, chaos theory and error accumulation were more specific to weather forecasting, and the combination of different model formulations and assumptions is not a major interest in weather forecasting. However, both types of ensemble models are relevant to transport related cases, so the broad scope of this review is necessary for the application of ensemble models in transport. We also review ensemble methods used in different disciplines, which include the use of expert opinions, judgmental adjustments to model predictions, and meta-analysis, covering both methodical ensemble methods and what would be considered empirical ones.
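The growth of initial-condition uncertainty in an iterative model can be sketched with a toy chaotic system. The logistic map below is a stand-in chosen for brevity, not a model from the paper; the starting state, perturbation size, and number of ensemble members are all illustrative assumptions.

```python
import random
import statistics

def logistic_step(x, r=3.9):
    """One step of the logistic map, a standard toy example of chaotic dynamics."""
    return r * x * (1.0 - x)

def run_ensemble(x0=0.2, n_members=50, n_steps=40, perturb_sd=1e-6, seed=1):
    """Iterate perturbed copies of one initial state and record the spread per step."""
    rng = random.Random(seed)
    # Each member starts from the same state plus a tiny hypothetical measurement error.
    members = [x0 + rng.gauss(0.0, perturb_sd) for _ in range(n_members)]
    spread = [statistics.pstdev(members)]
    for _ in range(n_steps):
        # Model output becomes the next input, so errors are carried forward.
        members = [logistic_step(x) for x in members]
        spread.append(statistics.pstdev(members))
    return spread

spread = run_ensemble()
for step in (0, 10, 20, 30, 40):
    print(f"step {step:2d}: ensemble spread = {spread[step]:.2e}")
```

The spread across members is the kind of quantity an initial-condition ensemble reports: it starts near the perturbation size and grows as the forecast horizon lengthens.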

Ensemble forecasting, which combines different sources of uncertainty, provides an alternative to the conventional modeling approach. Ensemble forecasting was perhaps first used in weather forecasts (Blum, 2019), and is intended to extract more information out of available data and to incorporate uncertainties in modeling. The resulting ensemble models have higher accuracy and better reliability, and produce model outputs that are more useful as decision support tools. The defining characteristic of ensemble models is the combination of outputs from different models, and of data from different sources. Philosophically, this combination of data and models constitutes an aggregation of information, since different models can extract different pieces of information embedded within the data (Winkler, 1989); data from different sources also contain non-overlapping pieces of information that can be combined by ensemble models.
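One simple way to combine outputs from different models is a weighted average, with weights set from each model's past error. The sketch below uses inverse mean-squared-error weights, a common scheme in the forecast combination literature; it is offered as an illustration rather than as the weighting used in this paper, and the model names and numbers are hypothetical.

```python
def inverse_mse_weights(validation_preds, observed):
    """Weight each base model by the inverse of its mean squared validation error."""
    inv_mse = {}
    for name, preds in validation_preds.items():
        mse = sum((p - y) ** 2 for p, y in zip(preds, observed)) / len(observed)
        inv_mse[name] = 1.0 / max(mse, 1e-12)  # guard against a zero-error model
    total = sum(inv_mse.values())
    return {name: value / total for name, value in inv_mse.items()}

def combine(weights, new_preds):
    """Form the ensemble forecast as the weighted sum of base-model forecasts."""
    return sum(weights[name] * new_preds[name] for name in weights)

# Hypothetical validation data: observed values and two base models' predictions.
observed = [100.0, 120.0, 90.0, 110.0]
validation_preds = {
    "model_a": [102.0, 118.0, 92.0, 108.0],  # absolute errors of 2
    "model_b": [96.0, 124.0, 86.0, 114.0],   # absolute errors of 4
}
weights = inverse_mse_weights(validation_preds, observed)
ensemble_forecast = combine(weights, {"model_a": 105.0, "model_b": 111.0})
print(weights, ensemble_forecast)
```

The less accurate model still contributes to the forecast, only with a smaller weight, which matches the idea that each model may carry some information the others lack.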

Ensemble model outputs can include a range of possible outcomes from parallel base models, instead of a single number. Real-world events have many possible outcomes, for which models only provide an ‘estimate’ of what is likely to happen. In this light, different models rely on different assumptions, which provide different perspectives for prediction. The performance of models is measured in probabilities of being correct, so even the lowest performing model still has a small chance of being correct. The job of ensemble models is to incorporate these uncertainties into an ensemble forecast.
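Reporting a range rather than a single number can be as simple as summarizing the base-model predictions. The following is a minimal sketch under assumed inputs: the four base models and their ridership forecasts are hypothetical, and the choice of quartiles as the summary is one option among many.

```python
import statistics

def summarize_ensemble(predictions):
    """Summarize base-model predictions as a central value plus a range."""
    ordered = sorted(predictions.values())
    # quantiles(n=4) returns the three quartile cut points of the predictions.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {"low": ordered[0], "q1": q1, "median": median, "q3": q3, "high": ordered[-1]}

# Hypothetical daily ridership forecasts from four parallel base models.
base_predictions = {
    "linear": 1180.0,
    "logit_share": 1250.0,
    "random_forest": 1320.0,
    "neural_net": 1210.0,
}
summary = summarize_ensemble(base_predictions)
print(summary)
```

A decision maker then sees both a central estimate and how far the base models disagree, instead of one deterministic value.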

Transport modeling shares the same uncertainties in data and in models as weather, economic, and political forecasting, and yet ensemble forecasting remains rare in transport. Through the use of ensemble models, different theories, different assumptions on data generation processes, and data from different sources with slight (or significant) variations can be combined to present multiple possibilities in an ensemble forecast. Ensemble forecasting provides an opportunity to improve transport models. In this paper we apply ensemble forecasting to a few transport related cases to test the performance of ensemble models.

There are ‘two cultures’ (Breiman, 2001) in modeling, namely theory-driven models and data-driven models (terminology from Van Cranenburgh et al. (2021)). Theory-driven models have predetermined assumptions on the data generation process (e.g. linear, logit, etc.), and model calibration aims to find parameters that suit the data. Data-driven models provide an alternative to theory-driven models: they do not impose assumptions on the data generation process in the same way, and instead focus on extracting and reproducing patterns in the data. The class of data-driven machine learning models can be further divided into generative and discriminatory modeling. Generative modeling attempts to learn the joint distribution of the data, which can be used to generate new cases and to make predictions using Bayes' rule (Ng and Jordan, 2002); discriminatory modeling divides the data space, and discriminates cases directly based on explanatory variables. Each of the two cultures of modeling has its advantages, and with ensemble forecasting, these two cultures can be combined.
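The distinction between generative and discriminatory (often called discriminative) modeling can be written compactly. The notation is an assumption introduced here for illustration: x denotes the explanatory variables and y the outcome to be predicted.

```latex
% Generative: learn the joint distribution, then apply Bayes' rule to predict.
p(x, y) = p(x \mid y)\, p(y), \qquad
p(y \mid x) = \frac{p(x \mid y)\, p(y)}{\sum_{y'} p(x \mid y')\, p(y')}

% Discriminatory: model the conditional distribution (or decision boundary) directly.
\hat{y}(x) = \arg\max_{y} \; p(y \mid x)
```

In the generative case the prediction is derived from the learned joint distribution; in the discriminatory case p(y | x) is estimated directly, without modeling how x itself is distributed.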

Methods of ensemble forecasting are highly siloed across disciplines, and often serve different purposes. For instance, in weather forecasting, ensemble models are used to plot possible paths of storms, and to dilute accumulated error over time; many other disciplines cite accuracy gain as the major motivation for using ensemble models. Bits and pieces of ensemble methods are used by different disciplines, without recognition of the big picture. The full potential of ensemble forecasting is not being realized, and many of its benefits do not cross disciplinary barriers. In this paper we review and synthesize different parts of ensemble forecasting into a unified framework, with our addition of ensembles of ensembles.

Section snippets

Review of ensemble methods

This section examines the literature for applications of ensemble models. The scope of this review covers ensemble models both within and outside of transport research, covering both methods of ensemble forecasting, and author-stated objectives of applying ensemble methods, to ascertain what types of ensemble models are in existence, and what these models are used for.

There are different levels of ensemble models. Degenerate forms of ensemble models that include only one model formulation are

Sources of forecast uncertainties

Uncertainties are unknown pieces of information not covered by forecast models. Models are best with the ‘known knowns’, which are strictly deterministic, and are better with risk (the ‘known unknowns’, to follow the Rumsfeld framework (Rumsfeld, 2011)) than with uncertainty (the ‘unknown unknowns’), where risk can be quantified and measured but uncertainty cannot. Transport problems include significant uncertainties from various sources, which are the root cause of forecast errors. Different types of

Synthesis of ensemble methods

Within transport modeling, there are currently no systematic methods, or rules of thumb, for identifying suitable ensemble forecasting solutions for different scenarios. Ensemble forecasting generally does not work ‘out-of-the-box’, because each specific transport problem has its unique data availability, different sources of uncertainties, and requirements for forecast accuracy and reliability; different ensemble methods can also produce different types of ensemble model output, from a single

Application in transport related cases

In this section we test the performance of base models against ensemble models and ensemble of ensembles, in three transport related cases. Ensemble models generally require more data to calibrate than single models. Advances in technology and new methods of data collection provide continuous improvement in both the quantity and quality of data; the three transport related cases tested in this paper utilize recently available data that have sufficient quality, and enough data points in order to

Conclusion

Prevailing transport modeling practice relies heavily on finding the best single model, using that single model for forecasts, and presenting model outputs as a single number. This paper points out the folly in such practices, and summarizes problems with the single model procedure, as not considering real-world uncertainties in data generation mechanisms, measurement, and model specifications. The single model procedure also produces a gap between model outputs, which are deterministic, and

CRediT authorship contribution statement

Hao Wu: Conceptualization, Methodology, Writing - original draft. David Levinson: Supervision, Writing - review & editing.

References (96)

Chen, Xiqun Michael et al. (2017). Understanding ridesplitting behavior of on-demand ride services: An ensemble learning approach. Transp. Res. C.
Dalrymple, Douglas J. (1987). Sales forecasting practices: Results from a United States survey. Int. J. Forecast.
Delen, Dursun et al. (2017). Investigating injury severity risk factors in automobile crashes with predictive analytics and sensitivity analysis methods. J. Transp. Health.
Graefe, Andreas et al. (2014). Combining forecasts: An application to elections. Int. J. Forecast.
Leutbecher, Martin et al. (2008). Ensemble forecasting. J. Comput. Phys.
Lobo, Gerald J. (1992). Analysis and comparison of financial analysts’, time series, and combined forecasts of annual earnings. J. Bus. Res.
Ma, Zhenliang et al. (2014). Predicting short-term bus passenger demand using a pattern hybrid approach. Transp. Res. C.
McNees, Stephen K. (1990). The role of judgment in macroeconomic forecasting accuracy. Int. J. Forecast.
Milne, Dave et al. (2019). Big data and understanding change in the context of planning transport systems. J. Transp. Geogr.
Pavlyuk, Dmitry (2020). Towards ensemble learning of traffic flows’ spatiotemporal structure. Transp. Res. Proc.
Ren, Liqun et al. (2002). An optimal neural network and concrete strength modeling. Adv. Eng. Softw.
Sanger, Terence D. (1989). Optimal unsupervised learning in a single-layer linear feedforward neural network. Neural Netw.
Servizi, Valentino et al. (2020). Stop detection for smartphone-based travel surveys using geo-spatial context and artificial neural networks. Transp. Res. C.
Shahriari, M. et al. (2020). Using the analog ensemble method as a proxy measurement for wind power predictability. Renew. Energy.
Wei, Yu et al. (2012). Forecasting the short-term metro passenger flow with empirical mode decomposition and neural networks. Transp. Res. C.
Winkler, Robert L. (1989). Combining forecasts: A philosophical basis and some current issues. Int. J. Forecast.
Wolpert, David H. (1992). Stacked generalization. Neural Netw.
Xiao, Yi et al. (2015). Application of multiscale analysis-based intelligent ensemble modeling on airport traffic forecast. Transp. Lett.
Xing, Yang et al. (2020). An ensemble deep learning approach for driver lane change intention inference. Transp. Res. C.
Zhou, Ligang et al. (2010). Least squares support vector machines ensemble models for credit scoring. Expert Syst. Appl.
Alonso, William (1964). Location and Land Use.
Anda, Cuauhtemoc et al. (2017). Transport modelling in the age of big data. Int. J. Urban Sci.
Armstrong, J. Scott. Combining forecasts.
Ashton, Alison Hubbard et al. (1985). Aggregating subjective forecasts: Some empirical results. Manage. Sci.
Residential Property Transaction Data 2017 - 2019 (2019).
Bacon, Robert W. (1977). Some evidence on the largest squared correlation coefficient from several samples. Econometrica.
Bates, John M. et al. (1969). The combination of forecasts. J. Oper. Res. Soc.
Blum, Andrew
Boyce, David E. et al. (2015). Forecasting Urban Travel: Past, Present and Future.
Breiman, Leo (1996). Stacked regressions. Mach. Learn.
Breiman, Leo (1996). Heuristics of instability and stabilization in model selection. Ann. Statist.
Breiman, Leo (2001). Statistical modeling: The two cultures. Statist. Sci.
Chand, Nanak et al. A comparative analysis of SVM and its stacking with other classification algorithm for intrusion detection.
Cheng, Long et al. (2019). Applying an ensemble-based model to travel choice behavior in travel demand forecasting under uncertainties. Transp. Lett.
Transportation Network Providers - Trips (2019).
Cowgill, Bo, Dell’Acqua, Fabrizio, Deng, Samuel, Hsu, Daniel, Verma, Nakul, Chaintreau, Augustin, 2020. Biased...
Dawes, Robyn M. (1979). The robust beauty of improper linear models in decision making. Am. Psychol.
Delle Monache, Luca et al. (2013). Probabilistic weather prediction with an analog ensemble. Mon. Weather Rev.
The Ensemble Prediction System (2013).
Elliott, Graham (2011). Averaging and the Optimal Combination of Forecasts.
Fama, Eugene F. (1960). Efficient market hypothesis.
Garrison, William L. et al. (2014). The Transportation Experience: Policy, Planning, and Deployment.
Gustafsson, Nils (2002). Statistical issues in weather forecasting. Scand. J. Stat.
Haider, Murtaza (2019). Diminishing returns to density and public transit. Transp. Findings.
Hong, Lu et al. (2004). Groups of diverse problem solvers can outperform groups of high-ability problem solvers. Proc. Natl. Acad. Sci.
Ji, Ang et al. (2020). Injury severity prediction from two-vehicle crash mechanisms with machine learning and ensemble models. IEEE Open J. Intell. Transp. Syst.
Kahneman, Daniel (2011). Thinking, Fast and Slow.
Kang, Heejoon (1986). Unstable weights in the combination of forecasts. Manage. Sci.

