Comparing and contrasting choice model and machine learning techniques in the context of vehicle ownership decisions
Transportation Research Part A: Policy and Practice (IF 6.4), Pub Date: 2023-06-06, DOI: 10.1016/j.tra.2023.103727
Azam Ali, Arash Kalatian, Charisma F. Choudhury

In recent years, planners have started considering Machine Learning (ML) techniques as an alternative to discrete choice models (CM). ML techniques are primarily data-driven and typically achieve better prediction accuracy than CMs. However, it is hypothesized that, because ML techniques lack the strong grounding in economic theory that CMs have, they may not perform well in contexts that are radically different from the ‘training’ scenario. It is also hypothesized that the relative prediction performance may be affected by the metrics used to compare the models.

This research tests these two hypotheses empirically by modelling vehicle ownership choices using household survey data from Dhaka, Bangladesh, collected in 2004, 2010 and 2019. The performance of a CM (multinomial logit) and of ML techniques (neural networks and gradient boosting trees) is compared using log-likelihood and the mean absolute percentage error of market shares.
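The two comparison metrics can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: the predicted market share of an alternative is taken as the mean predicted probability across households, the observed share as the fraction of households choosing it, and the error is averaged over alternatives. The function names `log_likelihood` and `market_share_mape` are hypothetical, not taken from the paper.

```python
import numpy as np

def log_likelihood(probs, chosen):
    """Sum of log predicted probabilities of the alternatives actually chosen.

    probs  : (n_households, n_alternatives) predicted choice probabilities
    chosen : (n_households,) index of the observed alternative
    """
    probs = np.asarray(probs, dtype=float)
    chosen = np.asarray(chosen, dtype=int)
    p_chosen = probs[np.arange(len(chosen)), chosen]
    # Clip to avoid log(0) when a model assigns zero probability.
    return float(np.sum(np.log(np.clip(p_chosen, 1e-12, None))))

def market_share_mape(probs, chosen):
    """Mean absolute percentage error between predicted and observed shares.

    Assumed definition: average over alternatives of
    |predicted share - observed share| / observed share, in percent.
    Every alternative must be chosen at least once, otherwise the
    observed share is zero and the ratio is undefined.
    """
    probs = np.asarray(probs, dtype=float)
    chosen = np.asarray(chosen, dtype=int)
    n_alts = probs.shape[1]
    predicted = probs.mean(axis=0)
    observed = np.bincount(chosen, minlength=n_alts) / len(chosen)
    return float(np.mean(np.abs(predicted - observed) / observed) * 100.0)

# Toy illustration with made-up numbers (not data from the study).
toy_probs = [[0.8, 0.2], [0.6, 0.4]]
toy_chosen = [0, 1]
print(log_likelihood(toy_probs, toy_chosen))
print(market_share_mape(toy_probs, toy_chosen))
```

The two metrics reward different things: log-likelihood scores each household's predicted probability, while the market-share error only looks at aggregate shares, which is one reason a model ranking can depend on the metric chosen.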

The results indicate that the multinomial logit (MNL) model with a piecewise linear transformation of household income has the best performance in terms of both log-likelihood and mean absolute percentage error of market shares. It is followed by Neural Networks (NN) and Gradient Boosting Trees (GBT). The results thus provide empirical evidence that ML techniques do not consistently outperform CMs. Moreover, the difference in model performance widens further when the prediction scenario differs substantially from the training scenario. This reinforces the hypothesis that CMs, with their behavioural underpinning, are better suited for long-term forecasting than data-driven ML approaches, especially if the population and network attributes are expected to change substantially. These findings will be useful for planners and policy makers in selecting the appropriate tool for forecasting travel demand.
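As a rough illustration of what a piecewise linear transformation of income in an MNL model can look like, the sketch below splits income into segments so that each segment gets its own slope in the utility function. The breakpoints, coefficient values, and the function names `piecewise_linear` and `mnl_probabilities` are all hypothetical; the abstract does not report the specification actually estimated.

```python
import numpy as np

def piecewise_linear(income, breakpoints):
    """Split income into consecutive segments, one column per segment.

    Each column holds the part of income falling inside that segment,
    so the columns of a row sum back to the original income.
    """
    income = np.asarray(income, dtype=float)
    edges = np.concatenate(([0.0], np.asarray(breakpoints, dtype=float), [np.inf]))
    lower, upper = edges[:-1], edges[1:]
    return np.clip(income[:, None] - lower, 0.0, upper - lower)

def mnl_probabilities(segments, betas, constants):
    """Multinomial logit probabilities from segment-specific income slopes.

    segments  : (n_households, n_segments) output of piecewise_linear
    betas     : (n_segments, n_alternatives) slopes; the base alternative's
                column is conventionally fixed to zero for identification
    constants : (n_alternatives,) alternative-specific constants
    """
    utilities = segments @ np.asarray(betas, dtype=float) + np.asarray(constants, dtype=float)
    # Subtract the row maximum before exponentiating for numerical stability.
    utilities -= utilities.max(axis=1, keepdims=True)
    exp_u = np.exp(utilities)
    return exp_u / exp_u.sum(axis=1, keepdims=True)

# Hypothetical example: three households, breakpoints at 20 and 50 income units,
# two alternatives (0 = no vehicle as base, 1 = owns a vehicle).
segments = piecewise_linear([10, 35, 80], breakpoints=[20, 50])
betas = np.array([[0.0, 0.01], [0.0, 0.05], [0.0, 0.02]])  # made-up slopes
constants = np.array([0.0, -2.0])                           # made-up constants
print(mnl_probabilities(segments, betas, constants))
```

Letting the income slope differ by segment keeps the model interpretable while allowing the effect of income on ownership to vary across income ranges.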


Updated: 2023-06-07