Interpretable machine-learning models for estimating trip purpose in smart card data
Proceedings of the Institution of Civil Engineers - Municipal Engineer (IF 1.0), Pub Date: 2021-06-16, DOI: 10.1680/jmuen.20.00003
Eui-Jin Kim, Youngseo Kim, Dong-Kyu Kim

Investigating trip purposes of transit passengers is crucial in assessing current urban transportation systems and prioritising investments in the public transportation infrastructure. Smart card data provide day-to-day information on passengers’ boardings and alightings, but the lack of information on trip purposes leads to restrictions on the use of these data. This paper focuses on estimating trip purposes of transit passengers in smart card data, using a machine-learning model that is trained by household travel survey data. To accomplish this objective, a random forest model coupled with interpretable machine-learning methods (feature importance, feature interactions and accumulated local effects plots) is proposed. This approach can be used to estimate trip purposes and to explain the decision-making process of the models. The models include the spatiotemporal features that can be extracted from both the smart card data and the geographic information data, which can be collected sustainably and cost-effectively. The proposed model achieves an 83% overall accuracy in its estimation of the validation data. The interpretation methods show that temporal features are the dominant factors in estimating the purposes of trips, and the spatial features influence the estimates mainly through cross-effects with the temporal features.
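
As a rough illustration of what a feature-importance analysis measures, the sketch below computes permutation importance on synthetic trips. It is not the authors' code or data: the abstract does not say which importance measure the paper uses, and the features (`boarding_hour`, `land_use`), the purpose labels and the hand-written `predict` rule are all hypothetical stand-ins for a trained random forest. The idea is only that shuffling an influential feature should reduce accuracy more than shuffling a weak one.

```python
import random


def predict(row):
    """Hypothetical stand-in for a trained classifier (not a random forest)."""
    hour, land_use = row
    if 7 <= hour <= 9:
        return "work"
    if 17 <= hour <= 19:
        return "home"
    # The spatial feature only matters off-peak, i.e. through an
    # interaction with the temporal feature.
    return "shopping" if land_use == 1 else "other"


def make_trips(n, rng):
    """Synthetic trips, labelled by the same rule (assumption: a perfect fit)."""
    rows = [[rng.randrange(5, 24), rng.choice([0, 1])] for _ in range(n)]
    labels = [predict(r) for r in rows]
    return rows, labels


def accuracy(model, rows, labels):
    """Share of rows whose predicted purpose matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, col, rng, n_repeats=10):
    """Mean drop in accuracy after shuffling one feature column."""
    base = accuracy(model, rows, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
        drops.append(base - accuracy(model, permuted, labels))
    return sum(drops) / n_repeats


rng = random.Random(0)  # fixed seed so the run is repeatable
rows, labels = make_trips(2000, rng)
imp_hour = permutation_importance(predict, rows, labels, 0, rng)
imp_land = permutation_importance(predict, rows, labels, 1, rng)
print(f"boarding_hour importance: {imp_hour:.3f}")
print(f"land_use importance:      {imp_land:.3f}")
```

In this toy setting the temporal feature comes out as the more important one by construction, so the output illustrates the mechanics of the measure rather than reproducing the paper's result.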

Updated: 2021-06-16