A rule-based model for Seoul Bike sharing demand prediction using weather data
European Journal of Remote Sensing (IF 3.7), Pub Date: 2020-02-13, DOI: 10.1080/22797254.2020.1725789
Sathishkumar V E, Yongyun Cho

ABSTRACT

This paper presents a rule-based regression model for predicting bike sharing demand. Public rental bike sharing has recently become popular because of its convenience and environmental sustainability. The data used come from the Seoul Bike and Capital Bikeshare programs, and both datasets include hourly weather data. For each dataset, five statistical models were trained with hyperparameters optimized by repeated cross-validation, and a held-out testing set was used for evaluation: (a) CUBIST, (b) Regularized Random Forest, (c) Classification and Regression Trees, (d) K Nearest Neighbour, and (e) Conditional Inference Tree. Several evaluation indices, namely R², Root Mean Squared Error, Mean Absolute Error and Coefficient of Variation, were used to measure the prediction performance of the regression models. The results show that the rule-based model CUBIST explained about 95% and 89% of the variance (R²) in the testing sets of the Seoul Bike data and the Capital Bikeshare program data, respectively. A variable importance analysis was carried out to identify the most significant variables for all the models developed with the two datasets. It showed that Temperature and Hour of the day are the most influential variables in hourly rental bike demand prediction.
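The four evaluation indices named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes common textbook definitions: R² as one minus the ratio of residual to total sum of squares, and Coefficient of Variation as RMSE divided by the mean observed value, expressed as a percentage. The paper itself may define these differently. The function name `regression_metrics` and the sample counts are hypothetical and serve only as an illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and CV (%) for paired observed/predicted values.

    Assumed definitions (not confirmed by the abstract):
      R2 = 1 - SS_res / SS_tot
      CV = 100 * RMSE / mean(y_true)
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0 or mean_true == 0:
        raise ValueError("R2 and CV are undefined for constant or zero-mean targets")

    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    r2 = 1.0 - ss_res / ss_tot
    cv = 100.0 * rmse / mean_true

    return {"r2": r2, "rmse": rmse, "mae": mae, "cv": cv}


# Made-up hourly rental counts, for illustration only.
observed = [100, 200, 300, 400]
predicted = [110, 190, 310, 390]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

RMSE and MAE are in the units of the target (rented bikes per hour), while R² and CV are unitless, which is what makes them comparable across the two datasets.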




Updated: 2020-02-13