Generalized model for mapping bicycle ridership with crowdsourced data
Transportation Research Part C: Emerging Technologies (IF 8.3), Pub Date: 2021-02-27, DOI: 10.1016/j.trc.2021.102981
Trisalyn Nelson, Avipsa Roy, Colin Ferster, Jaimy Fischer, Vanessa Brum-Bastos, Karen Laberee, Hanchen Yu, Meghan Winters

Fitness apps, such as Strava, are a growing source of data for mapping bicycling ridership, due to large samples and high resolution. To overcome bias introduced by data generated from only fitness app users, researchers build statistical models that predict total bicycling by integrating Strava data with official counts and geographic data. However, studies conducted on single cities provide limited insight into best practices for modeling bicycling with Strava, as generalizability is difficult to assess. Our goal is to develop a generalized approach to modeling bicycling ridership using Strava data. In doing so we enable detailed mapping that is more inclusive of all bicyclists and will support more equitable decision-making across cities. We used Strava data, official counts, and geographic data to model Average Annual Daily Bicycling (AADB) in five cities: Boulder, Ottawa, Phoenix, San Francisco, and Victoria. Using a machine learning approach, LASSO, we identify variables important for predicting ridership in all cities, and independently in each city. Using the LASSO-selected variables as predictors in Poisson regression, we built generalized and city-specific models and compared accuracy. Our results indicate generalized prediction of bicycling ridership on a road segment in concert with Strava data should include the following variables: number of Strava riders, percentage of Strava trips categorized as commuting, bicycling safety, and income. Inclusion of city-specific variables increased model performance, as the R² for generalized and city-specific models ranged from 0.08–0.80 and 0.68–0.92, respectively. However, model accuracy was influenced most by the official count data used for model training. For best results, official count data should capture diverse street conditions, including low ridership areas. Counts collected continuously over a long time period, rather than at peak periods, may also improve modeling. Modeling bicycling from Strava and geographic data enables mapping of bicycling ridership that is more inclusive of all bicyclists and better able to support decision-making.
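
The two-stage workflow described in the abstract (LASSO to select variables, then Poisson regression on the selected variables) can be illustrated with a minimal sketch. This is not the authors' code. Everything in it is an assumption made for illustration: the data are synthetic, the variable names and true coefficients are invented, the LASSO target is log(1 + count), the penalty `lam=0.05` is arbitrary, and the Poisson fit uses a simple damped coordinate-wise Newton update in place of a statistics library.

```python
import math
import random

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; the core of the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def standardize(columns):
    """Scale each feature column to zero mean and unit variance."""
    out = []
    for col in columns:
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
        out.append([(v - mean) / sd for v in col])
    return out

def lasso(columns, target, lam, sweeps=100):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1 on a centered target."""
    n = len(target)
    mean = sum(target) / n
    resid = [t - mean for t in target]
    beta = [0.0] * len(columns)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            denom = sum(c * c for c in col) / n
            if denom == 0.0:
                continue
            # Correlation of feature j with the residual after adding feature j back.
            rho = sum(c * (r + beta[j] * c) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / denom
            resid = [r - (new - beta[j]) * c for c, r in zip(col, resid)]
            beta[j] = new
    return beta

def poisson_fit(columns, counts, sweeps=50):
    """Fit log(mean) = b0 + sum(b_j * x_j) with damped coordinate-wise Newton steps."""
    n = len(counts)
    b0 = math.log(sum(counts) / n)  # assumes the mean count is positive
    beta = [0.0] * len(columns)
    eta = [b0] * n
    for _ in range(sweeps):
        for j, col in enumerate([[1.0] * n] + columns):
            mu = [math.exp(min(e, 30.0)) for e in eta]
            grad = sum((y - m) * x for y, m, x in zip(counts, mu, col))
            hess = sum(m * x * x for m, x in zip(mu, col))
            if hess <= 0.0:
                continue
            step = max(-1.0, min(1.0, grad / hess))  # damping keeps updates stable
            eta = [e + step * x for e, x in zip(eta, col)]
            if j == 0:
                b0 += step
            else:
                beta[j - 1] += step
    return b0, beta

def poisson_draw(mu):
    """Draw one Poisson variate with Knuth's multiplication method (small mu only)."""
    limit, k, p = math.exp(-mu), 0, random.random()
    while p > limit:
        k += 1
        p *= random.random()
    return k

# Synthetic road segments; names and coefficients are hypothetical.
random.seed(0)
names = ["strava_riders", "pct_commute", "safety", "income", "noise"]
true_beta = [0.9, 0.4, 0.3, 0.2, 0.0]
n = 300
raw = [[random.uniform(-1, 1) for _ in range(n)] for _ in names]
counts = [poisson_draw(math.exp(2.5 + sum(b * col[i] for b, col in zip(true_beta, raw))))
          for i in range(n)]

# Stage 1: LASSO on log(1 + count) picks the variables with nonzero coefficients.
X = standardize(raw)
beta = lasso(X, [math.log1p(c) for c in counts], lam=0.05)
selected = [name for name, b in zip(names, beta) if b != 0.0]
kept = [col for col, b in zip(X, beta) if b != 0.0]

# Stage 2: Poisson regression of the counts on the selected variables only.
intercept, coefs = poisson_fit(kept, counts)
print("selected:", selected)
print("intercept:", round(intercept, 3))
print("coefficients:", dict(zip(selected, (round(c, 3) for c in coefs))))
```

In practice the penalty would normally be chosen by cross-validation and the Poisson model fitted with an established statistics package. The sketch only shows how the selection stage feeds the count model.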




Updated: 2021-02-28