Discovering latent activity patterns from transit smart card data: A spatiotemporal topic model
Transportation Research Part C: Emerging Technologies (IF 7.6), Pub Date: 2020-05-17, DOI: 10.1016/j.trc.2020.102627
Zhan Zhao, Haris N. Koutsopoulos, Jinhua Zhao

Although automatically collected human travel records can accurately capture the time and location of human movements, they do not directly explain the hidden semantic structures behind the data, e.g., activity types. This work proposes a probabilistic topic model, adapted from Latent Dirichlet Allocation (LDA), to discover representative and interpretable activity categories from individual-level spatiotemporal data in an unsupervised manner. Specifically, the activity-travel episodes of an individual user are treated as words in a document, and each topic is a distribution over space and time that corresponds to a certain type of activity. The model accounts for a mixture of discrete and continuous attributes: the location, start time of day, start day of week, and duration of each activity episode. The proposed methodology is demonstrated using pseudonymized transit smart card data from London, U.K. The results show that the model can successfully distinguish the three most basic types of activities: home, work, and other. As the specified number of activity categories increases, more specific subpatterns for home and work emerge, and both the goodness of fit and the predictive performance for travel behavior improve. This work makes it possible to enrich human mobility data with representative and interpretable activity patterns without relying on predefined activity categories or heuristic rules.
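To make the document/word analogy concrete, the following is a minimal sketch of an LDA-style generative process for activity episodes. It is an illustration only, not the authors' model: the abstract does not state which distributions are used, so the categorical location and day-of-week components, the Gaussian start time, and the log-normal duration are all assumptions, as are the function name `sample_user_episodes`, the dictionary keys, and the toy parameter values.

```python
import numpy as np


def sample_user_episodes(n_episodes, alpha, topics, rng):
    """Draw one user's activity episodes from an LDA-style generative process.

    Each user plays the role of a "document" and each episode the role of a "word".
    All distribution families below are assumptions made for illustration.
    """
    n_topics = len(topics)
    # Per-user mixture over activity topics (the document-topic distribution in LDA).
    theta = rng.dirichlet(np.full(n_topics, alpha))
    episodes = []
    for _ in range(n_episodes):
        # Pick the latent activity type for this episode.
        z = rng.choice(n_topics, p=theta)
        t = topics[z]
        episodes.append({
            "topic": int(z),
            # Discrete attributes: categorical draws (assumed).
            "location": int(rng.choice(len(t["loc_probs"]), p=t["loc_probs"])),
            "day_of_week": int(rng.choice(7, p=t["dow_probs"])),
            # Continuous attributes: Gaussian start time wrapped to 24 h and
            # log-normal duration (both assumed).
            "start_hour": float(rng.normal(t["start_mean"], t["start_std"]) % 24),
            "duration_hours": float(rng.lognormal(t["log_dur_mean"], t["log_dur_std"])),
        })
    return theta, episodes


# Hypothetical toy topics loosely resembling "home" and "work"; values are invented.
toy_topics = [
    {"loc_probs": [0.9, 0.1], "dow_probs": np.full(7, 1 / 7),
     "start_mean": 19.0, "start_std": 1.5, "log_dur_mean": np.log(12.0), "log_dur_std": 0.3},
    {"loc_probs": [0.1, 0.9], "dow_probs": [0.2, 0.2, 0.2, 0.2, 0.2, 0.0, 0.0],
     "start_mean": 9.0, "start_std": 1.0, "log_dur_mean": np.log(8.5), "log_dur_std": 0.2},
]

rng = np.random.default_rng(0)
theta, episodes = sample_user_episodes(n_episodes=20, alpha=0.5, topics=toy_topics, rng=rng)
```

Fitting such a model would run this process in reverse, inferring the topic parameters and each user's `theta` from observed episodes. The sketch covers only the sampling direction and does not attempt the inference step.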




Updated: 2020-05-17