Scalable low-rank tensor learning for spatiotemporal traffic data imputation
Transportation Research Part C: Emerging Technologies ( IF 8.3 ) Pub Date : 2021-06-09 , DOI: 10.1016/j.trc.2021.103226
Xinyu Chen , Yixian Chen , Nicolas Saunier , Lijun Sun

The missing value problem in spatiotemporal traffic data has long been a challenging topic, in particular for large-scale and high-dimensional data with complex missing mechanisms and diverse degrees of missingness. Recent studies based on the tensor nuclear norm have demonstrated the superiority of tensor learning in imputation tasks by effectively characterizing the complex correlations/dependencies in spatiotemporal data. However, despite the promising results, these approaches do not scale well to large data tensors. In this paper, we focus on addressing the missing data imputation problem for large-scale spatiotemporal traffic data. To achieve both high accuracy and efficiency, we develop a scalable tensor learning model, Low-Tubal-Rank Smoothing Tensor Completion (LSTC-Tubal), based on the existing framework of Low-Rank Tensor Completion, which is well-suited for spatiotemporal traffic data that is characterized by a multidimensional structure of location × time of day × day. In particular, the proposed LSTC-Tubal model involves a scalable tensor nuclear norm minimization scheme that integrates a linear unitary transformation. As a result, tensor nuclear norm minimization can be solved by singular value thresholding on the transformed matrix of each day, while the day-to-day correlation is effectively preserved by the unitary transform matrix. For the experiments, we consider several real-world data sets, including two large-scale 5-min traffic speed data sets collected by the California PeMS system with 11160 sensors: 1) PeMS-4W covers the data over 4 weeks (i.e., 288×28 time points), and 2) PeMS-8W covers the data over 8 weeks (i.e., 288×56 time points). We compare LSTC-Tubal with some state-of-the-art baseline models, and find that LSTC-Tubal can achieve competitive accuracy with a significantly lower computational cost. In addition, LSTC-Tubal will also benefit other tasks in modeling large-scale spatiotemporal traffic data, such as network-level traffic forecasting.
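
The core step described in the abstract (singular value thresholding on each day's matrix after a unitary transform along the day dimension) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fold_days`, `day_mode_basis`, `svt`, and `tubal_svt` are hypothetical, the threshold `tau` is an arbitrary parameter, and the choice of building the unitary matrix from the singular vectors of the day-mode unfolding is an assumption, since the abstract does not say how the transform is obtained.

```python
import numpy as np


def fold_days(matrix, steps_per_day):
    """Fold a (locations, steps_per_day * days) matrix into a
    (locations, steps_per_day, days) tensor."""
    locations, total_steps = matrix.shape
    days = total_steps // steps_per_day
    return matrix.reshape(locations, days, steps_per_day).transpose(0, 2, 1)


def day_mode_basis(tensor):
    """ASSUMPTION: take the unitary matrix from the left singular vectors
    of the day-mode unfolding; the abstract does not specify this choice."""
    days = tensor.shape[2]
    unfolded = np.moveaxis(tensor, 2, 0).reshape(days, -1)
    u, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u.T  # (days, days) when days <= locations * steps_per_day


def svt(mat, tau):
    """Singular value thresholding: shrink each singular value by tau."""
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return (u * s) @ vt


def unitary_transform(tensor, phi):
    """Multiply the tensor by phi along its third (day) mode."""
    return np.einsum('ijk,lk->ijl', tensor, phi)


def tubal_svt(tensor, phi, tau):
    """Threshold every slice in the transformed domain, then transform back."""
    transformed = unitary_transform(tensor, phi)
    for k in range(transformed.shape[2]):
        transformed[:, :, k] = svt(transformed[:, :, k], tau)
    # phi is real and orthonormal, so its inverse is its transpose.
    return unitary_transform(transformed, phi.T)


# Toy data (not PeMS): 5 sensors, 12 steps per day, 4 days.
rng = np.random.default_rng(0)
speeds = rng.random((5, 12 * 4))
tensor = fold_days(speeds, steps_per_day=12)
phi = day_mode_basis(tensor)
smoothed = tubal_svt(tensor, phi, tau=0.5)
print(smoothed.shape)
```

Each call to `svt` works on a locations × time-of-day matrix rather than on one large unfolded matrix, which is where the scalability claim in the abstract plausibly comes from. A full imputation routine would embed this step in an iterative scheme that also keeps the observed entries fixed; the abstract does not describe that loop, so it is left out here.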




Updated: 2021-06-10