Real-time spatiotemporal prediction and imputation of traffic status based on LSTM and Graph Laplacian regularized matrix factorization

https://doi.org/10.1016/j.trc.2021.103228

Highlights

  • This paper addresses online prediction and imputation with missing data.

  • We develop an LSTM-GL-ReMF model that incorporates spatial and temporal regularizers.

  • We extend the model to a tensor version, LSTM-GL-ReTF, using the CP decomposition method.

Abstract

Accurate real-time prediction of traffic status is critical for advanced traffic management and travel navigation guidance. Many studies have attempted to predict short-term traffic flows using deep learning algorithms, but most existing prediction models are tested only on spatiotemporal data that are assumed to have no missing entries. This ideal situation rarely exists in the real world because of sensor or network transmission failures, so missing data is a non-negligible problem. Previous studies either remove time series with missing entries or impute missing data before building prediction models. The former may leave insufficient data for model training, while the latter adds computational burden and makes prediction performance directly dependent on imputation accuracy. In this study, we propose an online framework that makes spatiotemporal predictions from raw incomplete data and imputes missing values at the same time. We design a novel spatially and temporally regularized matrix factorization model, namely LSTM-GL-ReMF, as the key component of the framework. The Long Short-Term Memory (LSTM) model serves as the temporal regularizer to capture temporal dependency in time series data, and the Graph Laplacian (GL) serves as the spatial regularizer to exploit spatial correlations among network sensors, enhancing prediction and imputation performance. The proposed framework, integrated with the LSTM-GL-ReMF model, is tested and compared with other state-of-the-art matrix factorization and deep learning models on three uni-variate and multi-variate spatiotemporal traffic datasets. The experimental results show that our approach achieves robust and accurate prediction and imputation under various data missing scenarios.

Introduction

Spatiotemporal data are collections of spatially correlated time series, such as traffic volume and speed data collected from multiple sensors on a road network. In the era of big data, the dimension and scale of spatiotemporal data are expanding rapidly as more sensor types are deployed in larger quantities. These large-scale data can provide rich information for decision support in building smart cities. For example, online spatiotemporal traffic status prediction is critical in many intelligent transportation system (ITS) applications such as real-time adaptive traffic signal control systems (Mirchandani and Head, 2001), vehicle navigation systems (Lee et al., 2006) and predictive bus control frameworks (Andres and Nair, 2017). As another example, spatiotemporal prediction of air quality can support environmental policy making (Paltsev et al., 2005). Hence, this topic has attracted wide attention in the past decade (Tong et al., 2017, Zhang et al., 2019, Che et al., 2018, Lin et al., 2018, Cui et al., 2019, Lin et al., 2015).

Time series prediction models, such as the Autoregressive (AR) model (Yule, 1926) and the Autoregressive Integrated Moving Average (ARIMA) model (Stellwagen and Tashman, 2013, Lin et al., 2013), mainly consider temporal correlations within individual time series. Deep learning models that can learn and preserve long-term temporal correlations, such as Long Short-Term Memory (LSTM) (Hochreiter and Schmidhuber, 1997) and the Gated Recurrent Unit (GRU) (Chung et al., 2014), have also been applied to time series prediction. Beyond temporal dependency, spatiotemporal data also exhibit spatial correlations. For example, data collected by nearby loop sensors tend to have similar temporal characteristics. Hence, spatiotemporal prediction models aim to exploit correlations among sensors in both space and time to improve model performance. As an emerging research area, graph convolutional networks (GCN), which capture the spatial correlations among sensors through an adjacency matrix, have shown promising results in spatiotemporal traffic state prediction (Lin et al., 2018, Cui et al., 2019).

Most existing studies work on spatiotemporal data that are assumed to have no missing entries (Chung et al., 2014, Yule, 1926, Stellwagen and Tashman, 2013, Lin et al., 2018, Cui et al., 2019). In reality, however, factors such as unstable power grids, sensor failures, data transmission network failures and periodic equipment overhauls mean that spatiotemporal data often contain missing values. For example, the freeway Performance Measurement System (PeMS) maintained by the California Department of Transportation (Caltrans) has a missing sample rate of about 15% (Chen et al., 2003), and the Los Angeles County freeway speed dataset METR-LA has a missing rate of about 8.11% (Cui et al., 2020). There are two basic data missing patterns: point-wise missing (PM), where data are randomly lost at discrete time slots, and continuous missing (CM), where data are lost for a continuous period. Fig. 1 illustrates the METR-LA speed data collected by a loop sensor on March 3rd. Point-wise missing entries are marked with hollow dots, and continuous missing entries are marked with yellow panels.
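The two missing patterns can be simulated on a sensor-by-time matrix as follows. This is a minimal sketch for illustration only: the function names, the 20% missing rate, the gap length and the 288 time slots per day (5-minute intervals) are assumptions, not settings taken from the paper.

```python
import numpy as np

def pointwise_missing_mask(shape, rate, rng):
    """Boolean mask (True = observed); each entry is dropped independently (PM)."""
    return rng.random(shape) >= rate

def continuous_missing_mask(shape, n_gaps, gap_len, rng):
    """Boolean mask (True = observed) with contiguous gaps in each sensor row (CM)."""
    n_sensors, n_steps = shape
    mask = np.ones(shape, dtype=bool)
    for i in range(n_sensors):
        # Random start positions; gaps may overlap, so fewer entries can be lost.
        starts = rng.integers(0, n_steps - gap_len + 1, size=n_gaps)
        for s in starts:
            mask[i, s:s + gap_len] = False
    return mask

rng = np.random.default_rng(0)
shape = (4, 288)  # hypothetical: 4 sensors, one day of 5-minute slots
pm = pointwise_missing_mask(shape, rate=0.2, rng=rng)
cm = continuous_missing_mask(shape, n_gaps=2, gap_len=12, rng=rng)

print("PM missing rate:", 1 - pm.mean())
print("CM missing entries per sensor:", (~cm).sum(axis=1))
```

Applying a mask is then a matter of `np.where(mask, data, np.nan)`, which leaves observed readings untouched and marks the rest as missing.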

There are several approaches to overcome the data missing issue in spatiotemporal prediction: omitting the missing entries; imputing data before building prediction models (Hu et al., 2017); forward-filling, which uses predicted values or last observations to fill in missing entries while making predictions (Cui et al., 2020); Bayesian modeling, which treats missing data as random variables and infers them from their conditional distributions (Sun and Chen, 2019, Ma and Chen, 2018); and dynamic factor models such as state-space models (Zhang et al., 2014) and matrix factorization models (Yu et al., 2016). Removing missing values may leave insufficient data for model training and cause overfitting. Except for a few models, such as those based on Bayesian and probabilistic methods, imputing data before model building introduces extra computational burden for most deep learning models. Further, a data imputation model with low accuracy will also jeopardize the spatiotemporal prediction accuracy. Filling in missing entries with previously predicted values can lead to severe error accumulation, especially when data are lost for a long continuous period. In addition, the MCMC sampling used in Bayesian approaches does not run quickly on large datasets and may not converge easily for complex models (Ma and Chen, 2018).
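To make the forward-fill idea concrete, the sketch below implements only the last-observation variant on a single series. It is an illustration under assumed toy values, not the procedure of any cited paper; `forward_fill` and the `speeds` array are hypothetical names.

```python
import numpy as np

def forward_fill(series):
    """Fill NaNs with the last observed value; leading NaNs stay NaN."""
    series = np.asarray(series, dtype=float)
    observed = ~np.isnan(series)
    # Index of the most recent observed entry at or before each position.
    idx = np.where(observed, np.arange(series.size), 0)
    idx = np.maximum.accumulate(idx)
    filled = series[idx]
    # Positions before the first observation have nothing to copy from.
    filled[np.cumsum(observed) == 0] = np.nan
    return filled

# Toy speed series with a three-slot continuous gap and one point-wise gap.
speeds = np.array([60.0, np.nan, np.nan, np.nan, 30.0, np.nan, 32.0])
filled = forward_fill(speeds)
print(filled)
```

In this toy series the whole three-slot gap is filled with 60.0 although the next observation is 30.0, which hints at why long continuous gaps are the difficult case for this strategy.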

Matrix factorization (MF) models have been applied to imputation (Salakhutdinov and Mnih, 2008, Lee and Seung, 2001) and prediction (Yu et al., 2016, Gultekin and Paisley, 2019, Sun and Chen, 2019) for partially observed spatiotemporal data. MF models can capture global spatial correlations among sensors in the factorization process, as well as temporal correlations through a linear autoregressive (AR) regularizer (Yu et al., 2016, Gultekin and Paisley, 2019, Sun and Chen, 2019). In this study, we propose an MF-based framework that makes spatiotemporal predictions from raw incomplete data and performs online data imputation as data are collected in real time. We design a spatially and temporally regularized matrix factorization model, namely LSTM-GL-ReMF, as the key component of the framework. In LSTM-GL-ReMF, the temporal regularizer is based on the Long Short-Term Memory (LSTM) model (Hochreiter and Schmidhuber, 1997), and the spatial regularizer is based on Graph Laplacian (GL) spatial regularization (Cai et al., 2010). These regularizers incorporate complex spatial and temporal dependence into the matrix factorization process for more accurate online prediction and imputation.
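As a rough illustration of the spatial half of this idea, the sketch below fits a masked matrix factorization with a graph Laplacian penalty tr(FᵀLF) on the sensor factors, where L = D − A is built from an adjacency matrix. It makes several simplifying assumptions: plain gradient descent replaces the authors' alternating solver, the LSTM temporal regularizer is omitted entirely, and all names and hyperparameters (`gl_regularized_mf`, `lam_s`, `lam`, `lr`) are hypothetical.

```python
import numpy as np

def graph_laplacian(adj):
    """Unnormalized graph Laplacian L = D - A for a symmetric adjacency matrix."""
    return np.diag(adj.sum(axis=1)) - adj

def gl_regularized_mf(Y, mask, adj, rank=2, lam_s=0.1, lam=0.01,
                      lr=0.01, n_iter=2000, seed=0):
    """Minimize 0.5*||mask*(Y - F X)||^2 + 0.5*lam_s*tr(F^T L F)
    + 0.5*lam*(||F||^2 + ||X||^2) by gradient descent (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, t = Y.shape
    F = 0.1 * rng.standard_normal((n, rank))   # sensor (spatial) factors
    X = 0.1 * rng.standard_normal((rank, t))   # temporal factors
    L = graph_laplacian(adj)
    Y0 = np.where(mask, Y, 0.0)                # missing entries never enter the loss
    for _ in range(n_iter):
        R = mask * (Y0 - F @ X)                # residual on observed entries only
        grad_F = -R @ X.T + lam_s * (L @ F) + lam * F
        grad_X = -F.T @ R + lam * X
        F -= lr * grad_F
        X -= lr * grad_X
    return F, X

# Toy example: 4 sensors on a chain road (0-1-2-3), 30 time steps.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(1)
signal = 1.0 + 0.5 * np.sin(np.linspace(0, 4 * np.pi, 30))
Y = np.outer([1.0, 1.1, 1.2, 1.3], signal)
mask = rng.random(Y.shape) > 0.2               # True = observed
F, X = gl_regularized_mf(Y, mask, adj)
Y_hat = F @ X                                  # dense estimate, including missing entries
print("mean abs error on observed entries:", np.abs((Y_hat - Y)[mask]).mean())
```

The Laplacian term penalizes differences between the factor rows of adjacent sensors, which is one way to encode the local spatial smoothness that the text attributes to the GL regularizer.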

The proposed framework is tested on two spatiotemporal traffic datasets and a high-dimensional spatiotemporal air pollutant dataset, namely the Seattle Traffic Speed dataset, the METR-LA Freeway Speed dataset and the Shanghai Pollutant Concentration dataset. Experiments demonstrate that the proposed LSTM-GL-ReMF framework outperforms seven benchmark models in both prediction and imputation accuracy under various data missing scenarios. The benchmarks are the MF models TRMF (Yu et al., 2016) and BTMF (Sun and Chen, 2019), and the state-of-the-art deep learning models LSTM (Hochreiter and Schmidhuber, 1997), GRU-D (Che et al., 2018), GCN-DDGF (Lin et al., 2018), TGC-LSTM (Cui et al., 2019) and SGMN (Cui et al., 2020). The main contributions of this paper are summarized as follows:

  • We propose a novel LSTM-GL-ReMF model that captures nonlinear temporal dynamics and ensures the local spatial smoothness of spatiotemporal data, and then extend it to a multi-variate version using the tensor CP decomposition method.

  • We propose an effective and efficient alternating method to solve the LSTM and Graph Laplacian regularized matrix factorization problem.

  • We extensively compare LSTM-GL-ReMF with other state-of-the-art MF and deep learning models on three traffic datasets under various data missing scenarios, and show our model has robust and accurate performance.
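For readers unfamiliar with the CP decomposition mentioned in the first contribution, the sketch below shows how a three-way tensor is rebuilt from three factor matrices, one per mode. The mode ordering (sensor × variable × time), the sizes and the rank are assumptions chosen for illustration; the paper's tensor model is not reproduced here.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor from CP factors:
    X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

rng = np.random.default_rng(0)
rank = 3
A = rng.standard_normal((5, rank))    # hypothetical: 5 sensors
B = rng.standard_normal((2, rank))    # hypothetical: 2 variables, e.g. speed and flow
C = rng.standard_normal((24, rank))   # hypothetical: 24 time steps
X = cp_reconstruct(A, B, C)
print(X.shape)
```

With a single variable (B of shape 1 × rank) the reconstruction reduces to an ordinary matrix factorization whose rank components are scaled by the entries of B, which is why the tensor model can be seen as a multi-variate extension of the matrix model.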

The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 introduces the proposed LSTM-GL-ReMF model and the online prediction and imputation framework. Section 4 presents the experimental results on the spatiotemporal datasets under various data missing scenarios. Section 5 summarizes the study and discusses future research directions.

Section snippets

Related work

This section first introduces relevant work in spatiotemporal prediction, then discusses previous prediction models based on incomplete data. Finally, we focus on matrix factorization studies that have been applied to both data imputation and prediction.

Methodology

In this section, we first briefly introduce the Temporal Regularized Matrix Factorization (TRMF) model (Yu et al., 2016), which serves as a basis for understanding our methodology. We then introduce the proposed LSTM and Graph Laplacian Regularized Matrix Factorization (LSTM-GL-ReMF) model and the online spatiotemporal data prediction and imputation framework.
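For orientation, the TRMF objective is commonly written in the following form. The notation here is an assumption and may differ from the paper's own: $\Omega$ is the set of observed entries, $f_i$ is the factor of sensor $i$, $x_t$ is the temporal factor at time $t$, $\mathcal{L}$ is the lag set with $m = 1 + \max \mathcal{L}$, and $W^{(l)}$ are diagonal lag weight matrices.

```latex
\[
\min_{F,\,X,\,\mathcal{W}} \;
\sum_{(i,t)\in\Omega} \bigl(y_{it} - f_i^{\top} x_t\bigr)^2
+ \lambda_f\, \mathcal{R}_f(F)
+ \lambda_x\, \mathcal{T}_{\mathrm{AR}}(X \mid \mathcal{L}, \mathcal{W}, \eta)
+ \lambda_w\, \mathcal{R}_w(\mathcal{W})
\]
\[
\mathcal{T}_{\mathrm{AR}}(X \mid \mathcal{L}, \mathcal{W}, \eta)
= \frac{1}{2} \sum_{t=m}^{T}
  \Bigl\| x_t - \sum_{l\in\mathcal{L}} W^{(l)} x_{t-l} \Bigr\|_2^2
+ \frac{\eta}{2} \sum_{t=1}^{T} \| x_t \|_2^2
\]
```

The sum over $\Omega$ is what lets the model train on incomplete data, since missing entries simply do not appear in the loss. Going by the description in the introduction, LSTM-GL-ReMF can be read as replacing the linear AR term with an LSTM-based temporal regularizer and using a graph Laplacian term as the spatial regularizer; the exact formulation is given in the full text.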

Experiments

To demonstrate the performance of the proposed framework and the LSTM-GL-ReMF model, we conduct online prediction and imputation experiments on three spatiotemporal traffic datasets, namely the Seattle Traffic Speed dataset, the METR-LA Freeway Speed dataset (Jagadish et al., 2014) and the multi-variate traffic dataset PeMSD8 (Guo et al., 2019). The source code and datasets of our experiments can be found at https://github.com/Vadermit/TransPAI.

Conclusion and future directions

In this paper, we propose an online prediction and imputation framework that implements a novel LSTM and Graph Laplacian regularized matrix factorization model (LSTM-GL-ReMF). The proposed framework can make spatiotemporal predictions based on incomplete observation data and impute missing values at the same time. Our LSTM-GL-ReMF model preserves both the spatial smoothness and the non-linear temporal dynamics of the data to enhance prediction and imputation performance.

Through numerical experiments on

CRediT authorship contribution statement

Jin-Ming Yang: Conception and design of study, Acquisition of data, Analysis and/or interpretation of data, Writing - original draft, Writing - review & editing. Zhong-Ren Peng: Conception and design of study, Acquisition of data, Writing - original draft. Lei Lin: Conception and design of study, Analysis and/or interpretation of data, Writing - original draft.

Acknowledgment

This study was partially supported by the National Planning Office of Philosophy and Social Science, China (No. 16ZDA048).

All authors approved the version of the manuscript to be published.

References (50)

  • Zhang W. et al. Short-term traffic flow prediction based on spatio-temporal analysis and CNN deep learning. Transp. Transp. Sci. (2019)
  • Anava, O., Hazan, E., Zeevi, A., 2015. Online time series prediction with missing data. In: International Conference on...
  • Cai D. et al. Graph regularized nonnegative matrix factorization for data representation. IEEE Trans. Pattern Anal. Mach. Intell. (2010)
  • Che Z. et al. Recurrent neural networks for multivariate time series with missing values. Sci. Rep. (2018)
  • Chen X. et al. Missing traffic data imputation and pattern discovery with a Bayesian augmented tensor factorization model. Transp. Res. C (2019)
  • Chen C. et al. Detecting errors and imputing missing data for single-loop surveillance systems. Transp. Res. Rec. (2003)
  • Chung J. et al. Empirical evaluation of gated recurrent neural networks on sequence modeling. (2014)
  • Crookston N.L. et al. YaImpute: an R package for kNN imputation. J. Stat. Softw. (2008)
  • Cui Z. et al. Traffic graph convolutional recurrent neural network: A deep learning framework for network-scale traffic learning and forecasting. IEEE Trans. Intell. Transp. Syst. (2019)
  • Geng, X., Li, Y., Wang, L., Zhang, L., Yang, Q., Ye, J., Liu, Y., 2019. Spatiotemporal multi-graph convolution network...
  • Gultekin S. et al. Online forecasting matrix factorization. IEEE Trans. Signal Process. (2019)
  • Guo, S., Lin, Y., Feng, N., Song, C., Wan, H., 2019. Attention based spatial-temporal graph convolutional networks for...
  • He W. et al. Total-variation-regularized low-rank matrix factorization for hyperspectral image restoration. IEEE Trans. Geosci. Remote Sens. (2015)
  • Hitchcock F.L. The expression of a tensor or a polyadic as a sum of products. J. Math. Phys. (1927)
  • Hochreiter S. et al. Long short-term memory. Neural Comput. (1997)

This article belongs to the Virtual Special Issue on IG005572: VSI:Machine learning.