Incorporating travel behavior regularity into passenger flow forecasting
Introduction
Recent years have witnessed the rapid development of metro systems and the continued growth of metro ridership worldwide (UITP, 2018). As an efficient and high-capacity transportation mode, the metro is playing an increasingly important role in shaping future sustainable transportation. Given the growing importance of metro systems, it is critical to have a good understanding of passenger demand patterns to support service operation. A key task is to produce accurate, real-time forecasts of passenger demand/ridership, which play a vital role in a wide range of applications, including service scheduling, crowd management, and disruption response, to name but a few.
Short-term passenger flow forecasting typically focuses on forecasting the passenger flow in the next few minutes to several hours, and has been extensively studied in public transportation research. Most existing studies formulate passenger flow data as time series and follow methods similar to those applied in traffic flow forecasting. For example, statistical time series models have been widely applied to ridership forecasting problems, including auto-regressive integrated moving average (ARIMA) (Williams and Hoel, 2003, Ding et al., 2017, Chen et al., 2020a), exponential smoothing (Tan et al., 2009), and state-space/Kalman filter (Stathopoulos and Karlaftis, 2003, Jiao et al., 2016). Most of these classical time series models are linear by nature; to better characterize the non-linearity in time series data, non-linear versions or ensemble extensions of these models have also been studied (e.g., Jiao et al., 2016, Carrese et al., 2017). More recent research treats forecasting as a supervised machine learning problem. Along this line, some representative supervised learning models have been applied, such as support vector machine (SVM) (Chen et al., 2011, Sun et al., 2015), artificial neural network (ANN) (Vlahogianni et al., 2005, Tsai et al., 2009, Li et al., 2017), random forest (Toqué et al., 2017), and recurrent neural network (RNN)/long short-term memory (LSTM) as emerging deep learning approaches (Hao et al., 2019, Liu et al., 2019). The aforementioned research mainly focuses on modeling a univariate time series for a single metro station. However, the metro system is a network in which stations exhibit strong spatial and temporal correlations/dependencies. To extend the univariate analysis to network-wide passenger flow forecasting, some state-of-the-art models have been proposed to better characterize the complex spatiotemporal patterns and dynamics. For example, Gong et al. (2020) proposed matrix factorization models to estimate passenger flow data for each origin–destination (OD) pair; Li et al. (2019) introduced a local smoothness prior based on auxiliary information (e.g., flow correlation, network topology, and POI composition) into tensor completion models to forecast passenger flow; Chen et al. (2020b) developed graph convolutional network (GCN) models to capture the complex spatiotemporal dependencies in a metro network. These new machine learning-based models have shown superior performance over traditional time series models, and they are more effective in capturing the complex patterns by incorporating domain knowledge and external features such as weather, events, time of day, and day of week.
In all the studies mentioned above, passenger flow data is generally modeled as an aggregated count time series obtained by counting the number of unique card IDs in smart card transactions. Despite the simplicity and effectiveness of these models, we would argue that the most important characteristic of passenger flow is overlooked due to the aggregation: passenger flow consists of the movement of individuals with strong regularity rooted in their travel behavior. For instance, if a passenger alights at a metro station for work in the morning, he/she will probably depart from the same station when he/she goes home in the evening. If he/she does not travel in the morning, it becomes less likely that we will observe a corresponding return trip. This example shows that past trips should be utilized to predict future demand, and that individual travel behavior can induce causal structure and long-range dependencies in passenger flow time series data. Some recent studies have shown that travel behavior plays a substantial role in dynamic traffic assignment (Cantelmo and Viti, 2019) and online demand estimation (Cantelmo et al., 2020). This effect is particularly pronounced in metro systems, where passengers’ travel patterns are highly regular (Sun et al., 2013, Goulet-Langlois et al., 2017, Zhao et al., 2018b). Therefore, when developing a passenger flow forecasting model, it is essential to integrate these behavior-driven, long-range dependencies in addition to the local input (e.g., the past n steps in the time series).
The goal of this study is to explore the potential of incorporating an additional travel behavior component into the forecasting of passenger flow time series. Specifically, we propose a new scheme to forecast boarding/incoming passenger demand at a station by integrating historical alighting time series at the same station. We define returning passengers as those who finish their first trip at station s and also start their second trip at the same station. In other words, returning passengers refer to the individuals who stay at station s to perform an activity (e.g., home and work). In general, these return trips are not random and often exhibit strong regularity due to the activities performed. This motivates us to forecast the incoming/boarding demand from these “returning passengers” using the information on their previous trips. To achieve this, we introduce a new concept of return probability parallelogram (RPP) to better estimate returning flow, and we find that the estimated returning flow highly correlates with the overall boarding demand in a real-world data set. To further quantify the benefits of incorporating this returning flow measure, we evaluate the proposed models for one-step ahead forecasting, multi-step ahead forecasting, and forecasting under special events. Our results show that incorporating returning flow as an additional variable will consistently improve the accuracy of forecasting.
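The definition of returning passengers above can be illustrated with a minimal Python sketch. It assumes a hypothetical record layout of `(card_id, board_station, board_slot, alight_station, alight_slot)` covering a single day; the function name and the field layout are illustrative assumptions, not the data format used in the study.

```python
from collections import Counter, defaultdict


def count_returning_boardings(trips, station):
    """Count returning-passenger boardings at `station` per boarding time slot.

    trips: iterable of (card_id, board_station, board_slot,
                        alight_station, alight_slot) tuples for one day
           (a hypothetical layout assumed for this sketch).
    A passenger is "returning" at `station` when the first trip ends there
    and the second trip starts there.
    """
    # Group the trip records of each card.
    by_card = defaultdict(list)
    for card_id, b_station, b_slot, a_station, a_slot in trips:
        by_card[card_id].append((b_slot, b_station, a_station, a_slot))

    counts = Counter()
    for legs in by_card.values():
        if len(legs) < 2:
            continue  # a single trip cannot contain a return
        legs.sort()  # chronological order by boarding slot
        first, second = legs[0], legs[1]
        first_alight_station = first[2]
        second_board_slot, second_board_station = second[0], second[1]
        if first_alight_station == station and second_board_station == station:
            counts[second_board_slot] += 1
    return counts


# Toy records: passengers A and B return at s2, C boards elsewhere, D travels once.
toy_trips = [
    ("A", "s1", 8, "s2", 9), ("A", "s2", 18, "s1", 19),
    ("B", "s3", 7, "s2", 8), ("B", "s2", 18, "s3", 19),
    ("C", "s1", 8, "s2", 9), ("C", "s4", 17, "s1", 18),
    ("D", "s1", 9, "s2", 10),
]
returning = count_returning_boardings(toy_trips, "s2")
```

Counting in this way needs card-level records. The framework described in this paper instead estimates returning flow at an aggregated level, so the sketch only serves to make the definition concrete.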
The idea of leveraging trip-level information has been introduced and examined in some recent studies, which predict the alighting flow of a station using the recent boarding flow from other related stations (see e.g., Li et al., 2017, Hao et al., 2019, Liu et al., 2019). However, the large number of boarding-alighting station pairs makes it difficult to learn an informative model at a trip level, and eventually these studies develop deep neural networks to learn the correlation from the aggregated count data in a purely data-driven approach. Our model, instead, uses the alighting of “this trip” to predict the boarding of the “next trip”, where the alighting and the boarding stations are usually the same (Barry et al., 2002, Trépanier et al., 2007). We examine this idea on a boarding flow forecasting application, which is more important to service operation and planning. The “returning flow” proposed in this paper is solely based on the intrinsic travel regularity of travelers and it does not require external information/knowledge. Our work is closely related to Zhao et al. (2018b), which proposes a probabilistic model to predict the next trip for an individual based on his/her trip history. However, instead of predicting individual trips, our primary goal is to forecast the overall passenger flow to support decision making in service operation. In doing so, we estimate the returning flow in an aggregated approach; therefore, the framework does not require individual-based data sets that are confidential and sensitive for privacy reasons. The main contributions of this work are summarized as follows.
- We define returning flow to characterize the causal structure and long-range dependencies in passenger flow data, which are essentially overlooked in previous time series-based studies.
- We integrate returning flow as an additional covariate into standard time series models, and the proposed behavior-integrated model shows consistently improved performance in our case studies based on a real-world data set.
- Our model also provides a new approach to forecast passenger flows under special events.
To the best of our knowledge, this is the first research that incorporates a travel behavior component into the longstanding passenger flow forecasting problem. The remainder of the paper is organized as follows. Section 2 introduces the concept of returning flow and return probability parallelogram as the tool to integrate travel behavior regularity into the passenger flow forecasting framework. In Section 3, we develop case studies based on real-world smart card data and demonstrate the effectiveness of the proposed models in different scenarios. Finally, Section 4 concludes our research and discusses future work.
Methodology
In this section, we introduce returning flow and the return probability parallelogram as two fundamental building blocks in the behavior-based boarding flow forecasting framework. The proposed forecasting models are constructed by integrating returning flow as a new feature/covariate into traditional time series forecasting models. We start with a brief description of the passenger flow forecasting problem.
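As a rough illustration of how an expected returning flow could be derived from alighting counts, the Python sketch below weights each alighting count by an assumed probability of boarding again at the same station after a given delay. The table `return_prob` is a hypothetical stand-in for the return probability parallelogram, whose exact construction is not given in this excerpt, and the function name is likewise an assumption.

```python
def expected_returning_flow(alightings, return_prob):
    """Expected returning boardings per time slot at one station.

    alightings:  list where alightings[a] is the number of passengers
                 alighting at the station in slot a.
    return_prob: dict mapping alighting slot a to {delay d: probability},
                 the assumed chance that a passenger who alighted in slot a
                 boards again at the same station d slots later (d >= 1).
                 This table is hypothetical, not the paper's estimate.
    Returns a list r with r[t] = sum over a of
    alightings[a] * return_prob[a][t - a].
    """
    n = len(alightings)
    flow = [0.0] * n
    for a, count in enumerate(alightings):
        for delay, prob in return_prob.get(a, {}).items():
            t = a + delay
            # Keep only returns that fall strictly later and inside the horizon.
            if delay >= 1 and t < n:
                flow[t] += count * prob
    return flow


# Toy example: 100 alightings in slot 0 and 50 in slot 1.
toy_alightings = [100, 50, 0, 0, 0]
toy_return_prob = {
    0: {2: 0.5, 3: 0.25},  # half return after 2 slots, a quarter after 3
    1: {2: 0.25},          # a quarter return after 2 slots
}
toy_flow = expected_returning_flow(toy_alightings, toy_return_prob)
```

The resulting series has one value per time slot, so it can be passed to a time series model as an extra covariate next to the boarding counts.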
Experiments
In this section, we conduct numerical experiments to evaluate the effectiveness of the proposed behavior-integrated models. We choose the standard SARIMA model as the core model for time series forecasting (M0). On top of this model, we create two regression-with-SARIMA-error models, M1 and M2, by incorporating the observed and the estimated returning flow as additional covariates, respectively. We evaluate the performance of these models in three scenarios: 1) one-step ahead forecasting, 2)
Conclusions and discussion
In this paper, we propose a new framework for forecasting passenger flow time series in metro systems. In contrast to some previous studies that capture temporal dynamics in a data-driven way, we try to incorporate the generative mechanisms rooted in travel behavior into modeling passenger flow time series. For that purpose, we introduce returning flow as a new covariate/feature into standard time series models. This returning flow is estimated as the expected returning boarding demand given
Acknowledgement
This research is supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada, Mitacs, exo.quebec (https://exo.quebec/en), NSFC-FRQSC Research Program on Smart Cities and Big Data, the Institute for Data Valorisation (IVADO), and the Canada Foundation for Innovation (CFI).
References (36)
- et al. Activity-based disaggregate travel demand model system with activity schedules. Transportation Research Part A: Policy and Practice (2001).
- et al. Incorporating trip chaining within online demand estimation. Transportation Research Part B: Methodological (2020).
- et al. Incorporating activity duration and scheduling utility into equilibrium-based dynamic traffic assignment. Transportation Research Part B: Methodological (2019).
- et al. Dynamic demand estimation and prediction for traffic urban networks adopting new data sources. Transportation Research Part C: Emerging Technologies (2017).
- et al. Sequence to sequence learning with attention mechanism for short-term passenger flow prediction in large-scale metro system. Transportation Research Part C: Emerging Technologies (2019).
- et al. Forecasting short-term subway passenger flow under special events scenarios using multiscale radial basis function networks. Transportation Research Part C: Emerging Technologies (2017).
- et al. DeepPF: A deep learning based architecture for metro passenger flow prediction. Transportation Research Part C: Emerging Technologies (2019).
- et al. A multivariate state space approach for urban traffic flow modeling and prediction. Transportation Research Part C: Emerging Technologies (2003).
- et al. Understanding urban mobility patterns with a probabilistic tensor factorization framework. Transportation Research Part B: Methodological (2016).
- et al. A novel wavelet-SVM short-time passenger flow prediction in Beijing subway system. Neurocomputing (2015).