Applying Bayesian spatio-temporal models to demand analysis of shared bicycle
Introduction
For short-distance urban travel, shared bicycles offer clear advantages: they help alleviate traffic congestion, connect riders to the public transportation system, and improve the operating efficiency of urban transportation. To a certain extent, shared bicycles have solved the first–last mile problem of travel. Since their introduction, shared bicycles have strongly promoted the development of public transportation and slow-moving traffic. However, some challenging issues remain, such as the unbalanced temporal and spatial distribution of rentals and returns and unreasonable regional scheduling [1].
In recent years, researchers have paid considerable attention to the regional scheduling problem of shared bicycles. Deng Chao et al. combined the characteristics of historical rental and return data and used the Analytic Hierarchy Process to analyze the factors affecting the utilization rate of shared bicycle stations. Their results showed that rentals and returns were significantly higher near commercial and residential areas, confirming the spatial dependence of shared bicycle travel [2]. Divya et al. showed that shared bicycle usage is correlated with taxi usage, weather and space [3]. Caggiani L et al. used spatio-temporal clustering to study the travel demand for shared bicycles and proposed a dynamic area clustering algorithm to determine the optimal number of shared bicycle stations [4]. Benchimol M et al. used the known demand at shared bicycle stations to solve the regional scheduling problem by establishing a static scheduling model that minimizes scheduling cost [5]. Reiss analyzed GPS data from shared bicycles and investigated the temporal and spatial patterns of their use [6]. Yongping Zhang et al. proposed the concept of bicycle islands, which combines high-resolution trajectory data with spatial and geographical regions; their results show that the formation of bicycle islands is closely related to the surrounding land [7]. Building on previous studies, Sergio Guidon et al. used a spatial regression model and a random forest model to predict shared bicycle demand. Their results show that the spatial regression model, which accounts for spatial lag and spatial error, outperforms the random forest, which also indicates that shared bicycle demand is spatially correlated [8]. Wen zhen Jia et al. proposed a two-level Gaussian mixture model clustering algorithm that groups bicycle stations while considering the migration trend of shared bicycles among stations and their geographical location; a gradient boosting regression tree was then used to predict traffic. To a certain extent, this addresses the imbalance in shared bicycle rentals across times and stations [9]. Tao et al. proposed a practical data-driven method for the regional bike-sharing allocation problem and established an integer-valued DEA model. Taking shared bicycle parking lots as the entities, with area, the number of allocated bicycles and other indicators as inputs and the number of shared bicycles used as the output, the efficiency of each entity is calculated as the ratio of output to input [10]. Wang et al. proposed an improved recurrent neural network for shared bicycle demand; their results show that an RNN model with a complex structure can achieve long-term prediction while maintaining accuracy [11].
At present, most research on the allocation of and demand for shared bicycles has focused on historical travel data at the region or station level, using machine learning algorithms to model and analyze short-term demand along the time and space dimensions [12], [13]. However, the spatio-temporal dependence of shared bicycle riding demand has not yet been fully explained, nor has the interdependence between its temporal and spatial components been quantified.
To address the above issues, this study establishes a shared bicycle riding demand model from the perspective of time and space and estimates it with the Integrated Nested Laplace Approximation (INLA) method [14]. We then analyze how the interaction of temporal and spatial structure factors affects shared bicycle riding demand. The INLA method is naturally suited to spatio-temporal models, and combined with the Stochastic Partial Differential Equation (SPDE) method it allows the Bayesian model to be computed quickly. The Markov Chain Monte Carlo (MCMC) method is also commonly used to fit Bayesian spatio-temporal models, but it is limited by its computation time: it requires many sampling iterations, and an analysis often takes several days when the data set is large. The INLA method avoids iterative sampling and fits the model quickly while maintaining accuracy [15]. At present, the INLA algorithm has been embedded in a variety of models and applied in various fields [16], [17], [18], [19], [20], but its application in the transportation field is still rare.
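To make the contrast with sampling concrete, the following is a minimal sketch of a plain Laplace approximation, the deterministic building block that INLA nests and refines. It is not INLA itself and not the authors' code. The toy model (Poisson counts with a Normal prior on the log-rate), the function names and the example counts are all assumptions chosen for illustration.

```python
import math


def grad_log_post(theta, counts, prior_mean, prior_sd):
    # d/dtheta of: sum_i (y_i * theta - exp(theta)) - (theta - mu)^2 / (2 sd^2)
    return sum(counts) - len(counts) * math.exp(theta) - (theta - prior_mean) / prior_sd ** 2


def hess_log_post(theta, counts, prior_sd):
    # Second derivative of the log posterior; always negative, so the mode is unique.
    return -len(counts) * math.exp(theta) - 1.0 / prior_sd ** 2


def laplace_approximation(counts, prior_mean=0.0, prior_sd=10.0, tol=1e-10, max_iter=100):
    """Gaussian approximation N(mode, sd^2) to the posterior of theta = log(rate)."""
    theta = math.log((sum(counts) + 0.5) / len(counts))  # start near the MLE
    for _ in range(max_iter):
        step = grad_log_post(theta, counts, prior_mean, prior_sd) / hess_log_post(theta, counts, prior_sd)
        theta -= step  # Newton update towards the posterior mode
        if abs(step) < tol:
            break
    # The curvature at the mode gives the variance of the Gaussian approximation.
    sd = math.sqrt(-1.0 / hess_log_post(theta, counts, prior_sd))
    return theta, sd


# Illustrative (synthetic) hourly trip counts for a single region.
example_counts = [3, 5, 4, 6, 2]
mode, sd = laplace_approximation(example_counts)
print(f"posterior mode of log-rate: {mode:.4f}, approximate sd: {sd:.4f}")
```

A handful of Newton steps replaces thousands of sampler iterations here. The cost is that the posterior is summarized by a Gaussian around its mode, which is the kind of trade-off the paragraph above describes.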
The rest of the paper is organized as follows. The second section explains the spatio-temporal model and the INLA–SPDE method. The third section introduces the input data set, describes how we choose among a series of Bayesian models with different spatio-temporal structures, and gives a quantitative explanation and analysis of the parameters of the spatio-temporal model. Finally, we summarize the paper and point out possible research directions.
Section snippets
Spatio-temporal demand model
Our analysis indicates that the riding demand for shared bicycles is affected by both temporal and spatial factors, so we developed several candidate models. The general model structure is shown as follows: where represents the demand for shared bicycles at position at time t, and t takes values on the time scale of interest. shows the selected combination of m covariates, and the ) is the covariate
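For orientation, one widely used form of a Bayesian spatio-temporal model fitted with INLA–SPDE is sketched below. It illustrates the general structure described in this snippet and is not a reconstruction of the authors' equation. The symbols y, z, β, ξ, ω, ε and a, the Gaussian observation model, and the AR(1) temporal dynamics are all assumptions.

```latex
\begin{aligned}
y(s_i, t) &= \mathbf{z}(s_i, t)^{\top} \boldsymbol{\beta} + \xi(s_i, t) + \varepsilon(s_i, t),
  & \varepsilon(s_i, t) &\sim \mathcal{N}(0, \sigma_{\varepsilon}^{2}), \\
\xi(s_i, t) &= a\, \xi(s_i, t-1) + \omega(s_i, t),
  & |a| &< 1, \\
\omega(\cdot, t) &\sim \mathcal{GP}\bigl(0, C_{\text{Mat\'ern}}(\lVert s_i - s_j \rVert)\bigr),
  & &\text{independent across } t .
\end{aligned}
```

Under these assumptions, y(s_i, t) is the demand at location s_i and time t, z(s_i, t) collects the m covariates with coefficient vector β, ξ is a latent field that evolves in time with autoregressive coefficient a, and ω is a spatially correlated innovation that the SPDE approach represents on a mesh.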
Data collection and processing
The study area, Shanghai, China, lies between 120°52’ and 122°12’ east longitude and between 30°40’ and 31°53’ north latitude. It has a typical subtropical monsoon climate with four distinct seasons. To study possible influencing factors on the demand for shared bicycles, we analyze about 100,000 randomly sampled Shanghai Mobike orders from August 2016. The data set includes order ID, user ID, longitude and latitude of the starting point and the ending point of the
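As a rough illustration of the first processing step such data would need, the sketch below converts the bounding box quoted above to decimal degrees, keeps orders that start inside it, and counts them by hour. The record layout (`start_lon`, `start_lat`, `start_time`) and the three sample orders are hypothetical and are not taken from the Mobike data set.

```python
from collections import Counter
from datetime import datetime


def dm_to_deg(degrees, minutes):
    """Convert degrees and arc-minutes to decimal degrees."""
    return degrees + minutes / 60.0


# Study-area bounds as quoted in the text (east longitude, north latitude).
LON_MIN, LON_MAX = dm_to_deg(120, 52), dm_to_deg(122, 12)
LAT_MIN, LAT_MAX = dm_to_deg(30, 40), dm_to_deg(31, 53)


def in_study_area(lon, lat):
    """True if a point lies inside the quoted bounding box."""
    return LON_MIN <= lon <= LON_MAX and LAT_MIN <= lat <= LAT_MAX


def hourly_demand(orders):
    """Count orders by starting hour, keeping only those that start in the study area."""
    counts = Counter()
    for order in orders:
        if in_study_area(order["start_lon"], order["start_lat"]):
            counts[order["start_time"].hour] += 1
    return counts


# Synthetic records with hypothetical field names, for illustration only.
sample_orders = [
    {"start_lon": 121.47, "start_lat": 31.23, "start_time": datetime(2016, 8, 1, 8, 5)},
    {"start_lon": 121.50, "start_lat": 31.20, "start_time": datetime(2016, 8, 1, 8, 40)},
    {"start_lon": 116.40, "start_lat": 39.90, "start_time": datetime(2016, 8, 1, 9, 0)},
]
print(hourly_demand(sample_orders))
```

A bounding box is only a coarse filter. Assigning orders to county-level administrative regions would additionally require region polygons, which are outside the scope of this sketch.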
Conclusions and future work
This paper applies the INLA method to a spatio-temporal model of shared bicycle demand and measures the impact of meteorological factors, population density and per capita GDP on shared bicycle riding demand. The major findings of the spatio-temporal analysis can be summarized as follows: (1) Across the county-level administrative regions, most origin points and destination points fall in the same area, so we propose to use origin points for modeling
CRediT authorship contribution statement
Yimeng Duan: Conceptualization, Methodology, Data curation, Writing – original draft. Shen Zhang: Visualization, Investigation, Software, Supervision. Zhuoran Yu: Software, Validation.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This research was supported by the National Key Research and Development Program of China (2018YFB1600600).
References (30)
- et al. A dynamic clustering method for relocation process in free-floating vehicle sharing systems. Transp. Res. Proc. (2017)
- et al. Validation of a relocation strategy for Munich’s bike sharing system. Transp. Res. Proc. (2016)
- et al. Biking islands in cities: An analysis combining bike trajectory and percolation theory. J. Transp. Geogr. (2019)
- et al. Expanding a(n) (electric) bicycle-sharing system to a new city: Prediction of demand with spatial regression and random forests. J. Transp. Geogr. (2020)
- et al. Hierarchical prediction based on two-level Gaussian mixture model clustering for bike-sharing system. Knowl.-Based Syst. (2019)
- et al. A sustainability-oriented optimal allocation strategy of sharing bicycles: evidence from ofo usage in Shanghai. Resour. Conserv. Recy. (2020)
- et al. Short-term prediction for bike-sharing service using machine learning. Transp. Res. Proc. (2018)
- et al. Scalable Bayesian modelling for smoothing disease risks in large spatial data sets using INLA. Spat. Stat.-Neth. (2021)
- et al. Spatial modelling of hydrothermal mineralization-related geochemical patterns using INLA + SPDE and local singularity analysis. Comput. Geosci.-UK (2021)
- et al. Missing traffic data imputation considering approximate intervals: A hybrid structure integrating adaptive network-based inference and fuzzy rough set. Physica A (2021)
- A Bayesian vector autoregression-based data analytics approach to enable irregularly-spaced mixed-frequency traffic collision data imputation with missing values. Transp. Res. C
- Exploring urban travel patterns using density-based clustering with multi-attributes from large-scaled vehicle trajectories. Physica A
- Multi-community passenger demand prediction at region level based on spatio-temporal graph convolutional network. Transp. Res. C
- Bayesian modelling for spatially misaligned health and air pollution data through the INLA-SPDE approach. Spat. Stat.-Neth.
- Bike sharing systems: Solving the static rebalancing problem. Discrete Optim.