Abstract

In rainy weather, accurate prediction of traffic status helps road traffic managers formulate traffic management measures and helps travelers plan routes and adjust departure times. In this paper, based on six-dimensional data (i.e., past and present spatiotemporal traffic status, road network structure, pavement type, water accumulation, and rainfall level), a fuzzy neural network (FNN) prediction system is proposed to predict traffic status. The evolution of traffic status depends not only on the existing traffic but also on new traffic demand. Therefore, the FNN prediction system includes an offline part, which uses historical data, and an online part, which uses data from the current day, so that new traffic demand does not have to be forecast explicitly. The fuzzy C-means clustering algorithm is applied to cluster past traffic status data recorded under similar rainy weather to form an offline initial dataset, which is used to train the FNN weight parameters. The online part uses real-time detection data and the parameters trained by the offline part to predict the traffic status and returns the prediction errors to the offline part to correct the weight parameters and improve prediction accuracy. Finally, the FNN prediction system is verified using real data from the Beijing expressway network. The verification results show that the prediction system can guarantee prediction accuracy and can be used to effectively identify traffic status.

1. Introduction

The development of detection technologies allows road data such as road flatness and water accumulation to be obtained in detail. Based on these advanced technologies, the traffic management departments can fully use the road data to accurately estimate the traffic status to further improve the level of road management services and alleviate traffic congestion [1]. With more realistic road information, drivers can efficiently plan their travel paths and even departure time [2].

In cities such as Beijing and Shanghai in China, drivers usually have multiple route options to reach destinations. Real-time traffic status and road information can help a driver to choose a potentially suitable travel route [3]. The potentially suitable travel route is not only related to the current traffic status and road conditions but also related to the traffic status in the future [4]. Therefore, reasonably assessing current traffic status and accurately predicting future traffic status will play an important role in traffic management and driver travel [5]. However, it is very difficult to accurately predict traffic status evolution due to various random factors (e.g., changes in rainfall, the degree of slippery roads affected by rain, visibility of drivers, etc.) [6] and the way of congestion spreading [7]. Therefore, in rainy weather, based on remote sensing road information and traffic information, it is a great challenge to reasonably predict the evolution of network traffic status [8].

The evolution of road traffic status is related not only to road structure, rainfall, existing traffic flow, and so forth but also to potential traffic demand [2]. Therefore, we cannot predict the evolution of traffic flow at future time steps using only the traffic status data from earlier on the same day, because the new traffic demand cannot be predicted well. Similarly, we cannot rely only on the evolution of traffic status on similar rainy days in the past, because the daily traffic flow may be affected by something as small as a change to a transportation facility or a road repair. Therefore, a fuzzy neural network (FNN) prediction system with two parts, an online part and an offline part, is proposed to predict the evolution of traffic status [9].

We use the fuzzy C-means (FCM) clustering algorithm to perform cluster analysis on traffic status evolution in the past similar rainy weather to construct an offline initial dataset. This dataset is used by the offline part to train FNN weight parameters. The online part uses real-time detection data and the weight parameters trained by the offline part to predict traffic status. Moreover, the online part returns prediction errors and real-time data to the offline part to further correct the weight parameters to improve prediction accuracy.

This paper makes two contributions to the literature. Firstly, a fuzzy prediction system including offline and online parts is proposed. The offline part takes the new traffic demand into account by comparison with the traffic status evolution under similar rainy weather in the past. The difference between the real-time traffic status evolution on the day and the prediction results of the online part is returned to the offline part to further correct the weight parameters and improve prediction accuracy. Secondly, six-dimensional data (i.e., past and present spatiotemporal traffic status, road network structure, pavement type, water accumulation, and rainfall level) are used to predict the evolution of traffic status during rain. This result provides a valuable perspective for future research on the evolution of traffic status.

The rest of the paper is organized as follows: Section 2 briefly discusses related prediction and clustering methods; the FNN fuzzy prediction system including online and offline parts is introduced in detail in Section 3; Section 4 tests the corresponding prediction system by examples; finally, we summarize the paper and propose relevant expansion suggestions in Section 5.

2. Literature Review

2.1. Clustering Algorithms

Cluster analysis plays a vital role in the field of traffic status classification. Cluster analysis maximizes the similarity of objects in a single class and minimizes the similarity of objects between different classes. In traffic data mining, commonly used clustering algorithms include the partition-based k-means algorithm [10, 11], the hierarchical clustering algorithm [12], the density-based spatial clustering of applications with noise algorithm [13, 14], and the FCM clustering algorithm [15-17].

The k-means algorithm has a fast clustering speed, but it requires that the number of categories be reasonably estimated, and the selection of the initial category center and noise have a great impact on the clustering results [10]. The hierarchical clustering algorithm includes two methods: splitting method and agglomeration method [12]. The splitting method refers to initially classifying all samples into a cluster and then gradually splitting the sample points according to a certain criterion until a certain condition or the set number of categories is reached. The agglomeration method refers to initially treating each sample point as a cluster and then merging these initial clusters according to a certain criterion until a certain condition or the set number of classifications is reached. However, hierarchical clustering algorithms are often used for automatic grouping of one-dimensional data.

The density-based spatial clustering of applications with noise algorithm also has a fast clustering speed. This algorithm can effectively process noisy data to find spatial clusters of arbitrary shapes and does not require entering the number of clusters during the clustering process [13]. However, when the density of spatial clustering is not uniform and the cluster spacing is large, the clustering results obtained by the density-based spatial clustering of applications with noise algorithm are poor. The FCM clustering algorithm is a fuzzy clustering algorithm based on objectives, which uses membership to determine the similarity of sample points. Road traffic status division has certain ambiguity. Hence, the FCM clustering algorithm is chosen to classify the traffic status according to different rainfall and water accumulation.

2.2. Traffic Status Prediction Algorithms

Over the past few decades, many scholars have devoted their efforts to improving the accuracy and effectiveness of traffic status prediction. The models used in traffic status prediction studies can be summarized into three broad categories: parametric models, nonparametric models, and hybrid models [18]. Prediction algorithms based on parametric models use mathematical statistics to process historical traffic data. The model is preset based on certain theoretical assumptions, and then the model parameters are estimated using historical data. The typical parametric prediction algorithms used in traffic prediction are the autoregressive integrated moving average algorithm [18, 19] and the Kalman filtering algorithm [20]. Parametric prediction models are relatively simple and fast to compute. However, they are strongly affected by the volatility of traffic volume and are therefore only suitable for roads with stable traffic status and low accuracy requirements. In rainy weather, low-lying areas are prone to local traffic congestion. Therefore, parametric prediction algorithms are not suitable for predicting traffic status under abnormal weather such as rainy weather.

To make up for the shortcomings of parametric prediction models, some nonparametric models that can analyze the nonlinear characteristics of traffic status have been proposed. Nonparametric models are roughly divided into wavelet analysis models [21, 22], chaos theory models [23, 24], and intelligent prediction models. Although nonparametric models can predict traffic status with time series characteristics well, their structure is complex, their computational cost is high, and their parameters are difficult to determine. As data-driven nonparametric algorithms, the intelligent prediction models use artificial intelligence algorithms to make predictions and are widely used in the field of traffic status prediction. Artificial intelligence prediction algorithms mainly include machine learning models, such as the k-nearest neighbors algorithm [25, 26], Bayesian networks [27], support vector machines [28, 29] (including the least squares support vector machine [30-32]), and neural networks [33], and deep learning models, such as back propagation neural networks [34], long short-term memory [35, 36], deep belief networks [37], and generative adversarial networks [38, 39].

Hybrid prediction algorithms use two or more prediction models for traffic status prediction. Hybrid prediction models exploit the advantages of multiple prediction models at the same time and are usually more accurate than a single prediction model. Hybrid prediction algorithms are currently in common use for traffic status prediction [40-43]. Based on empirical mode decomposition and combination model fusion, a novel short-term traffic flow prediction approach can be used to predict traffic status [44, 45]. In rainy weather, variables such as the degree of road slipperiness and the amount of accumulated water are uncertain. Fast and accurate traffic status prediction helps traffic management departments control traffic and helps travelers choose travel routes and travel times more reasonably. Hence, this paper uses a hybrid prediction algorithm to predict traffic status.

2.3. Congestion Spreading

A large number of studies on the congestion spreading laws are conducted through complex networks. By analyzing the topological structure of road traffic network and the statistical characteristics of traffic status, the theory uses propagation dynamics models to obtain congestion propagation laws. The study in [46] described the diffusion mode of traffic congestion with the susceptible infected recovered model of complex networks. The study in [47] studied the influence of network topology on traffic congestion through an improved macroscopic traffic flow model and proposed an algorithm to effectively eliminate traffic congestion. The study in [48] proposed an idealized complex network model to analyze and predict frequent traffic congestion nodes on urban roads. The study in [49] using the complex model defined the attraction of each node in the road network. Then the traffic status of each node was distributed to other nodes, so that the congestion propagation path was determined in advance.

3. FNN Prediction System

The framework of the FNN prediction system is discussed in detail in Section 3.1. Subsequently, the FCM clustering algorithm used to filter the training dataset is discussed in Section 3.2. Input variables and output variables are discussed in Sections 3.3 and 3.4, respectively.

3.1. Framework of the FNN Prediction System

Fuzzy logic has a strong ability to express structural knowledge but usually lacks learning ability: its membership functions and fuzzy rules can only be set subjectively. The neural network has strong self-learning ability and nonlinear processing ability. Hence, in this paper, the FNN algorithm, which combines fuzzy logic theory with the neural network algorithm, is used to predict the spatiotemporal traffic status with congestion propagation effects in rainy weather. The overall framework of the FNN prediction system is shown in Figure 1.

The FNN prediction system proposed in this paper is divided into two parts: online prediction and offline correction. Because new traffic demand is difficult to forecast, predicting traffic status based only on the data of past time steps of the same day produces large errors. Therefore, the offline part trains the FNN algorithm on data from past rainy days to obtain training parameters, which are passed to the online prediction part as its initial parameters. Traffic status evolves differently under different weather. Therefore, in order to obtain suitable offline input data, this paper clusters historical data with the FCM clustering algorithm (see Section 3.2) based on rainfall and accumulated water. Then, according to the rainfall and accumulated water of the predicted day, the appropriate dataset is selected to train the offline part parameters. Based on the trained parameters of the offline part, the online part can make better use of real-time detection information to predict traffic status. The online part then transfers the prediction errors and real-time information to the offline part, which adjusts the training parameters again to improve prediction accuracy. Based on the prediction results and other congestion indications, the traffic management department can understand the spreading range and degree of influence of congestion in advance and make corresponding decisions to relieve it.
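The offline/online interplay described above can be sketched in a few lines. The following is only an illustration of the control flow under strong assumptions: a plain linear predictor trained by stochastic gradient descent stands in for the FNN, and the function names (`train_offline`, `run_online`), the learning rate, and the toy feature vectors are hypothetical rather than taken from the paper.

```python
def predict(weights, features):
    """Linear stand-in for the FNN forward pass (an assumption, not the paper's model)."""
    return sum(w * x for w, x in zip(weights, features))


def sgd_step(weights, features, target, lr):
    """One gradient step on the squared error; also returns the pre-update error."""
    error = predict(weights, features) - target
    new_weights = [w - lr * error * x for w, x in zip(weights, features)]
    return new_weights, error


def train_offline(history, n_features, lr=0.05, epochs=200):
    """Offline part: fit initial weights on clustered historical rainy-day samples."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for features, target in history:
            weights, _ = sgd_step(weights, features, target, lr)
    return weights


def run_online(weights, stream, lr=0.05):
    """Online part: predict each new observation, then feed the error back."""
    errors = []
    for features, observed in stream:
        weights, error = sgd_step(weights, features, observed, lr)
        errors.append(error)
    return weights, errors


# Toy data: historical days follow one relationship, the current day a shifted one.
history = [((1.0, 0.0), 0.5), ((0.0, 1.0), 0.2), ((1.0, 1.0), 0.7)]
today = [((1.0, 0.0), 0.4), ((0.0, 1.0), 0.2), ((1.0, 1.0), 0.6)] * 50

initial_weights = train_offline(history, n_features=2)
final_weights, errors = run_online(initial_weights, today)
```

In this toy run the first online error reflects the mismatch between historical days and the current day, and the feedback step shrinks it as more real-time observations arrive, which is the role the text assigns to the correction loop.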

3.2. FCM Clustering Algorithm

Due to the ambiguity of road congestion level, the most widely used FCM clustering algorithm among many fuzzy clustering algorithms is applied [15]. By optimizing the objective function, the FCM clustering algorithm can obtain the membership degree of each sample point to all cluster centers, so as to determine the type of sample points to realize automatic classification of sample data. In this paper, the rainfall and water accumulation are extracted as data feature attributes.

For a given dataset $X$, $n$ represents the total number of samples, $x_j$ denotes a sample, and $X$ denotes the set of samples (i.e., $X = \{x_1, x_2, \ldots, x_n\}$). Each sample includes $d$ characteristics, where $x_j = (x_{j1}, x_{j2}, \ldots, x_{jd})$. In this paper, each sample contains two characteristics (i.e., $d = 2$): rainfall $x_{j1}$ and water accumulation $x_{j2}$. Both characteristics are standardized before clustering.

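The objective function to minimize is

$$J = \sum_{i=1}^{CL} \sum_{j=1}^{n} u_{ij}^{m} \lVert x_j - c_i \rVert^{2}, \qquad \text{subject to } \sum_{i=1}^{CL} u_{ij} = 1 \text{ for every } j,$$

where CL is the number of clusters. $u_{ij}$ represents the membership degree of the $j$th sample belonging to the $i$th category with weight index $m$. $c_i$ denotes the category center of category $i$. Then, solving equations (4) and (5), we can use the iteration method to update the category center and the degree of membership of category $i$ at the $t$th iteration:

$$c_i^{(t)} = \frac{\sum_{j=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m} x_j}{\sum_{j=1}^{n} \left(u_{ij}^{(t-1)}\right)^{m}}, \qquad u_{ij}^{(t)} = \frac{1}{\sum_{k=1}^{CL} \left( \dfrac{\lVert x_j - c_i^{(t)} \rVert}{\lVert x_j - c_k^{(t)} \rVert} \right)^{2/(m-1)}}.$$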

The FCM clustering algorithm is shown in Figure 2, where $\varepsilon$ represents the termination error of the FCM clustering algorithm.
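As a rough illustration of the clustering step, the sketch below implements the textbook fuzzy C-means iteration (membership-weighted center update, distance-based membership update, stop when the largest membership change falls below a termination error). Several details are assumptions rather than the paper's choices: z-score standardization, a weight index of 2, random initial memberships, and the toy rainfall and water accumulation values.

```python
import math
import random


def standardize(values):
    """Z-score standardization (assumed; the paper's own formula may differ)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std if std > 0 else 0.0 for v in values]


def distance(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def fcm(samples, n_clusters, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Return (centers, u), where u[i][j] is the membership of sample j in cluster i."""
    rng = random.Random(seed)
    n, dim = len(samples), len(samples[0])
    # Random initial memberships, normalized so each sample's memberships sum to 1.
    u = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(n_clusters)]
    for j in range(n):
        total = sum(u[i][j] for i in range(n_clusters))
        for i in range(n_clusters):
            u[i][j] /= total
    centers = []
    for _ in range(max_iter):
        # Center update: mean of the samples weighted by membership to the power m.
        centers = []
        for i in range(n_clusters):
            weights = [u[i][j] ** m for j in range(n)]
            total = sum(weights)
            centers.append(tuple(
                sum(weights[j] * samples[j][k] for j in range(n)) / total
                for k in range(dim)
            ))
        # Membership update from the relative distances to all centers.
        new_u = [[0.0] * n for _ in range(n_clusters)]
        for j in range(n):
            dists = [max(distance(samples[j], c), 1e-12) for c in centers]
            for i in range(n_clusters):
                new_u[i][j] = 1.0 / sum(
                    (dists[i] / dists[k]) ** (2.0 / (m - 1.0))
                    for k in range(n_clusters)
                )
        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n_clusters) for j in range(n))
        u = new_u
        if change < eps:  # termination error reached
            break
    return centers, u


def hard_labels(u):
    """Assign each sample to the cluster with the highest membership."""
    n_clusters, n = len(u), len(u[0])
    return [max(range(n_clusters), key=lambda i: u[i][j]) for j in range(n)]


# Toy example: three light-rain days and three heavy-rain days (made-up values).
rainfall = [2.0, 3.0, 2.5, 30.0, 32.0, 31.0]
water = [1.0, 1.5, 1.2, 20.0, 22.0, 21.0]
samples = list(zip(standardize(rainfall), standardize(water)))
centers, u = fcm(samples, n_clusters=2)
labels = hard_labels(u)
```

The membership matrix, rather than the hard labels, is what makes the method suited to ambiguous congestion levels: a day with intermediate rainfall would receive partial membership in both clusters.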

3.3. Input Variables of the FNN Prediction System

In this paper, six variables (e.g., the static network structure, the past and present traffic status, pavement type, water accumulation, and rainfall level) from different information sources are selected to understand whether the congestion occurs and calculate the spread range of traffic congestion. The input data is explained in detail as follows.

3.3.1. The Static Road Network Structure

The degree of nodes in the static network structure, the connection length, and the number of links are important indicators that affect the connectivity performance of the road network. We simplify the road network into a directed graph $G = (V, E)$ as shown in Figure 3, where $V$ is the set of nodes, representing intersections, and $E$ is the set of edges, representing road links. Through the directed graph $G$, we can distinguish upstream and downstream traffic and determine the topology of the road network.
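One simple way to hold such a directed graph in code is a pair of adjacency maps, one for downstream and one for upstream neighbors. The sketch below is illustrative only; the class name `RoadNetwork`, the node labels, and the link lengths are hypothetical and do not come from the paper's dataset.

```python
from collections import defaultdict


class RoadNetwork:
    """Directed graph: nodes are intersections, edges are road links."""

    def __init__(self):
        self.downstream = defaultdict(list)  # node -> nodes reached by outgoing links
        self.upstream = defaultdict(list)    # node -> nodes feeding incoming links
        self.length_km = {}                  # (tail, head) -> link length in km

    def add_link(self, tail, head, length_km):
        """Register a one-way road link from tail to head."""
        self.downstream[tail].append(head)
        self.upstream[head].append(tail)
        self.length_km[(tail, head)] = length_km

    def nodes(self):
        """All intersections that appear as the tail or head of a link."""
        return set(self.downstream) | set(self.upstream)

    def degree(self, node):
        """Node degree: number of incoming plus outgoing links."""
        return len(self.upstream.get(node, [])) + len(self.downstream.get(node, []))


# Hypothetical three-intersection network.
net = RoadNetwork()
net.add_link("A", "B", 1.2)
net.add_link("B", "C", 0.8)
net.add_link("A", "C", 2.0)
```

Keeping both directions explicit makes it cheap to ask which links feed a congested node (upstream) and where its traffic will go next (downstream).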

3.3.2. The Dynamic Past and Present Traffic Status

Road traffic presents dynamic characteristics. A large amount of data processing will cause the prediction system to delay the evaluation of traffic status. Therefore, in order to predict the traffic status in real time, the FNN prediction system includes both offline and online parts. The offline part continuously adjusts the parameters of the FNN prediction algorithm based on the past traffic status. The online part is used to predict the evolution of the traffic status through real-time traffic status data and transfers the prediction errors and real-time information to the offline part to again adjust the training parameters to improve prediction accuracy.

3.3.3. The Pavement Types

Pavement types can now be detected accurately by remote sensing equipment. On rainy days, low-lying areas are prone to water accumulation. Small low-lying areas may unintentionally affect the driving speed of vehicles, while in large low-lying areas, drivers may actively reduce speed to avoid flooding. The IRI (International Roughness Index) can be used to classify pavement types. The IRI is a measure of road surface flatness expressed as the accumulated vertical change of the road surface per unit of road length [50]. Therefore, according to Table 1, this paper divides the pavement into three categories: small low-lying pavement, large low-lying pavement, and smooth pavement.

3.3.4. The Water Accumulation and Rainfall Level

We select a total of seven water accumulation points in Beijing, of which five large water accumulation points and two small water accumulation points are used to measure the amount of water accumulation. The rainfall level is divided into three levels: light rain, moderate rain, and heavy rain. Based on water accumulation and rainfall level, the offline input data can be obtained through the FCM clustering algorithm. Through the amount of water accumulation, we can further study the cumulative effect of congestion. Meanwhile, combining the water accumulation and rainfall level of the predicted day, the appropriate data are selected to train the offline part parameters.

3.4. The Output of the FNN Prediction System

Generally, the topology of the road network affects congestion propagation. However, in rainy weather, water accumulation points will cause the water accumulation to increase continuously over time. Therefore, water accumulation points may become the specific key node of road network and may cause the increase of the effect of the congestion propagation on traffic status evolution. In this paper, the road congestion level under rainy weather is used as the basis for judging the road operation status. Vehicle speed is the key indicator of road congestion level. The online part predicts the vehicle speed combining real-time detection information and initial training parameters. The road network tested in this paper is the expressway network in Beijing. Therefore, according to Table 2, the speed can be converted to the road congestion level of the expressway network [51].

The speed performance index is defined as follows [51]:

$$R_v = \frac{v}{V_{\max}} \times 100,$$

where $R_v$ denotes the speed performance index, $v$ represents the average speed of links in km/h, and $V_{\max}$ is the maximum speed limit of links in km/h.
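The conversion from speed to congestion level can be sketched as follows, assuming the index is the ratio of average link speed to the maximum speed limit scaled to the range 0 to 100. The class bounds and labels in `PLACEHOLDER_BOUNDS` are illustrative placeholders only; the bounds actually used are those of Table 2.

```python
def speed_performance_index(avg_speed_kmh, max_speed_kmh):
    """Average speed as a percentage of the maximum speed limit (assumed form)."""
    if max_speed_kmh <= 0:
        raise ValueError("maximum speed limit must be positive")
    return avg_speed_kmh / max_speed_kmh * 100.0


# Placeholder (upper bound, label) pairs; replace with the classes of Table 2.
PLACEHOLDER_BOUNDS = (
    (25.0, "heavy congestion"),
    (50.0, "mild congestion"),
    (75.0, "smooth"),
)


def congestion_level(index, bounds=PLACEHOLDER_BOUNDS, top_label="very smooth"):
    """Return the label of the first class whose upper bound covers the index."""
    for upper, label in bounds:
        if index <= upper:
            return label
    return top_label


index = speed_performance_index(avg_speed_kmh=40.0, max_speed_kmh=80.0)
level = congestion_level(index)
```

Passing the bounds as a parameter keeps the conversion reusable for road classes with different thresholds.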

4. Case Study

In this section, the accuracy of the proposed FNN prediction system is evaluated through the actual traffic status data of Beijing’s expressway network (see Figure 4). The expressway network includes four ring roads (second ring, third ring, fourth ring, and fifth ring) and 15 urban express roads. As the expressway network is responsible for more than 50% of daily motor vehicle trips, the expressway network can be selected to represent the approximate level of Beijing’s traffic network. The road network can be further divided into 244 road links. We use the data provided by Beijing Traffic Management Bureau for the three years of 2015, 2016, and 2017. The data sampling interval is 5 minutes, and there are 22162169, 22709459, and 20465423 basic pieces of data in these three years. The regression model filling method is applied to supplement missing data [52]. We select the morning peak period (7:00 to 8:00 a.m.) of June 23, 2017 as the forecast time period. Five large water accumulation points marked in blue and two small water accumulation points marked in red shown in Figure 4 are selected.

The FNN model is implemented using Python 3.5.4 and the PyTorch 1.5.0 deep learning framework. Data preprocessing is conducted with the NumPy and Pandas packages. The GPS dataset is divided into three subsets for traffic status prediction under rainy weather: a training set, a validation set, and a test set. To make effective use of the limited dataset and avoid overfitting during training, k-fold cross validation is applied, where the number of training folds is set to 9 and the last month of data is used for validation. In this study, the number of hidden layers is set to two, and the number of neurons is set to 20. The number of epochs is manually set to 200, and mini-batch learning with a batch size of 16 is adopted. The membership is set to 3. To prevent overfitting on the training set, training is stopped early when the loss has not decreased for 20 consecutive training cycles.
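The early-stopping rule can be sketched independently of any deep learning framework. The helper below is a minimal illustration, assuming that "no longer reduced" means the loss fails to drop strictly below the best value seen so far; the names `EarlyStopper` and `run_training` and the synthetic loss sequence are hypothetical.

```python
class EarlyStopper:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=20):
        self.patience = patience
        self.best = float("inf")
        self.stale = 0  # consecutive epochs without a new best loss

    def should_stop(self, loss):
        if loss < self.best:
            self.best = loss
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience


def run_training(epoch_losses, max_epochs=200, patience=20):
    """Replay a sequence of per-epoch losses and report when training would end."""
    stopper = EarlyStopper(patience)
    epochs_run = 0
    for loss in epoch_losses[:max_epochs]:
        epochs_run += 1
        if stopper.should_stop(loss):
            break
    return epochs_run, stopper.best


# Synthetic losses: three improving epochs followed by a plateau.
epochs_run, best_loss = run_training([1.0, 0.8, 0.6] + [0.6] * 50)
```

In a real training loop the replayed list would be replaced by the validation or training loss computed at the end of each epoch.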

4.1. Performance Measures

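Four measures are used to evaluate the prediction accuracy of vehicle speed: the mean absolute percentage error (MAPE), the mean absolute error (MAE), the root mean square error (RMSE), and a fourth measure based on the average observed and average predicted values. MAPE and MAE evaluate the size of prediction errors. RMSE and the fourth measure reflect the difference between the measured value and the predicted value. These four measures evaluate the accuracy of prediction from different aspects. The first three are

$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right|, \qquad \mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^{2}},$$

where $y_i$ is the measured value of link $i$, $\hat{y}_i$ is the predicted value of link $i$, and $N$ is the total number of predicted links. $\bar{y}$ is the average observed value and $\bar{\hat{y}}$ is the average prediction value. The above measures can be used to evaluate the prediction results of speed.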
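For reference, the three standard error measures can be computed as in the sketch below. The fourth function is an assumption: because the text mentions both the average observed and the average predicted value, it is shown here as a Pearson-type correlation coefficient, which may not be the exact measure used in the paper. The speed values are made up.

```python
import math


def mape(measured, predicted):
    """Mean absolute percentage error, in percent (measured values must be nonzero)."""
    return 100.0 / len(measured) * sum(
        abs((y - p) / y) for y, p in zip(measured, predicted))


def mae(measured, predicted):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(measured, predicted)) / len(measured)


def rmse(measured, predicted):
    """Root mean square error."""
    return math.sqrt(
        sum((y - p) ** 2 for y, p in zip(measured, predicted)) / len(measured))


def correlation(measured, predicted):
    """Pearson correlation between measured and predicted values (assumed measure)."""
    y_bar = sum(measured) / len(measured)
    p_bar = sum(predicted) / len(predicted)
    cov = sum((y - y_bar) * (p - p_bar) for y, p in zip(measured, predicted))
    var_y = sum((y - y_bar) ** 2 for y in measured)
    var_p = sum((p - p_bar) ** 2 for p in predicted)
    return cov / math.sqrt(var_y * var_p)


# Made-up link speeds in km/h.
measured_speeds = [60.0, 50.0, 40.0]
predicted_speeds = [57.0, 52.0, 40.0]
```

MAPE is scale-free, whereas MAE and RMSE are in km/h, and RMSE penalizes large individual errors more heavily than MAE.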

4.2. Forecast Accuracy Assessment

In order to evaluate the prediction accuracy of vehicle speed, we choose eight areas with dense water accumulation points as the evaluation objects (see Figure 5). Six cases with different input variables (see Table 3) are tested. The detailed prediction results of average speed are shown in Figure 6. The prediction errors of the six cases in terms of the four measures are shown in Figure 7.

From Figure 6, we can find that when all variables are considered, the accuracy of prediction is improved. Case I and Case II, which do not consider the road network structure, have the largest prediction errors. The prediction results of Case I and Case II are similar. When more water accumulation points are included in the prediction areas, the prediction errors increase. As the number of water accumulation points in the prediction area decreases, prediction accuracy is improved. The results of Case III, Case IV, and Case V are similar, because the pavement structure, water accumulation, and rainfall can better reflect the impact of rainfall on traffic status. From Figure 7, we can clearly find that, for all areas, the prediction accuracy of Case VI is the highest, and the prediction accuracy of Case I is the lowest. Furthermore, the results of Case II and Case I are similar, indicating that the road static network structure has a greater influence on the prediction of vehicle speed.

4.3. Comparison of Different Forecasting Algorithms

In this section, the FNN algorithm used in this paper is compared with two other prediction algorithms, the convolutional neural network (CNN) and the artificial neural network (ANN). For the sake of simplicity, all tests are based on Case VI shown in Table 3. The CNN is based on the time-space-time structure proposed by Yu and Zhu [53]. The number of all CNN convolution kernels is 1. The first time-gated convolution network in the spatiotemporal convolution block has 1 input channel and 32 output channels, and its activation function is GLU. The spatial graph convolutional neural network has 32 input channels and 32 output channels, and its activation function is ReLU. The second time-gated convolutional network has 32 input channels and 64 output channels, and its activation function is ReLU. The model is trained for 50 rounds with RMSprop to minimize the mean square error, with a batch size of 50 and an initial learning rate of 0.001. For the ANN, this paper uses one hidden layer with nine hidden units. Four prediction error evaluation methods are applied to evaluate the prediction results, and the results are shown in Figure 8. From the results, we find that, compared with the other two prediction algorithms, the FNN prediction algorithm used in this paper achieves good prediction results in some areas.

4.4. Road Traffic Status Forecast

For the prediction of traffic status, we predict the vehicle speed and then convert it to the speed performance indexes presented in Table 2 to show the road congestion level. The prediction results are shown in Figure 9. In general, with the increase of rainfall, the overall road congestion level continues to decrease, especially in the vicinity of large water accumulation points. Subsequently, we select a small water accumulation point 1 presented in Figure 4 to construct a three-dimensional road congestion level evolution diagram. The evolution diagram is shown in Figure 10. From Figure 10, we can find that the water accumulation point is more prone to large-scale congestion in the surrounding area. The water accumulation point causes a large number of drivers to consciously reduce the speed of vehicles, so it is easy to form a longer queue length and stop-and-go congestion waves.

5. Conclusion

In rainy weather, the prediction of traffic status helps traffic managers formulate traffic management measures and helps travelers choose appropriate travel routes and departure times. Predicting the traffic status of future time steps based only on the data of past time steps of the same day cannot take into account changes in travel demand, especially new traffic demand. However, predicting the evolution of traffic status based only on the data from similar rainy weather in the past also lowers prediction accuracy. Therefore, this paper proposes an FNN prediction system consisting of online and offline parts. The offline part uses the data filtered by the FCM clustering algorithm to train the weight parameters of the FNN prediction algorithm. These parameters are passed to the online part as initial weight parameters. The online part then uses real-time detection data to predict the traffic status and returns the errors between the predicted and actual traffic status to the offline part, which corrects the weight parameters to further improve prediction accuracy. The FNN prediction system is applied to the traffic status prediction of the expressway network in Beijing.

In this paper, the online and offline parts use only the FNN prediction algorithm to predict the evolution of traffic status. In future work, other prediction algorithms, such as machine learning and deep learning algorithms, will be used to further improve prediction accuracy. The water accumulation data used in this paper are based on measurements. When there are many water accumulation points and the rainfall lasts a long time, it is unrealistic to measure the accumulated water volume in real time, so image-based data processing algorithms are worth further study.

Data Availability

No data were used to support the study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (U1811463).