A deep learning approach to real-time CO concentration prediction at signalized intersection
Introduction
With urbanization and industrialization advancing around the world, air pollution has become an urgent problem (Liu et al., 2016; Zou et al., 2009). Vehicle exhaust is one of the major sources of air pollution in urban areas (Pan et al., 2019; Wang et al., 2020, 2018). Many studies have shown that motor vehicle air pollution induces cardiovascular and respiratory diseases (Roosbroeck et al., 2008; Zhang and Peng, 2014; Zhao et al., 2013). Studies also indicate that complex traffic conditions at road intersections, such as varying traffic flows and vehicle states, increase exhaust emissions and expose pedestrians to additional vehicle pollution (Gokhale, 2011; Wang et al., 2015). Therefore, appropriate methods for predicting the concentration of traffic-related pollution at road intersections are essential.
Many attempts have been made in this regard over the past decades. Liu et al. (2016) applied a land-use regression model to assess the relationship between land use and air pollution. Xie et al. (2020) developed a multivariate nonlinear grey model based on the kernel method to forecast traffic-related emissions at a national level. Xu et al. (2019) used a geographically weighted regression method to analyze the relationship between air pollutant emissions and traffic conditions. Niu et al. (2017) established an early warning system that forecasts day-ahead air pollution concentrations by combining the least squares support vector machine with empirical mode decomposition. However, these studies focus on intra-urban air pollution, and their results are not applicable to fine-scale intersections. Besides, the lack of data limits how widely these models can be used. Analyzing fine-scale variations of air pollution in microenvironments is crucial because they are more directly associated with pedestrian exposure to traffic pollutants. Some dispersion models, such as CALINE4, CAL3QHC, and AERMOD, have been used to estimate near-road pollutant concentrations (Chen et al., 2009). However, these deterministic models have limitations in revealing the dispersion characteristics of vehicle pollutants, which are highly nonlinear and strongly associated with many factors such as traffic emissions, geographical conditions, and meteorological conditions (Cai et al., 2009). Additionally, statistical distribution models have been applied to investigate the relationship between air pollution concentrations and meteorological variables, but these models cannot predict the entire concentration range (Gokhale and Khare, 2004; Wang et al., 2015). Subsequently, deterministic and statistical distribution models have been combined to predict vehicle pollutant concentrations, but these hybrid models require large amounts of input data, which are usually not well known (Inal, 2010).
With advances in computing, the artificial neural network (ANN) has been used extensively to estimate air pollution concentrations because it is suitable for modeling nonlinear and uncertain problems. The results of these studies indicate that ANN can overcome the limitations above and performs well in air pollution prediction (Elangasinghe et al., 2014; He et al., 2014; Liu and Chen, 2020; Singh et al., 2012; Zhang and Peng, 2014). However, compared with traditional statistical models and shallow machine learning models such as ANN, deep-learning neural networks can solve time series prediction problems better and achieve remarkable prediction performance (Lv et al., 2014; Tian and Pan, 2015), and they have been widely used in traffic research (Liu et al., 2019a). The Long Short-Term Memory (LSTM) neural network is one of the best models because it can learn time series with long time spans and automatically determine the optimal time lags for prediction (Hochreiter and Schmidhuber, 1997). It can address spatial and temporal dependence simultaneously and has been successfully used in many areas (Bao et al., 2019; Ma et al., 2015; Xu et al., 2018).
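To make the memory mechanism concrete, the following is a minimal sketch of one time step of an LSTM cell in its commonly used form with a forget gate, written for a single hidden unit in plain Python. It is an illustration only, not the network used in this paper: the function names (`lstm_step`, `run_sequence`), the `params` layout, and the gate names are assumptions introduced here.

```python
import math


def sigmoid(z):
    """Logistic function used by the three LSTM gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """Advance a single-unit LSTM by one time step.

    x      -- list of input features at this step (hypothetical layout)
    h_prev -- previous hidden state (float)
    c_prev -- previous cell state, i.e. the long-term memory (float)
    params -- dict: gate name -> (input weights, recurrent weight, bias)
    """
    def pre_activation(gate):
        w_x, w_h, b = params[gate]
        return sum(w * xi for w, xi in zip(w_x, x)) + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # share of old memory to keep
    i = sigmoid(pre_activation("input"))        # share of new content to write
    o = sigmoid(pre_activation("output"))       # share of memory to expose
    g = math.tanh(pre_activation("candidate"))  # proposed new memory content

    c = f * c_prev + i * g   # cell state mixes old memory with new content
    h = o * math.tanh(c)     # hidden state is a gated view of the cell state
    return h, c


def run_sequence(sequence, params):
    """Feed a whole sequence through the cell, starting from zero states."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h, c
```

When the forget gate is close to 1 and the input gate close to 0, the cell state passes through a step almost unchanged, which is what lets information persist across long time spans. In practice the weights are learned from data and a library implementation with many hidden units would be used.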
Near-road traffic pollutants show high spatial and temporal variability, which implies that fixed observation points struggle to capture actual traffic pollution concentrations across space and time. CO is one of the main pollutants released by vehicles (Linna Sengkey et al., 2011), yet few studies have focused on predicting CO at road intersections. To address this research gap, portable equipment was used in this study to measure CO concentrations. Furthermore, a hybrid model combining the Random Forest and Long Short-Term Memory networks is proposed to predict the CO concentration at a road intersection. The results can provide insightful information for decision-makers seeking to decrease pedestrian exposure to vehicle pollutants by adopting real-time traffic control measures. The main contributions of this paper are summarized as follows: (1) a hybrid framework based on LSTM is developed to predict CO concentrations at the road intersection; (2) the noise in the time series data is removed by data preprocessing, and the random forest is used to rank the importance of traffic flow in different directions.
The remainder of this paper is organized as follows. In Section 2, the field test and data collection are introduced, and the components of the proposed hybrid framework are summarized, including data preprocessing, random forest, and the LSTM neural network. The results of the hybrid framework based on LSTM and the conclusion of this paper are discussed in Section 3.
Section snippets
Field test and data collection
Field experiments (31.900402° N, 118.836858° E) were conducted at the intersection of Shuanglong Ave. and Jiyin Ave. (Fig. 1(a)). Shuanglong Ave. is a five-lane road (both directions) with a total width of 39 m. Jiyin Ave. is an arterial in the Jiangning district with six lanes in total. The traffic volume and population density are relatively high because the intersection is near Jiulonghu Campus railway station and Tongren Hospital. There are no buildings or barriers around the intersection.
Data processing results
The procedure described above was applied to the data from the three observation sites, and the data preprocessing results are shown in Fig. 5. The detailed procedure of outlier detection and replacement is as follows:
- (1) The primary step was to determine appropriate values of eps and MinPts for the DBSCAN algorithm. MinPts was determined by experience. Eps was determined from the k-distance graph, which ranks the distances from each observation to its k-th (k = MinPts) nearest neighbor. The appropriate value of eps
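The k-distance graph mentioned in step (1) can be sketched as follows for a one-dimensional series of readings. This is a minimal illustration under stated assumptions, not the authors' code: the function name `k_distance_curve` is hypothetical, distance is taken as the absolute difference between readings, and the k-th neighbor is counted excluding the observation itself.

```python
def k_distance_curve(values, k):
    """Distance from each observation to its k-th nearest neighbor,
    sorted in descending order (the usual layout of a k-distance graph).

    values -- list of scalar readings (assumed 1-D for simplicity)
    k      -- neighbor rank, here set equal to MinPts
    """
    if not 1 <= k < len(values):
        raise ValueError("k must be between 1 and len(values) - 1")
    kth = []
    for i, v in enumerate(values):
        # distances to every other observation, nearest first
        dists = sorted(abs(v - w) for j, w in enumerate(values) if j != i)
        kth.append(dists[k - 1])
    return sorted(kth, reverse=True)


# Illustrative readings only (not measured data): four close values and one spike.
curve = k_distance_curve([1.0, 1.1, 1.2, 1.3, 9.0], k=2)
print(curve)
```

Plotting the returned values against their rank gives the k-distance graph. Observations in dense regions produce small k-distances, while isolated readings produce large ones, so a candidate eps is commonly read off near the point where the curve bends sharply.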
Conclusion
In this study, a hybrid framework based on an LSTM neural network was developed to predict the CO concentration at the intersection. To improve prediction accuracy, data preprocessing was conducted first to remove outliers and reduce the influence of unavoidable noise on prediction. The status of vehicles at the intersection was also taken into consideration. Moreover, the random forest was applied to rank the importance of these variables. Finally, the variables are added
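The idea of ranking variables by importance with a tree ensemble can be illustrated with a deliberately small stand-in. The sketch below is not a full random forest and not the authors' implementation: it bags depth-one regression trees (stumps), lets each stump choose among a random subset of features, and credits the chosen feature with the reduction in squared error it achieves. All names (`stump_forest_importance`, `n_candidates`, and so on) are assumptions introduced for illustration.

```python
import random


def sse(ys):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_stump_gain(X, y, feature):
    """Largest drop in squared error from one threshold split on `feature`."""
    parent = sse(y)
    best = 0.0
    # Drop the largest value so the right-hand side is never empty.
    thresholds = sorted({row[feature] for row in X})[:-1]
    for t in thresholds:
        left = [yi for row, yi in zip(X, y) if row[feature] <= t]
        right = [yi for row, yi in zip(X, y) if row[feature] > t]
        best = max(best, parent - sse(left) - sse(right))
    return best


def stump_forest_importance(X, y, n_trees=50, n_candidates=2, seed=0):
    """Normalized importance per feature from a bagged ensemble of stumps."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    totals = [0.0] * p
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        Xb = [X[r] for r in rows]
        yb = [y[r] for r in rows]
        # Random feature subset, as in random forests.
        candidates = rng.sample(range(p), min(n_candidates, p))
        gains = {f: best_stump_gain(Xb, yb, f) for f in candidates}
        winner = max(gains, key=gains.get)
        totals[winner] += gains[winner]
    total = sum(totals)
    return [t / total for t in totals] if total > 0 else totals
```

Given a feature matrix whose columns are, for example, traffic flows in different directions and a target vector of CO readings, the returned list can be sorted to obtain a ranking. For real work, a library implementation such as scikit-learn's `RandomForestRegressor` with its `feature_importances_` attribute would normally replace this sketch.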
Data statement
The data used in this paper are available from the corresponding author upon request.
CRediT authorship contribution statement
Yuxuan Wang: Conceptualization, Data curation, Formal analysis, Writing - original draft. Pan Liu: Methodology, Writing - review & editing. Chengcheng Xu: Conceptualization, Methodology, Validation. Chang Peng: Data curation, Validation. Jiaming Wu: Writing - review & editing.
Declaration of competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This work was supported by the National Key Research and Development Program of China (No. 2018YFB1600900 and SQ2018YFGH000413), Natural Science Foundation of Jiangsu Province (BK20171358) and Fundamental Research Funds for the Central Universities. The authors would like to thank the editor and the reviewers for their constructive comments and valuable suggestions to improve the quality of this article.
References (49)
- et al. On the impact of outlier filtering on the electricity price forecasting accuracy. Appl. Energy (2019)
- et al. A spatiotemporal deep learning approach for citywide short-term crash risk prediction with multi-source data. Accid. Anal. Prev. (2019)
- et al. Prediction of hourly air pollutant concentrations near urban arterials using artificial neural network approach. Transport. Res. Transport Environ. (2009)
- et al. Modeling of free swelling index based on variable importance measurements of parent coal properties by random forest method. Measurement (2016)
- et al. Complex time series analysis of PM10 and PM2.5 for a coastal site using artificial neural network modelling and k-means clustering. Atmos. Environ. (2014)
- et al. Random forests for big data. Big Data Res. (2017)
- Traffic flow pattern and meteorology at two distinct urban junctions with impacts on air quality. Atmos. Environ. (2011)
- et al. A review of deterministic, stochastic and hybrid vehicular exhaust emission models. Int. J. Transport Manag. (2004)
- et al. Distance and density based clustering algorithm using Gaussian kernel. Expert Syst. Appl. (2017)
- et al. Prediction of particulate matter at street level using artificial neural networks coupling with chaotic particle swarm optimization algorithm. Build. Environ. (2014)
- Improving the accuracy of rainfall rates from optical satellite sensors with machine learning - a random forests-based approach applied to MSG SEVIRI. Rem. Sens. Environ.
- Short-term prediction of safety and operation impacts of lane changes in oscillations with empirical vehicle trajectories. Accid. Anal. Prev.
- Prediction of outdoor PM2.5 concentrations based on a three-stage hybrid neural network model. Atmos. Pollut. Res.
- A land use regression application into assessing spatial variation of intra-urban fine particulate matter (PM2.5) and nitrogen dioxide (NO2) concentrations in City of Shanghai, China. Sci. Total Environ.
- DeepPF: a deep learning based architecture for metro passenger flow prediction. Transport. Res. C Emerg. Technol.
- A tailored machine learning approach for urban transport network flow estimation. Transport. Res. C Emerg. Technol.
- Long short-term memory neural network for traffic speed prediction using remote microwave sensor data. Transport. Res. C Emerg. Technol.
- Application of decomposition-ensemble learning paradigm with phase space reconstruction for day-ahead PM2.5 concentration forecasting. J. Environ. Manag.
- An empirical comparison of alternative schemes for combining electricity spot price forecasts. Energy Econ.
- The effects of handling outliers on the performance of bankruptcy prediction models. Soc. Econ. Plann. Sci.
- Estimation of real-driving emissions for buses fueled with liquefied natural gas based on gradient boosted regression trees. Sci. Total Environ.
- Exploring process data. J. Process Contr.
- Linear and nonlinear modeling approaches for urban air quality prediction. Sci. Total Environ.
- Fine-scale estimation of carbon monoxide and fine particulate matter concentrations in proximity to a road intersection by using wavelet neural network with genetic algorithm. Atmos. Environ.
Peer review under responsibility of Turkish National Committee for Air Pollution Research and Control.