Difference Equation Model-Based PM2.5 Prediction considering the Spatiotemporal Propagation: A Case Study of Bohai Rim Region, China

Lei, Ceyu; Han, Xiaoling; Gao, Chenghua

doi:https://doi.org/10.1155/2021/6614950

Discrete Dynamics in Nature and Society

Research Article | Open Access

Volume 2021 | Article ID 6614950 | https://doi.org/10.1155/2021/6614950

Ceyu Lei,¹ Xiaoling Han,¹ and Chenghua Gao¹

Academic Editor: Rigoberto Medina

Received 15 Dec 2020

Revised 18 Feb 2021

Accepted 19 Mar 2021

Published 31 Mar 2021

Abstract

Accurate reporting and prediction of PM2.5 concentration are very important for improving public health. In this article, we use a spectral clustering algorithm to cluster 44 cities in the Bohai Rim Region. On this basis, we propose a special difference equation model that uses nonlinear diffusion equations to characterize the temporal and spatial dynamics of PM2.5 propagation between and within clusters for real-time prediction. Analysis of PM2.5 concentration data for 92 consecutive days in the Bohai Rim Region shows that, depending on the accuracy definition used, the average prediction accuracy of the difference equation model across all city clusters is 97% or 90%. The mean absolute error (MAE) of the forecast data for each urban agglomeration is within 7 units. The experimental results show that the difference equation model can effectively reduce the prediction time, improve the prediction accuracy, and provide decision support for local air pollution early warning and urban comprehensive management.

1. Introduction

PM2.5 refers to particulate matter in the atmosphere with a diameter less than or equal to 2.5 μm, also known as fine particulate matter. Although the content of PM2.5 in the atmosphere is low, it has a significant impact on air quality and visibility. Studies have shown that PM2.5 is a main cause of a variety of respiratory diseases [1]. Therefore, accurate prediction of PM2.5 not only helps monitor the effect of existing governance measures but also provides direction for the further development of air governance and can help people choose the best travel time.

There are many studies on PM2.5 prediction [2, 3], each approaching the problem from a different perspective. Among them, statistical methods and satellite remote sensing techniques are the most widely used. The statistical method is an empirical prediction method. Common statistical methods include linear regression models [4, 5], neural networks [3], and nonlinear regression models [6, 7]. Although statistical methods are convenient and simple to operate, they require a large amount of data to be collected in advance, and data processing is slow. Satellite remote sensing techniques [2] offer wide coverage and long observation periods, but the equipment cost is high, and they are not suitable for long-term prediction in a small area.

Due to the development of applied mathematics, the use of equation models to study the propagation laws and development trends of atmospheric pollutants such as PM2.5 has become an extremely important subject in biomathematics research. Many mature studies have appeared in recent years [8–10]. For example, Wang et al. [11] first established a partial differential equation model based on space-time dimensions to predict PM2.5, and then in 2020 they predicted PM2.5 based on a data-driven ordinary differential equation model [12]. However, these models are all differential equation models, and one of the most important assumptions of differential equation models is the continuity of time. In reality, the collected data are all discrete, so a differential equation model introduces certain errors. The difference equation is the discretization of the differential equation, so the difference equation model is also a powerful tool for predicting data. This article uses the difference equation model to predict and analyze PM2.5.

This work aims to explore large-scale (between urban areas) air pollution migration and make further predictions. Specifically, we build a specific difference equation model based on the network and clustering of 44 cities in the urban agglomeration of the Bohai Rim Region, combining local emissions and global diffusion, to describe the temporal and spatial dynamic propagation process of PM2.5 in the region. This model requires no large-scale calculations. At the same time, the simulation results show that the model not only has good predictive ability but can also provide policy insights to a certain extent, offering a more scientific theoretical basis for controlling air pollution in the Bohai Rim Region.

This article mainly uses multisource data to make short-term forecasts of PM2.5. The study uses data on the geographic distance between cities, wind direction, wind speed, and PM2.5 concentration. Figure 1 shows the framework of this research. The rest of the paper is organized as follows: Section 2 introduces the study area, clusters it according to geographic distance, wind direction, wind speed, and other conditions, and constructs the difference equation model of spatiotemporal propagation; Section 3 gives the PM2.5 prediction results and the error analysis of those results; Section 4 summarizes and discusses the article and gives some thoughts on future work.

2. Materials and Methods

2.1. Study Area

The Bohai Rim Region refers to the Bohai Rim coastal economic belt dominated by the Liaodong Peninsula, Shandong Peninsula, and Beijing-Tianjin-Hebei. This area accounts for about 13.31% of the country’s land area and 22.2% of the total population. At present, the concentration of PM2.5 is extremely high in densely populated urban agglomerations such as Beijing, Tianjin, and Hebei. Therefore, urban agglomerations in the Bohai Rim Region are facing serious PM2.5 pollution problems. Figure 2 shows the study area. The red markers on the map represent the 216 major air monitoring stations in the area, and the black markers represent all prefecture-level cities in the area.

2.2. Fine Particulate Matter (PM2.5) Data

The city clusters in the Bohai Rim Region in this study include 44 prefecture-level cities in 5 provinces and municipalities: Beijing, Tianjin, Liaoning, Hebei, and Shandong. The research data cover 92 days of PM2.5 concentration data, from July 1, 2020, to September 30, 2020. The average daily PM2.5 level of each cluster was calculated from the daily PM2.5 levels of all cities in the cluster. Specifically, we calculate the average PM2.5 concentration of each prefecture-level city from the data collected at the 216 monitoring stations and then calculate the average PM2.5 concentration of each city cluster from the concentrations of its prefecture-level cities. All research data come from the National Urban Air Quality Real-Time Release Platform of China Environmental Monitoring Station. The original PM2.5 concentrations of each city are normalized to a discrete level value 1, 2, …, 6 according to the Ambient Air Quality Standards (GB3095-1996) of China: concentrations are divided into the ranges 0–35, 36–75, 76–115, 116–150, 151–250, and greater than 250 μg/m³, and these ranges are assigned levels 1 to 6, corresponding to air quality that is good, mild, moderate, severe, highly severe, and seriously severe.
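The two preprocessing steps described above (averaging stations into cities and cities into clusters, then mapping a concentration to a level) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the dictionary layout are hypothetical, and values that fall between two published ranges (for example 35.5) are assumed to belong to the higher level.

```python
# Inclusive upper bounds of levels 1-5; anything above 250 is level 6.
LEVEL_UPPER_BOUNDS = [35, 75, 115, 150, 250]


def concentration_to_level(concentration):
    """Map a PM2.5 concentration to a discrete level 1..6."""
    for level, upper in enumerate(LEVEL_UPPER_BOUNDS, start=1):
        if concentration <= upper:
            return level
    return 6


def mean(values):
    """Arithmetic mean of a non-empty iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def cluster_daily_mean(station_readings_by_city):
    """Average stations within each city, then average the city means.

    `station_readings_by_city` maps a city name to the list of station
    readings for one day (a hypothetical layout chosen for this sketch).
    """
    city_means = [mean(readings) for readings in station_readings_by_city.values()]
    return mean(city_means)
```

Averaging city means rather than pooling all stations gives every city the same weight in its cluster regardless of how many stations it has, which matches the two-stage description in the text.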

2.3. Clustering and Embedding

In the study of regional transport of PM2.5, we divided the 44 cities in the Bohai Rim Region into four city clusters so that we could conveniently put forward a specific difference equation model to describe the transmission process of PM2.5 within and between clusters. The motif in Figure 3 reflects the movement of PM2.5 from the source to the target in the city network. We use this motif as the basic module of the complex network and use the high-order spectral clustering algorithm in [13] to divide the urban agglomeration in the Bohai Rim Region into four clusters, as shown in Figure 4. For related work on high-order spectral clustering, see [14].

As mentioned above, we cluster the 44 cities in the Bohai Rim Region into 4 disjoint sets through the high-order spectral clustering algorithm; we then order these sets in a meaningful way [11]. For a general clustering partition, the spatial arrangement of the clusters can be based on specific modeling goals and on social or geographical characteristics of the underlying network. In [15, 16], the level of democracy, diaspora size, international economic relations, and geographical proximity are used to order the clusters. In [17], friendship hops are used to define a distance metric, and each cluster is then embedded at a location on the $x$-axis, which is used as the social distance. For PM2.5, however, meteorological conditions are the most important factor affecting concentration. Therefore, in this study, we sorted these sets according to wind direction. From July to September, the prevailing wind direction in the Bohai Rim is southerly, so the four city clusters are projected from south to north on the $x$-axis of the Cartesian coordinate system, and their geographic locations are named in that order, as shown in Figure 4.
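The south-to-north embedding can be illustrated with a short sketch. Everything specific here is an assumption made for illustration: the cluster identifiers, the latitude values, the use of mean latitude as the sorting key, and the integer positions 1 to 4 are hypothetical and are not taken from the paper.

```python
def order_clusters_south_to_north(cluster_latitudes):
    """Return {cluster_id: x_position}, with x = 1 for the southernmost cluster.

    `cluster_latitudes` maps a cluster id to a representative latitude
    (degrees north). Integer positions are an assumption of this sketch.
    """
    ordered = sorted(cluster_latitudes, key=cluster_latitudes.get)
    return {cluster_id: x for x, cluster_id in enumerate(ordered, start=1)}


# Hypothetical latitudes, for illustration only (not data from the study).
example_latitudes = {"A": 39.5, "B": 36.2, "C": 41.3, "D": 37.8}
positions = order_clusters_south_to_north(example_latitudes)
```

With a southerly prevailing wind, increasing $x$ then corresponds to the downwind direction, which is what makes a one-dimensional transport term meaningful.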

2.4. Model

As shown in Figure 5, the pollution sources in each city cluster have a strong impact on that cluster (local emission), and different city clusters also influence each other through factors such as air flow (global transport). Therefore, this paper proposes a difference equation model with time and space factors to describe the dynamic propagation process of PM2.5. Within a city cluster, factories, cars, and other sources generate a large amount of PM2.5. The generation and dissipation of PM2.5 within the cluster can be regarded as local emission. When PM2.5 in a city cluster spreads to one or more other clusters through airflow and other factors, it can be regarded as global transport.

In the following, we propose a nonlinear difference equation-based model that abstracts PM2.5 transport into two processes: local emission and global transport (Figure 5). Local emission reflects the diffusion of PM2.5 within the cluster and the underlying network structure and is directly related to the cluster. Global transport is the spread of PM2.5 between clusters due to airflow and other factors, and it usually behaves more or less like a random walk. This approach will extend our analysis of difference equation modeling results.

The difference equation model combines a global transport term and a local term, together with an initial condition and boundary conditions. Its components are as follows:

(i) The unknown function represents the PM2.5 concentration in area $x$ at time $t$.

(ii) The time-difference term is the first-order forward difference of the concentration with respect to time $t$, i.e., the concentration at time $t+1$ minus the concentration at time $t$.

(iii) The transport term represents the regional transport (global transport) of PM2.5 between different clusters.
    (1) The transport coefficient describes the transport ability of the cluster at location $x$. Different city clusters have different transport capabilities, so a piecewise function of location is used; the value of each segment needs to be determined according to the actual situation.

(iv) The local term represents the spread process (local process) of PM2.5 within a cluster and involves two real parameters greater than 0. This mathematical expression has been used to describe and predict the dynamics of various populations, such as the growth of bacteria and tumors [18].
    (1) The growth rate depends on time in the local process. It depicts PM2.5 dissipation under external changing factors such as wind or certain other atmospheric conditions [19, 20]. Its functional form contains parameters whose optimal values are determined from the actual data we collected.
    (2) The carrying capacity of the system is the maximum possible volume of PM2.5 at a given location $x$.

(v) The initial function gives the PM2.5 concentration at the initial time, together with a constraint that the initial function must always satisfy.

(vi) The Neumann boundary condition means that no PM2.5 flows in or out at the two boundary positions of the spatial domain, i.e., at these positions the PM2.5 concentration inside the cluster and the concentration outside reach an equilibrium state.
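To make the structure of such a model concrete, the sketch below uses one assumed form that is consistent with the components listed above. The notation $u(x,t)$ (concentration), $d(x)$ (transport coefficient), $r(t)$ (growth rate), and $K$ (carrying capacity) is introduced here for illustration, and the discrete Laplacian and the logistic local term are assumptions rather than the authors' exact equation:

$$u(x,t+1) - u(x,t) = d(x)\,\bigl[u(x+1,t) - 2u(x,t) + u(x-1,t)\bigr] + r(t)\,u(x,t)\left(1 - \frac{u(x,t)}{K}\right)$$

The zero-flux (Neumann) boundary is imposed by mirroring the boundary value into a ghost cell.

```python
def step(u, d, r_t, K):
    """Advance cluster concentrations by one time step (assumed model form).

    u   -- list of concentrations, one per cluster, ordered along x
    d   -- list of transport coefficients, one per cluster (piecewise d(x))
    r_t -- growth rate at the current time step
    K   -- carrying capacity
    """
    n = len(u)
    new_u = []
    for x in range(n):
        # Ghost cells copy the boundary value, so the difference across
        # each boundary is zero (no PM2.5 flows in or out).
        left = u[x - 1] if x > 0 else u[x]
        right = u[x + 1] if x < n - 1 else u[x]
        transport = d[x] * (right - 2.0 * u[x] + left)  # global transport
        local = r_t * u[x] * (1.0 - u[x] / K)           # local logistic term
        new_u.append(u[x] + transport + local)
    return new_u
```

In this form a spatially uniform field is unaffected by the transport term, and with a constant transport coefficient and zero growth rate the total concentration across clusters is conserved, which is the behavior expected from a closed domain.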

2.5. Accuracy Definition

By comparing the predicted value calculated by the model with the actual observation value, the difference equation model can be continuously optimized. Two accuracy measures are defined in terms of the actual value and the predicted value of the PM2.5 concentration level.

(i) MAIA (Mean Absolute Increment Accuracy), which is proposed in this paper based on practical significance, is the mean, over the sample points in the test data set, of the absolute increment accuracy (AIA), which evaluates the absolute accuracy at each sample point. There are six concentration levels in total, from level one to level six, and AIA describes the absolute accuracy in terms of level length [11].

(ii) MRA (Mean Relative Accuracy) is the mean, over the sample points in the test data set, of the relative accuracy at each sample point.
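The two accuracy measures can be sketched as below. The per-point formulas are assumptions chosen to match the verbal definitions: the absolute increment accuracy is taken as one minus the absolute level error divided by the level span (5, the distance from level one to level six), and the relative accuracy as one minus the absolute error divided by the actual value. The paper's exact expressions may differ.

```python
def mean_absolute_increment_accuracy(actual_levels, predicted_levels, level_span=5):
    """Assumed MAIA: mean of 1 - |predicted - actual| / level_span."""
    scores = [
        1.0 - abs(p - a) / level_span
        for a, p in zip(actual_levels, predicted_levels)
    ]
    return sum(scores) / len(scores)


def mean_relative_accuracy(actual_levels, predicted_levels):
    """Assumed MRA: mean of 1 - |predicted - actual| / actual."""
    scores = [
        1.0 - abs(p - a) / a
        for a, p in zip(actual_levels, predicted_levels)
    ]
    return sum(scores) / len(scores)
```

Under these assumed definitions a one-level miss costs 0.2 in MAIA regardless of the level, while in MRA the same miss is penalized more heavily at low levels, which is one way to read why the two measures give different averages.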

2.6. Error Definition

The experimental results are evaluated using three criteria: the mean absolute error (MAE), the root mean square error (RMSE), and the mean absolute percentage error (MAPE). Here $y_i$ denotes the actual value of PM2.5 concentration, $\hat{y}_i$ is the predicted value of PM2.5 concentration, and $n$ is the number of sample points. The criteria are computed as follows:

(i) MAE (Mean Absolute Error) is the average of the absolute error:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

The smaller the value of MAE, the better the accuracy of the model; MAE directly reflects the actual size of the prediction error.

(ii) RMSE (Root Mean Square Error) measures the deviation between the predicted value and the actual value and is more sensitive to outliers:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

(iii) MAPE (Mean Absolute Percentage Error) measures the relative error between the predicted value and the actual value:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\%$$
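The three error criteria are standard and can be computed with a few lines of plain Python. The function names are illustrative, and MAPE is returned as a percentage, which is an assumption about the paper's convention.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * ratios / len(actual)
```

Because RMSE squares each error before averaging, a single large miss raises RMSE much more than MAE, so comparing the two gives a quick indication of whether errors are evenly spread or dominated by a few days.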

3. Results

To verify the real-time prediction performance of the difference equation model proposed in this paper, we use the actual concentration, concentration level, and absolute error of PM2.5 in the Bohai Rim Region for the 92 consecutive days from July 1, 2020, to September 30, 2020.

3.1. Prediction Accuracy

After collecting data from 44 cities in the Bohai Rim Region, we used the following prediction process. First, we normalized the data to reduce experimental errors and calculated the daily average PM2.5 concentration and the corresponding concentration level of each cluster. We used the first day’s data to construct the initial function and then used a three-day training data set to predict the concentration level on the following day. That is, we used days 1–3, 2–4, 3–5, and so on as training data, predicted the data on days 4, 5, 6, and so on accordingly, and recorded the prediction accuracy of all 4 clusters on each predicted day. Taking the forecast for day 4 as an example: the data from the first day are used to build the initial function. Next, we calculate the concentration change data from day 1 to day 3 and use them to estimate the parameters of the model with the lsqcurvefit function in Matlab. Finally, we use the obtained parameters to predict the data on day 4.
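The rolling three-day scheme described above can be sketched as a small generator that lists which days are used for training and which day is predicted. This only illustrates the windowing; the parameter fitting itself is done in the paper with Matlab's lsqcurvefit, and in Python a nonlinear least-squares routine such as `scipy.optimize.least_squares` could plausibly play that role. The function name and signature are hypothetical.

```python
def rolling_windows(n_days, train_len=3):
    """Yield (training_days, target_day) pairs using 1-based day numbers.

    With train_len=3 the pairs are ((1, 2, 3), 4), ((2, 3, 4), 5), ...
    """
    for start in range(1, n_days - train_len + 1):
        training_days = tuple(range(start, start + train_len))
        yield training_days, start + train_len
```

For the 92-day data set this yields 89 windows, so days 4 through 92 each receive one prediction and the parameters are re-estimated for every window.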

Figure 6 shows the predicted PM2.5 concentration levels in the four city clusters from July 1, 2020, to September 30, 2020. The figure shows that the difference equation model can effectively predict the PM2.5 concentration level: the prediction curve (represented by the blue line) is roughly the same as the actual curve (represented by the red line). Therefore, the difference equation model in this article provides an accurate estimate of the PM2.5 concentration level, and the predicted trend is basically consistent with the actual trend.

According to the definition of accuracy, we distinguish between mean relative accuracy (MRA) and mean absolute increment accuracy (MAIA). In this study, MAIA measures accuracy with respect to the PM2.5 concentration value range, while MRA measures accuracy with respect to the PM2.5 concentration value itself. The prediction accuracy of each city cluster over the 92 days from July 1, 2020, to September 30, 2020, is shown in Figures 7 and 8. Under the MAIA definition (Figure 7), most of the “+” marks lie above the horizontal line at 0.9, which means that the prediction accuracy on most days in each cluster is higher than 90%; the average accuracy of each of the 4 clusters is higher than 95%. The “+” marks in Figure 8 are not as dense as those in Figure 7. However, most of them lie above the horizontal line at 0.85, which shows that the relative accuracy of each cluster is higher than 80%. Taken together, Figures 7 and 8 show that both the MAIA and the MRA are high for all clusters.

3.2. Error Analysis

Figure 9 shows a histogram of the actual error for cluster 1 (left) and a line graph of the predicted and actual PM2.5 concentration for cluster 1 (right). As the line graph shows, the actual and predicted curves coincide closely. As the error histogram shows, the proportion of days with an error within 5 units is 46.94%, and the proportion of days with an error within 10 units is 84.78%. This indicates that the model has good predictive performance. The prediction line graphs and error histograms of the remaining clusters are shown in Figure 10. Figure 11 shows the prediction error histogram of the model, which supports the effectiveness and stability of the developed model. It can be seen from Figure 11 that the MAE, RMSE, and MAPE values of each city cluster are relatively small. In summary, the overall performance of the difference equation model is relatively good: it not only predicts the concentration of PM2.5 accurately but also gives stable results.
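The proportions read off the error histogram correspond to a simple computation: the share of days whose absolute prediction error does not exceed a threshold. A minimal sketch follows; the function name is hypothetical, and treating the threshold as inclusive is an assumption.

```python
def share_within(errors, threshold):
    """Fraction of days whose absolute error is at most `threshold`."""
    errors = list(errors)
    hits = sum(1 for e in errors if abs(e) <= threshold)
    return hits / len(errors)
```

Applied to the daily errors of one cluster with thresholds of 5 and 10 units, this gives the two kinds of proportions quoted in the text.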

4. Discussion

This paper adopts a PM2.5 prediction method that uses a difference equation model based on a spectral clustering algorithm. Based mainly on an analysis of weather conditions such as wind speed and wind direction, a spectral clustering algorithm is used to divide a region into several city clusters. By analyzing the relationship between the PM2.5 concentrations in these city clusters, a specific difference equation model is established to describe the global and local propagation process of PM2.5 and thereby predict the PM2.5 concentration of each cluster. Testing shows that the difference equation model based on the spectral clustering algorithm has high prediction accuracy and strong practical significance. The predicted PM2.5 values are basically the same as the actual observations. The method has practical value and can help people plan their travel, but it has certain defects that still need to be improved in application.

However, studies have shown that the PM2.5 concentration is also affected by weather factors (such as temperature, humidity, wind speed, and precipitation) and other pollutant indicators (such as CO, NO, and SO₂). Rainfall in particular has a huge impact on PM2.5 concentration. Therefore, the next step is to add more weather factors and other pollutant index data to improve the prediction accuracy of PM2.5 concentration.

Data Availability

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Authors’ Contributions

XH designed the study and carried out the analysis. XH and CL contributed to writing the paper. CG revised the wording of the article. CL performed numerical simulations. All authors read and approved the final manuscript.

Acknowledgments

This work was supported by the NNSF of China (No. 11561063), Natural Science Foundation of Gansu, China (No. 20JR10RA086), China Postdoctoral Science Foundation (2018M640232), and Natural Science Foundation of Tianjin, China (19JCQNJC14800).

References

[1] C. Wen, S. Liu, X. Yao et al., “A novel spatiotemporal convolutional long short-term neural network for air pollution prediction,” Science of the Total Environment, vol. 654, no. 1, pp. 1091–1099, 2019.
[2] Z. Ma, X. Hu, L. Huang et al., “Estimating ground-level PM2.5 in China using satellite remote sensing,” Environmental Science & Technology, vol. 48, no. 13, pp. 7436–7444, 2014.
[3] X. Mao, T. Shen, and X. Feng, “Prediction of hourly ground-level PM2.5 concentrations 3 days in advance using neural networks with satellite data in eastern China,” Atmospheric Pollution Research, vol. 8, no. 6, pp. 1005–1015, 2017.
[4] C. Li, N. C. Hsu, and S. C. Tsay, “A study on the potential applications of satellite data in air quality monitoring and forecasting,” Atmospheric Environment, vol. 45, no. 22, pp. 3663–3675, 2011.
[5] N. Benas, A. Beloconi, and N. Chrysoulakis, “Estimation of urban PM10 concentration, based on MODIS and MERIS/AATSR synergistic observations,” Atmospheric Environment, vol. 79, pp. 448–454, 2013.
[6] E. Emili, C. Popp, M. Petitta et al., “PM10 remote sensing from geostationary SEVIRI and polar-orbiting MODIS sensors over the complex terrain of the European Alpine region,” Remote Sensing of Environment, vol. 114, no. 11, pp. 2485–2499, 2010.
[7] J. Tian and D. Chen, “A semi-empirical model for predicting hourly ground-level fine particulate matter (PM2.5) concentration in southern Ontario from satellite remote sensing and ground-based meteorological measurements,” Remote Sensing of Environment, vol. 114, no. 2, pp. 221–229, 2010.
[8] Y. Wang, “Regional-level prediction model with advection PDE model and fine particulate matter (PM2.5) concentration data,” Physics, vol. 95, no. 3, 2020.
[9] Y. Wang, “Quantifying prediction and intervention measures for PM2.5 by a PDE model,” Journal of Cleaner Production, vol. 268, 2020.
[10] Y. Wang, H. Wang, and S. Zhang, “A weighted higher-order network analysis of fine particulate matter (PM2.5) transport in Yangtze River Delta,” Physica A, vol. 496, pp. 654–662, 2018.
[11] Y. Wang and H. Wang, “Prediction of daily PM2.5 concentration in China using partial differential equations,” PLoS One, vol. 13, no. 6, 2018.
[12] Y. Wang, “Prediction of daily PM2.5 concentration in China using data-driven ordinary differential equations,” Applied Mathematics and Computation, vol. 375, 2020.
[13] Y. Wang, H. Wang, S. Chang, and M. Liu, “Higher-order network analysis of fine particulate matter (PM2.5) transport in China at city level,” Scientific Reports, vol. 7, no. 1, 2017.
[14] A. Benson, D. Gleich, and J. Leskovec, “Higher-order organization of complex networks,” Science, vol. 353, no. 6295, pp. 163–166, 2016.
[15] K. H. Kwon, “Spatiotemporal diffusion modeling of global mobilization in social media: the case of 2011 Egyptian revolution,” International Journal of Communication, vol. 10, no. 1, 2016.
[16] K. H. Kwon, H. Wang, and W. W. Xu, “A Spatiotemporal Model of Twitter Information Diffusion: An Example of Egyptian Revolution,” in Proceedings of the 2015 International Conference on Social Media Society, New York, NY, USA, 2015.
[17] F. Wang, H. Wang, and K. Xu, “Diffusive logistic model towards predicting information diffusion in online social networks,” in Proceedings of the 2012 32nd International Conference on Distributed Computing Systems Workshops, pp. 133–139, Macau, China.
[18] L. Natr and J. D. Murray, “Mathematical biology. I. An introduction,” Photosynthetica, vol. 40, no. 3, p. 414, 2002.
[19] D. Chen, X. Liu, J. Lang et al., “Estimating the contribution of regional transport to PM2.5 air pollution in a rural area on the North China plain,” Science of The Total Environment, vol. 583, pp. 280–291, 2017.
[20] M. V. Nguyen, G. H. Park, and B. K. Lee, “Correlation analysis of size-resolved airborne particulate matter with classified meteorological conditions,” Meteorology and Atmospheric Physics, vol. 129, no. 1, pp. 35–46, 2017.

Copyright

Copyright © 2021 Ceyu Lei et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
