GIS-based air quality modelling: spatial prediction of PM10 for Selangor State, Malaysia using machine learning algorithms

  • Research on Sustainable Developments for Environment Management
Environmental Science and Pollution Research

Abstract

Rapid urbanization has caused severe deterioration of air quality globally, leading to increased hospitalization and premature deaths. Accurate prediction of air quality is therefore crucial for mitigation planning that supports urban sustainability and resilience. Although some studies have predicted air pollutants such as particulate matter (PM) using machine learning algorithms (MLAs), there is a paucity of studies on spatial hazard assessment with respect to the air quality index (AQI). Incorporating PM in AQI studies is crucial because its small, easily inhalable particle size has adverse impacts on ecology, the environment, and human health. Accurate and timely prediction of the AQI can support adequate intervention in air quality management. This study therefore undertakes a spatial hazard assessment of the AQI using particulate matter with a diameter of 10 μm or less (PM10) in Selangor, Malaysia, by developing four machine learning models: eXtreme Gradient Boosting (XGBoost), random forest (RF), K-nearest neighbour (KNN), and Naive Bayes (NB). Spatially processed data such as NDVI, SAVI, BU, LST, Ws, slope, elevation, and road density were used for the modelling. The models were trained with 70% of the dataset, while 30% was used for cross-validation. XGBoost achieved the highest overall accuracy and precision (0.989 and 0.995), followed by random forest (0.989, 0.993), K-nearest neighbour (0.987, 0.984), and Naive Bayes (0.917, 0.922). The spatial air quality maps were generated by integrating the geographical information system (GIS) with the four MLAs, and they correlated with Malaysia’s air pollution index. The maps indicate that air quality in Selangor is satisfactory and poses no threat to health. Nevertheless, the two best-performing algorithms (XGBoost and RF) indicate that air quality is moderate over a high percentage of the area. The study concludes that successful air pollution management policies such as green infrastructure practice, improvement of energy efficiency, and restrictions on heavy-duty vehicles can be adopted in Selangor and other Southeast Asian cities to prevent deterioration of air quality in the future.
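
The evaluation protocol described in the abstract (a 70/30 split of the dataset, scored by overall accuracy and precision) can be illustrated with a minimal sketch. It is not the authors' pipeline: the samples are synthetic, the two feature values and the class labels are invented for illustration, and a plain K-nearest-neighbour classifier written with the Python standard library stands in for the four models. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def make_synthetic_samples(n, seed=0):
    """Return (features, label) pairs; the values are made up, not study data."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        label = rng.choice(["good", "moderate"])
        centre = 0.7 if label == "good" else 0.3  # hypothetical class centres
        features = (rng.gauss(centre, 0.1), rng.gauss(1.0 - centre, 0.1))
        samples.append((features, label))
    return samples


def split_train_test(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and split it into train and test parts."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, features, k=5):
    """Majority vote among the k training points closest in Euclidean distance."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy_and_precision(actual, predicted, positive):
    """Overall accuracy, and precision for the class named by `positive`."""
    pairs = list(zip(actual, predicted))
    correct = sum(a == p for a, p in pairs)
    true_pos = sum(a == positive and p == positive for a, p in pairs)
    pred_pos = sum(p == positive for _, p in pairs)
    accuracy = correct / len(pairs)
    precision = true_pos / pred_pos if pred_pos else 0.0
    return accuracy, precision


samples = make_synthetic_samples(200)
train, test = split_train_test(samples)  # 70% for training, 30% held out
actual = [label for _, label in test]
predicted = [knn_predict(train, features) for features, _ in test]
accuracy, precision = accuracy_and_precision(actual, predicted, positive="moderate")
print(f"train={len(train)} test={len(test)} "
      f"accuracy={accuracy:.3f} precision={precision:.3f}")
```

Precision here is computed for a single class treated as the positive class; the abstract does not say how precision was aggregated across classes, so that choice is an assumption of the sketch.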

Data availability

The data that support the findings of this study are available from the Department of Environment (DOE), Malaysia. Restrictions apply to the availability of these data, which were used under license for this study. Data are available from the first and/or corresponding author with the permission of the DOE.

Abbreviations

GIS: Geographical information system
ML: Machine learning
PMs: Particulate matters
PM2.5: Particulate matter finer than 2.5 μm
PM10: Particulate matter finer than 10 μm
OECD: Organization for Economic Cooperation and Development
XGBoost: Extreme Gradient Boosting machine
KNN: K-nearest neighbour
NB: Naive Bayes
RF: Random forest
CMAQ: Community multi-scale air quality
WRF: Weather Research and Forecasting
AQI: Air quality index
API: Air pollution index
ANN: Artificial neural network
DOE: Department of Environment
O3: Ozone
CO: Carbon monoxide
SO2: Sulfur dioxide
NO2: Nitrogen dioxide
BU: Built-up index
NDVI: Normalized Difference Vegetation Index
SAVI: Soil-adjusted vegetation index
LST: Land surface temperature
Ws: Wind speed
CARET: Classification And REgression Training
APIMS: Air pollution index of Malaysia
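
Two of the abbreviations listed here, NDVI and SAVI, are band-ratio indices with standard textbook definitions: NDVI = (NIR - Red) / (NIR + Red) and SAVI = (1 + L)(NIR - Red) / (NIR + Red + L). The sketch below computes both for a single pixel. The reflectance values and the soil adjustment factor L = 0.5 (a commonly used default) are assumptions for illustration; this page does not state which bands or parameters the study used, and the function names are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denominator = nir + red
    if denominator == 0:
        return 0.0  # index is undefined here; 0.0 is a placeholder choice
    return (nir - red) / denominator


def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted vegetation index; soil_factor is the L term (assumed 0.5)."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Hypothetical surface reflectance values for one vegetated pixel.
pixel_nir, pixel_red = 0.5, 0.1
print(f"NDVI={ndvi(pixel_nir, pixel_red):.3f} SAVI={savi(pixel_nir, pixel_red):.3f}")
```

With non-negative reflectances and a positive L, the SAVI denominator cannot be zero, which is why only `ndvi` guards against division by zero.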

Acknowledgements

The authors would like to thank the Department of Environment (DOE), Malaysia, for providing the PM10 data for this study. The contribution of Dr. Omar Althuwaynee is much appreciated.

Author information

Contributions

Abdulwaheed Tella: software; writing (original draft); methodology; conceptualization; visualization; writing (review and editing); investigation. Abdul-Lateef Balogun: conceptualization; visualization; supervision; writing (review and editing).

Corresponding author

Correspondence to Abdulwaheed Tella.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Responsible Editor: Marcus Schulz

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tella, A., Balogun, AL. GIS-based air quality modelling: spatial prediction of PM10 for Selangor State, Malaysia using machine learning algorithms. Environ Sci Pollut Res 29, 86109–86125 (2022). https://doi.org/10.1007/s11356-021-16150-0

  • DOI: https://doi.org/10.1007/s11356-021-16150-0
