Robust empirical Bayes approach for Markov chain modeling of air pollution index

  • Research article
Journal of Environmental Health Science and Engineering

Abstract

Air pollution is a matter of public concern, especially for people living in urban and industrial areas. Markov chain modeling is often used to describe the underlying dynamics of air pollution through the transition probabilities of moving from one air pollution state to another. Estimating the transition probability matrix from air pollution index (API) data is therefore an essential step in the modeling. However, many zero probabilities may appear in the estimated transition probability matrix, especially when the sample is small, which makes interpreting the results with respect to climate conditions less realistic. This study proposes a robust empirical Bayes method that incorporates a way of smoothing the zero frequencies in the count matrix, leading to an improved estimate of the transition probability matrix. The robustness of the empirical Bayesian estimation is investigated based on Bayes risk. The transition probability matrices estimated with the robust empirical Bayes method from hourly API data, collected at seven monitoring stations in Malaysia from 2012 to 2014, are used to determine air pollution characteristics such as the mean residence time, the steady-state probability and the mean recurrence time. The proposed method is also evaluated by Monte Carlo simulations. The results suggest that it is quite effective in producing non-zero transition probability estimates and that it is superior to the maximum likelihood method in terms of minimizing the mean squared error for both individual and entire transition probabilities. The robust empirical Bayes method therefore proves to be an improved approach to estimating the Markov chain. When applied to API data, it could provide important information on air pollution dynamics that may help guide the development of proper strategies for managing the impact of air quality.
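The quantities named in the abstract can be illustrated with a small sketch. The code below is not the authors' robust empirical Bayes estimator: as a stand-in, it smooths the count matrix by adding a constant pseudo-count `alpha` to every cell (the posterior mean under a symmetric Dirichlet prior), which is one simple way to remove zero frequencies. The state sequence, the value `alpha=0.5` and all function names are hypothetical, and the residence and recurrence times use the standard textbook formulas 1/(1 - p_ii) and 1/pi_i.

```python
import numpy as np

def count_transitions(states, n_states):
    """Count one-step transitions i -> j in an integer-coded state sequence."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    return counts

def mle_matrix(counts):
    """Maximum likelihood estimate: each row of counts divided by its row total."""
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed transitions are left as zeros.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

def smoothed_matrix(counts, alpha=0.5):
    """Additive smoothing (assumed stand-in, NOT the paper's estimator)."""
    padded = counts + alpha
    return padded / padded.sum(axis=1, keepdims=True)

def steady_state(P):
    """Solve pi P = pi with sum(pi) = 1 by least squares."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def mean_residence_time(P):
    """Expected number of consecutive steps spent in each state: 1 / (1 - p_ii)."""
    return 1.0 / (1.0 - np.diag(P))

def mean_recurrence_time(pi):
    """Expected number of steps between returns to each state: 1 / pi_i."""
    return 1.0 / pi

# Hypothetical short sequence of API states coded 0, 1, 2.
states = [0, 0, 1, 0, 0, 1, 2, 1, 0, 0]
counts = count_transitions(states, n_states=3)
P_mle = mle_matrix(counts)          # contains zero entries for unseen transitions
P_smooth = smoothed_matrix(counts)  # every entry is strictly positive
pi = steady_state(P_smooth)

print("MLE:\n", P_mle)
print("Smoothed:\n", P_smooth)
print("Steady state:", pi)
print("Mean residence time:", mean_residence_time(P_smooth))
print("Mean recurrence time:", mean_recurrence_time(pi))
```

With a sequence this short, the maximum likelihood matrix assigns probability zero to every transition that was never observed, while the smoothed matrix keeps all transitions possible, so the steady-state distribution and recurrence times are well defined for every state.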



Acknowledgments

The authors are grateful to the Department of Environment Malaysia for its cooperation in providing the air pollution data. The authors are also indebted to the Institute of Climate Change (IPI) and the Earth Observation Center (EOC) at Universiti Kebangsaan Malaysia for providing the shapefiles, and thank Universiti Kebangsaan Malaysia for providing financial support under the grant GUP-2018-061.

Author information

Corresponding author

Correspondence to Yousif Alyousifi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

ESM 1 (DOCX 411 kb)

About this article

Cite this article

Alyousifi, Y., Ibrahim, K., Kang, W. et al. Robust empirical Bayes approach for Markov chain modeling of air pollution index. J Environ Health Sci Engineer 19, 343–356 (2021). https://doi.org/10.1007/s40201-020-00607-4

  • DOI: https://doi.org/10.1007/s40201-020-00607-4
