Abstract
Land use regression (LUR) models are widely used in air pollution modeling. This regression-based approach estimates ambient pollutant concentrations at unsampled points of interest from the relationship between measured concentrations and predictor variables drawn from the surrounding environment. Although conceptually simple, successful implementation requires detailed knowledge of the study area and expertise in GIS, statistics, and programming, which makes the approach relatively inaccessible to novice users. In this contribution, we present PyLUR, a software package for LUR modeling and pollution mapping. It is written in Python on top of the GDAL/OGR libraries and can build a LUR model and generate pollutant concentration maps efficiently. The software comprises four modules: a potential predictor variable generation module, a regression modeling module, a model validation module, and a prediction and mapping module. The performance of PyLUR is compared with that of RLUR, an existing LUR modeling package with similar functions implemented in R, in terms of model accuracy, processing efficiency, and software stability. The results show that PyLUR outperforms RLUR in the Bradford and Auckland case studies examined. Furthermore, PyLUR processes data much more efficiently and can handle detailed GIS input data.
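The core LUR idea described in the abstract can be illustrated with a minimal sketch. It is not PyLUR's implementation: the buffer-count predictor, the single-variable least-squares fit, the function names, and all coordinates and concentrations below are hypothetical and chosen only to keep the example self-contained. A real workflow would derive many candidate predictors from GIS layers and select among them.

```python
import math

def buffer_count(site, sources, radius):
    """Count source points within `radius` of `site` (planar coordinates)."""
    sx, sy = site
    return sum(1 for (x, y) in sources if math.hypot(x - sx, y - sy) <= radius)

def fit_simple_ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical emission-source points and monitoring sites (arbitrary units).
sources = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (10, 10)]
sites = [(0.5, 0.5), (5.5, 5.0), (10.0, 10.0), (20.0, 20.0)]
observed = [35.0, 25.0, 15.0, 5.0]  # hypothetical measured concentrations

# Step 1: generate the predictor at each monitoring site.
radius = 1.5
predictor = [buffer_count(s, sources, radius) for s in sites]

# Step 2: fit the regression on the monitored sites.
intercept, slope = fit_simple_ols(predictor, observed)

# Step 3: estimate the concentration at an unsampled point.
new_point = (0.2, 0.2)
estimate = intercept + slope * buffer_count(new_point, sources, radius)
print(predictor, intercept, slope, estimate)
```

The same three steps (predictor generation, model fitting, prediction at unsampled locations) would apply with several predictors and a multiple regression; only the fitting routine would change.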
Acknowledgements
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. The development of PyLUR was inspired by the open-source software RLUR. The authors thank Prof. John Gulliver and Dr. David W. Morley for making study materials on LUR modeling publicly accessible on the internet. The authors also thank Abley for providing reorganized traffic volume data for Auckland.
Additional information
Highlights
• PyLUR comprises four modules for developing and applying a LUR model.
• It considers both conventional and novel potential predictor variables.
• GDAL/OGR libraries are used for spatial analysis in modeling and prediction.
• Developed on the Python platform, PyLUR processes data efficiently.
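As an illustration of what a validation step for such a model might look like, the sketch below applies leave-one-out cross-validation (LOOCV) to a one-predictor linear model. The choice of LOOCV, the function names, and the sample data are assumptions for illustration only; this section does not state which validation scheme PyLUR uses.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def loocv_r2(xs, ys):
    """Hold out each site in turn, refit on the rest, and predict it.

    Returns the cross-validated R^2 and the list of held-out predictions.
    """
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        preds.append(a + b * xs[i])
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot, preds

# Hypothetical predictor values and measured concentrations at five sites.
x_sites = [0.0, 1.0, 2.0, 3.0, 4.0]
y_sites = [5.0, 16.0, 24.0, 36.0, 44.0]

cv_r2, cv_preds = loocv_r2(x_sites, y_sites)
print(round(cv_r2, 3))
```

A cross-validated R² well below the in-sample R² would suggest the model is overfitted to the monitoring sites.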
About this article
Cite this article
Ma, X., Longley, I., Salmond, J. et al. PyLUR: Efficient software for land use regression modeling the spatial distribution of air pollutants using GDAL/OGR library in Python. Front. Environ. Sci. Eng. 14, 44 (2020). https://doi.org/10.1007/s11783-020-1221-5
DOI: https://doi.org/10.1007/s11783-020-1221-5