An adapted geographically weighted LASSO (Ada-GWL) model for predicting subway ridership

Transportation

Abstract

Ridership prediction at the station level plays a critical role in subway transportation planning. Among existing ridership prediction methods, the direct demand model has been recognized as an effective approach. However, direct demand models, including geographically weighted regression (GWR), have rarely been studied for local model selection in ridership prediction. In practice, understanding how multiple factors influence subway ridership from a local perspective is important for passenger flow management and for transportation planning that adapts to local conditions. In this study, we propose an adapted geographically weighted LASSO (Ada-GWL) framework for modelling subway ridership, which involves regression-coefficient shrinkage and local model selection. The framework takes the subway network layout into account and adopts a network-based distance metric instead of a Euclidean distance metric, which is what adapts it to the context of subway networks. The real-world case of the Shenzhen Metro is used to illustrate the proposed model. The results show that the proposed Ada-GWL model performs best in terms of estimation error and goodness-of-fit when compared with the global model (ordinary least squares), GWR, GWR calibrated with a network-based distance metric, and the geographically weighted LASSO (GWL). By revealing how each coefficient (elasticity) varies across space and which variables are selected at each station, the model supports more realistic conclusions based on local analysis. In addition, a clustering analysis of the stations according to their regression coefficients shows that the clusters’ functional characteristics are consistent with Shenzhen's functional land-use policy, indicating that the Ada-GWL model is highly interpretable from a spatial perspective. In other words, the regression coefficients of different stations provide a local perspective for understanding how the factors influence station ridership.
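To make the two ingredients named in the abstract more concrete (a network-based distance metric and a locally weighted LASSO fit per station), the following Python sketch may help. It is only an illustration under stated assumptions: the toy graph, the Gaussian kernel, the fixed bandwidth and penalty, and the coordinate-descent solver are all choices made here, not the authors' implementation, which is not shown in this preview.

```python
import heapq
import math


def network_distances(adj, source):
    """Shortest-path distance from `source` to every node (Dijkstra)."""
    dist = {node: math.inf for node in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def soft_threshold(z, t):
    """Shrink z towards zero by t (the LASSO proximal step)."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def weighted_lasso(X, y, w, lam, n_iter=200):
    """Minimise 0.5 * sum_i w_i (y_i - b0 - x_i.b)^2 + lam * sum_j |b_j|.

    Plain coordinate descent; the intercept b0 is not penalised.
    """
    n, p = len(X), len(X[0])
    b0, b = 0.0, [0.0] * p
    sw = sum(w)
    for _ in range(n_iter):
        b0 = sum(w[i] * (y[i] - sum(X[i][k] * b[k] for k in range(p)))
                 for i in range(n)) / sw
        for j in range(p):
            rho = z = 0.0
            for i in range(n):
                # residual with predictor j left out
                r = y[i] - b0 - sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += w[i] * X[i][j] * r
                z += w[i] * X[i][j] ** 2
            b[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return b0, b


def ada_gwl(adj, X, y, bandwidth, lam):
    """One locally weighted LASSO per station; rows of X follow list(adj)."""
    stations = list(adj)
    coefs = {}
    for s in stations:
        d = network_distances(adj, s)
        # Gaussian kernel on network distance (kernel choice is an assumption)
        w = [math.exp(-0.5 * (d[t] / bandwidth) ** 2) for t in stations]
        coefs[s] = weighted_lasso(X, y, w, lam)
    return coefs


if __name__ == "__main__":
    # Hypothetical 5-station line with unit-length links and made-up data.
    adj = {
        "A": [("B", 1.0)],
        "B": [("A", 1.0), ("C", 1.0)],
        "C": [("B", 1.0), ("D", 1.0)],
        "D": [("C", 1.0), ("E", 1.0)],
        "E": [("D", 1.0)],
    }
    X = [[1.0, 0.2], [2.0, 0.1], [3.0, 0.4], [4.0, 0.3], [5.0, 0.9]]
    y = [2.1, 3.9, 6.2, 8.1, 9.8]
    for station, (b0, b) in ada_gwl(adj, X, y, bandwidth=2.0, lam=0.5).items():
        print(station, round(b0, 3), [round(v, 3) for v in b])
```

Coefficients that the penalty shrinks exactly to zero at a given station correspond to variables dropped from that station's local model, which is the sense in which such a fit performs local model selection. In practice the bandwidth and penalty would be tuned (for example by cross-validation) rather than fixed as they are here.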

Notes

  1. Source: http://toursmaps.com/wp-content/uploads/2017/02/shenzhen_metro_map-1.gif.

  2. Source: http://www.sztj.gov.cn/nj2014/indexeh.htm.

  3. Source: http://www.urbanrail.net/as/cn/shen/shenzhen.htm.

  4. Source: http://www.worldpop.org.uk/data/get_data/.

  5. Source: https://map.baidu.com/.

  6. Source: https://maps.google.com.

  7. Average daily ridership over a whole week is the total ridership of the seven days of the week during operating hours (6:30–23:00) divided by 7. Rush-hour ridership is the total ridership of the evening rush hours (17:00–19:00) over a whole week divided by 14 h (2 h × 7 days). Non-rush-hour ridership is the total ridership of the remaining time (9:00–17:00 and 19:00–23:00), excluding the morning (7:00–9:00) and evening rush hours, over a whole week divided by 84 h (12 h × 7 days).

  8. Source: http://www.szgeoinfo.com:5001/msmap/flex/landuse.html.
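The three ridership measures defined in Note 7 can be sketched in Python as follows. The record layout (day index, minute of day, rider count) and the function name `weekly_metrics` are hypothetical; the actual structure of the AFC data is not described in this preview. Following the note literally, the 6:30–7:00 period counts towards the daily total but towards neither hourly rate.

```python
# Time windows from Note 7, expressed in minutes since midnight.
OPERATING = (6 * 60 + 30, 23 * 60)           # 6:30-23:00
EVENING_RUSH = (17 * 60, 19 * 60)            # 17:00-19:00
NON_RUSH = [(9 * 60, 17 * 60), (19 * 60, 23 * 60)]  # 9:00-17:00, 19:00-23:00


def weekly_metrics(records, n_days=7):
    """Compute the Note 7 measures from one week of records.

    `records` is an iterable of (day, minute_of_day, riders) tuples;
    this layout is an assumption made for illustration.
    """
    total = rush = non_rush = 0
    for _day, minute, riders in records:
        if not OPERATING[0] <= minute < OPERATING[1]:
            continue  # outside operating hours
        total += riders
        if EVENING_RUSH[0] <= minute < EVENING_RUSH[1]:
            rush += riders
        elif any(start <= minute < end for start, end in NON_RUSH):
            non_rush += riders
    return {
        "avg_daily": total / n_days,                 # riders per day
        "rush_per_hour": rush / (2 * n_days),        # divided by 14 h
        "non_rush_per_hour": non_rush / (12 * n_days),  # divided by 84 h
    }


if __name__ == "__main__":
    # Made-up week: the same four records repeated on each of 7 days.
    week = []
    for day in range(7):
        week += [(day, 17 * 60 + 30, 60), (day, 10 * 60, 120),
                 (day, 8 * 60, 30), (day, 5 * 60, 10)]
    print(weekly_metrics(week))
```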

References

  • Brunsdon, C., Fotheringham, A.S., Charlton, M.E.: Geographically weighted regression: a method for exploring spatial nonstationarity. Geogr. Anal. 28(4), 281–298 (1996)

  • Burnham, K.P., Anderson, D.R.: Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. Springer, Berlin (2002)

  • Cardozo, O.D., García-Palomares, J.C., Gutiérrez, J.: Application of geographically weighted regression to the direct forecasting of transit ridership at station-level. Appl. Geogr. 34, 548–558 (2012)

  • Cervero, R.: Alternative approaches to modeling the travel-demand impacts of smart growth. J. Am. Plan. Assoc. 72(3), 285–295 (2006)

  • Chan, S., Miranda-Moreno, L.: A station-level ridership model for the metro network in Montreal, Quebec. Can. J. Civ. Eng. 40(3), 254–262 (2013)

  • Choi, J., Lee, Y.J., Kim, T., Sohn, K.: An analysis of metro ridership at the station-to-station level in Seoul. Transportation 39(3), 705–722 (2012)

  • Chow, L.F., Zhao, F., Liu, X., Li, M.T., Ubaka, I.: Transit ridership model based on geographically weighted regression. Transp. Res. Rec. 1, 105–114 (2006)

  • Chu, X.: Ridership models at the stop level. National Center for Transit Research, University of South Florida, Tech. rep. (2004)

  • Cleveland, W.S.: Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc. 74(368), 829–836 (1979)

  • Cressie, N.A.C.: Statistics for Spatial Data. Wiley, New York (1993)

  • Csardi, G., Nepusz, T.: The igraph software package for complex network research. Int. J. Complex Syst. 1695, 1–9 (2006)

  • Deng, J., Xu, M.: Characteristics of subway station ridership with surrounding land use: a case study in Beijing. In: 2015 International Conference on Transportation Information and Safety (ICTIS). IEEE, pp. 330–336 (2015)

  • Nychka, D., Furrer, R., Paige, J., Sain, S.: fields: Tools for Spatial Data. R package version 9.6 (2018). https://CRAN.R-project.org/package=fields

  • Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Ann. Stat. 32(2), 407–499 (2004)

  • Environmental Systems Research Institute, Inc (ESRI): Arcgis (2014). http://desktop.arcgis.com/en/arcmap/

  • Erciyes, K.: Complex Networks: An Algorithmic Perspective. CRC Press, Inc, Boca Raton (2014)

  • Estupiñán, N., Rodríguez, D.A.: The relationship between urban form and station boardings for Bogota’s BRT. Transp. Res. Part A Policy Pract. 42(2), 296–306 (2008)

  • Gauraha, N.: Introduction to the LASSO. Resonance 23(4), 439–464 (2018)

  • Guerra, E., Cervero, R., Tischler, D.: Half-mile circle: does it best represent transit station catchments? Transp. Res. Rec. J. Transp. Res. Board 2276, 101–109 (2012)

  • Guo, L., Ma, Z., Zhang, L.: Comparison of bandwidth selection in application of geographically weighted regression: a case study. Can. J. For. Res. 38(9), 2526–2534 (2008)

  • Gutiérrez, J., García-Palomares, J.C.: Distance-measure impacts on the calculation of transport service areas using GIS. Environ. Plan. B Plan. Des. 35(3), 480–503 (2008)

  • Gutiérrez, J., Cardozo, O.D., García-Palomares, J.C.: Transit ridership forecasting at station level: an approach based on distance-decay weighted regression. J. Transp. Geogr. 19(6), 1081–1092 (2011)

  • Hu, Y., Miller, H.J., Li, X.: Detecting and analyzing mobility hotspots using surface networks. Trans. GIS 18(6), 911–935 (2014)

  • Hu, N., Legara, E.F., Lee, K.K., Hung, G.G., Monterola, C.: Impacts of land use and amenities on public transport use, urban planning and design. Land Use Policy 57, 356–367 (2016)

  • Hu, Y., Wang, F., Guin, C., Zhu, H.: A spatio-temporal kernel density estimation framework for predictive crime hotspot mapping and evaluation. Appl. Geogr. 99, 89–97 (2018)

  • Jun, M.J., Choi, K., Jeong, J.E., Kwon, K.H., Kim, H.J.: Land use characteristics of subway catchment areas and their influence on subway ridership in Seoul. J. Transp. Geogr. 48, 30–40 (2015)

  • Dovey, K., Pafka, E., Ristic, M. (eds.): Mapping Urbanities: Morphologies, Flows, Possibilities. Routledge, Abingdon (2017)

  • Kuby, M., Barranda, A., Upchurch, C.: Factors influencing light-rail station boardings in the United States. Transp. Res. Part A Policy Pract. 38(3), 223–247 (2004)

  • Li, J., Yao, M., Fu, Q.: Forecasting method for urban rail transit ridership at station level using back propagation neural network. Discrete Dyn. Nat. Soc. (2016). https://doi.org/10.1155/2016/9527584

  • Liu, C., Erdogan, S., Ma, T., Ducca, F.W.: How to increase rail ridership in Maryland: direct ridership models for policy guidance. J. Urban Plan. Dev. 142(4), 04016017 (2016)

  • Loo, B.P., Chen, C., Chan, E.T.: Rail-based transit-oriented development: lessons from New York City and Hong Kong. Landsc. Urban Plan. 97(3), 202–212 (2010)

  • Lu, B., Harris, P., Charlton, M., Brunsdon, C., Nakaya, T., Murakami, D., Gollini, I.: GWmodel: Geographically-Weighted Models. R package version 2.0-6 (2018). https://CRAN.R-project.org/package=GWmodel

  • Marshall, N., Grady, B.: Sketch transit modeling based on 2000 census data. Transp. Res. Rec. J. Transp. Res. Board 1986, 182–189 (2006)

  • McNally, M.G.: The four step model (2000)

  • Moran, P.A.: Notes on continuous stochastic phenomena. Biometrika 37(1/2), 17–23 (1950)

  • Nakaya, T., Fotheringham, A.S., Brunsdon, C., Charlton, M.: Geographically weighted Poisson regression for disease association mapping. Stat. Med. 24(17), 2695–2717 (2005)

  • Pan, H., Li, J., Shen, Q., Shi, C.: What determines rail transit passenger volume? Implications for transit oriented development planning. Transp. Res. Part D Transp. Environ. 57, 52–63 (2017)

  • R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2009). http://www.R-project.org, ISBN 3-900051-07-0

  • Singhal, A., Kamga, C., Yazici, A.: Impact of weather on urban transit ridership. Transp. Res. Part A Policy Pract. 69, 379–391 (2014)

  • Sohn, K., Shim, H.: Factors generating boardings at metro stations in the Seoul metropolitan area. Cities 27(5), 358–368 (2010)

  • Sung, H., Oh, J.T.: Transit-oriented development in a high-density city: Identifying its association with transit ridership in Seoul, Korea. Cities 28(1), 70–82 (2011)

  • Taylor, B.D., Miller, D., Iseki, H., Fink, C.: Analyzing the determinants of transit ridership using a two-stage least squares regression on a national sample of urbanized areas (2003)

  • Thompson, G., Brown, J., Bhattacharya, T.: What really matters for increasing transit ridership: understanding the determinants of transit ridership demand in Broward County, Florida. Urban Stud. 49(15), 3327–3345 (2012)

  • Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B (Methodol.) 58(1), 267–288 (1996)

  • Hastie, T., Efron, B.: lars: Least Angle Regression, Lasso and Forward Stagewise. R package version 1.2 (2013). https://CRAN.R-project.org/package=lars

  • Walters, G., Cervero, R.: Forecasting Transit Demand in a Fast Growing Corridor: The Direct-Ridership Model Approach. Fehr & Peers Associates, Oakland (2003)

  • Wheeler, D.C.: Diagnostic tools and a remedial method for collinearity in geographically weighted regression. Environ. Plan. A 39(10), 2464–2481 (2007)

  • Wheeler, D.C.: Simultaneous coefficient penalization and model selection in geographically weighted regression: the geographically weighted lasso. Environ. Plan. A 41(3), 722–742 (2009)

  • Wheeler, D.: gwrr: Fits geographically weighted regression models with diagnostic tools. R package version 0.2-1 (2013). https://CRAN.R-project.org/package=gwrr

  • Wheeler, D., Tiefelsdorf, M.: Multicollinearity and correlation among local regression coefficients in geographically weighted regression. J. Geogr. Syst. 7(2), 161–187 (2005)

  • Zhang, D., Wang, X.C.: Transit ridership estimation with network kriging: a case study of Second Avenue Subway, NYC. J. Transp. Geogr. 41, 107–115 (2014)

  • Zhao, J., Deng, W., Song, Y., Zhu, Y.: What influences metro station ridership in China? Insights from Nanjing. Cities 35, 114–124 (2013)

Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (No. 71901188) and the Research Grants Council Theme-based Research Scheme (No. T32-101/15-R). The authors would like to thank Professor Jian Ma from Southwest Jiaotong University for providing the Shenzhen metro AFC data.

Author information

Contributions

YH: Original idea, Literature Search and Review, Data Collection and Analysis, Manuscript Writing. YZ: Modelling, Content planning, Data Analysis, Manuscript Editing. KLT: Content planning, Manuscript editing.

Corresponding author

Correspondence to Yang Zhao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Summary of literature review on direct demand models for ridership prediction

Related studies on direct demand models for ridership prediction are summarized as follows (Figs. 15 and 16).

Fig. 15 Summary of literature review on direct demand models for ridership prediction

Fig. 16 Summary of literature review on direct demand models for ridership prediction (contd) (Cardozo et al. 2012; Gutiérrez et al. 2011; Choi et al. 2012; Cervero 2006; Chu 2004; Walters and Cervero 2003; Kuby et al. 2004; Sohn and Shim 2010; Loo et al. 2010; Sung and Oh 2011; Thompson et al. 2012; Guerra et al. 2012; Zhao et al. 2013; Chan and Miranda-Moreno 2013; Singhal et al. 2014; Liu et al. 2016; Pan et al. 2017; Jun et al. 2015; Zhang and Wang 2014; Taylor et al. 2003; Estupiñán and Rodríguez 2008; Deng and Xu 2015; Li et al. 2016; Hu et al. 2016)

Definitions of stations’ identifiers

For convenience of representation and understanding, we use alphanumeric codes instead of Chinese characters to denote station names. Identifiers are defined according to the following rules: (1) a non-transfer station is denoted by 3 digits, where the first digit is the line number and the remaining 2 digits are the sequential number of the station on that line; (2) a transfer station is denoted by the character “t” followed by 3 digits, where the first 2 digits identify the two intersecting lines and the last digit is the sequential number of the intersection between those two lines. For example, “402” represents the 2nd station of Line 4, and “t131” represents the transfer station that is the first intersection of Lines 1 and 3. In this way, all 118 stations can be encoded by identifiers that literally contain the line and station information (Figs. 11 and 13).
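The two identifier rules can be expressed as a small parser. The following Python sketch is an illustration only; the names `StationId` and `parse_station_id` are hypothetical and do not come from the article.

```python
import re
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StationId:
    code: str
    lines: Tuple[int, ...]  # one line for ordinary stations, two for transfers
    index: int              # station number on the line, or intersection number
    is_transfer: bool


def parse_station_id(code: str) -> StationId:
    """Decode a station identifier according to the two rules above."""
    # Rule (2): "t" + two line digits + one intersection digit, e.g. "t131".
    m = re.fullmatch(r"t(\d)(\d)(\d)", code)
    if m:
        line_a, line_b, k = (int(g) for g in m.groups())
        return StationId(code, (line_a, line_b), k, True)
    # Rule (1): one line digit + two-digit station number, e.g. "402".
    m = re.fullmatch(r"(\d)(\d{2})", code)
    if m:
        return StationId(code, (int(m.group(1)),), int(m.group(2)), False)
    raise ValueError(f"not a valid station identifier: {code!r}")


if __name__ == "__main__":
    print(parse_station_id("402"))   # 2nd station of Line 4
    print(parse_station_id("t131"))  # first intersection of Lines 1 and 3
```

Because each line is encoded by a single digit, this scheme (and therefore the sketch) only covers networks with single-digit line numbers.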

Cite this article

He, Y., Zhao, Y. & Tsui, K.L. An adapted geographically weighted LASSO (Ada-GWL) model for predicting subway ridership. Transportation 48, 1185–1216 (2021). https://doi.org/10.1007/s11116-020-10091-2

  • DOI: https://doi.org/10.1007/s11116-020-10091-2
