Abstract
This study proposes a machine learning-based classification method to predict mode choice in response to potential transit promotion strategies in a sprawling region. The method consists of a machine learning classifier, a genetic feature selection process, and a statistical analysis process. The Random Undersampling Boosting algorithm is adopted to handle class imbalance in the sampled data. A genetic algorithm is applied to optimize the combination of independent variables based on the principle of maximum relevance and minimum redundancy. Data from the 2017 National Household Travel Survey and its add-on samples for the Houston metropolitan statistical area in Texas, USA, were used to build the mode choice classifier, which achieves 99.22% classification accuracy for the auto mode and 98.90% for the transit mode. Based on a comprehensive study of commuters' trip characteristics and the socio-demographics of the study region, bus transit network expansion and bus rapid transit strategies are proposed to shift trips from the predominant single-occupancy vehicle mode to public transit. Results show that bus rapid transit, by providing higher trip speeds for medium- and long-distance commuters, can significantly increase transit mode share by 8.24% and 8.95%, respectively. When bus rapid transit is available to all medium- and long-distance commuters, the total mode shift can increase to 15.96% in the study region. The walking distance to the nearest transit access point is linearly associated with the mode shift to transit; up to 2.4% of current auto trips shift to transit for commuters within a 5-min walking distance in the urban area.
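To illustrate why class imbalance matters for a mode choice dataset dominated by auto trips, the following is a minimal sketch of the random undersampling step on its own. It is not the authors' implementation and omits the boosting stage entirely; the function name `random_undersample`, the toy feature values, and the 95/5 class split are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Return a class-balanced subset by randomly dropping majority-class rows.

    Every class is reduced to the size of the smallest class. This is only the
    resampling step; a boosting method would repeat it in each boosting round.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    # The smallest class sets the number of rows kept per class.
    target = min(len(indices) for indices in by_class.values())

    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, target))
    kept.sort()  # preserve the original row order

    return [rows[i] for i in kept], [labels[i] for i in kept]


# Hypothetical toy data: 95 auto trips and 5 transit trips, with made-up
# (trip distance in miles, walk time to transit in minutes) features.
rows = [(float(i % 30 + 1), float(i % 12 + 1)) for i in range(100)]
labels = ["auto"] * 95 + ["transit"] * 5

balanced_rows, balanced_labels = random_undersample(rows, labels, seed=42)
print(Counter(balanced_labels))
```

After resampling, both classes contribute the same number of rows, so a classifier trained on the subset cannot reach high accuracy simply by always predicting the auto mode.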
Author information
Contributions
The authors confirm contribution to the paper as follows: study conception and design: QL, AM; data collection: QL, AM; analysis and interpretation of results: QL, AH; draft manuscript preparation: QL, AH, FQ, AM. All authors reviewed the results and approved the final version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Disclaimer
The opinions, findings, conclusions, and recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the authors' agencies and institutions.
About this article
Cite this article
Li, Q., Huerta, A.R., Mao, A.C. et al. Using Random Undersampling Boosting Classifier to Estimate Mode Shift Response to Bus Local Network Expansion and Bus Rapid Transit Services. Int J Civ Eng 19, 1127–1141 (2021). https://doi.org/10.1007/s40999-021-00635-7
DOI: https://doi.org/10.1007/s40999-021-00635-7