Abstract

Queues forming behind a bus stop on an urban street are common, and a traffic bottleneck usually occurs around the bus stop area. Bus stop failure means that arriving buses cannot move into the bus stop because of limited capacity and have to wait for an available loading area. It is related to the transit operation level. Traditionally, the failure rate (FR), defined as the percentage of buses that arrive at the bus stop to find all loading areas occupied, is adopted in bus capacity analysis. However, the FR cannot quantitatively capture failure characteristics in terms of their dispersion and uncertainty over time. Therefore, in this paper, we propose a new index called the failure duration rate (FDR) to evaluate bus stop failure, which can characterize waiting time for traffic delay calculation and capacity drop estimation. Automatic vehicle location data at eight bus stops in Wujiang District, Suzhou, China, over 56 working days, are used to analyze the temporal characteristics of FR and FDR. We next examine the failed service duration characteristics during peak hours at the eight bus stops. Based on these analyses, we then propose a Distribution Fitting and Cumulative Distribution Correlation (DF-CDC) approach to explore the correlation between FDR and FR at the same cumulative distribution function levels and validate the bus stop failure performance using the cross-validation method. The analysis results reveal that (i) FR fluctuates more significantly than FDR, (ii) FDR is a more robust index than FR in describing the traffic characteristics incurred by bus stop failures, and (iii) FDR performs better in failure characteristics analysis than FR.

1. Introduction

A bus stop serving a large number of bus lines can experience a condition known as bus stop failure due to limited capacity and high passenger demand, which negatively affects the punctuality and reliability of bus service and also delays other traffic. The more frequently bus stop failure takes place, the lower the transit system level of service (LOS) is [1]. However, the irregular traffic flow characteristics associated with bus stop failure are difficult to capture and quantify because of their dispersion and uncertainty over time [2]. Without a doubt, bus stop failure significantly affects traffic characteristics at bus berths and adjacent lanes. Typically, the failure rate is used for analyzing and evaluating the influence of bus stop failure at bus berths.

A bus stop constitutes a potential bottleneck that interrupts traffic flow, which deteriorates the level of transit operational service [3]. Transit operation parameters of buses served at loading areas, including dwell time [4, 5], headway [6, 7], capacity [8, 9], queue length [10, 11], and bunching characteristics [12, 13], are analyzed for evaluating the impact of bus stop failures. Failed service increases bus waiting time for passenger boarding and alighting, and its impact can be measured by an index called the failure rate (FR), which is defined as the percentage of buses that arrive at stops to find all berths full [1]. Wang et al. analyzed the correlation between failure rate and four kinds of transit dwell and arrival characteristics and proposed a diffusion approximation method [14]. As one of the desired level indexes of transit operation, the FR can be used to assess changes in the capacity and LOS of bus berths. Failure probability and dwell time variability can be used to derive a function of bus queue length, which better reflects the effects on bus stop capacity [15]. A parameter “Z” associated with the desired failure rate (under the assumption of a standard normal distribution) was developed to account for the fluctuation of bus dwell time in bus loading areas, and design failure rates for urban and rural areas are recommended for estimating bus berth capacity [1]. The failure rate of curbside bus stops can be influenced by a series of factors (such as bus arrival distribution type, bus arrival rate, bus berth maximal service rate, and bus service time variation), and the normalized capacity and incremental change in capacity (for multiberth stops) at different failure rate levels have been proposed [8]. Moreover, the failure rate has been described as a poor proxy, and the average waiting delay has been suggested instead for a more comprehensive evaluation of bus berth LOS [16].

Vehicle location data are used for analyzing and predicting traffic flow characteristics and can be effectively applied to transit operational characteristics analysis [17–20], bus schedule optimization design [21, 22], bus lane planning and control strategy [23–25], and transport network flow estimation [26–28]. Several findings on the characteristics of public transit LOS at bus stops appear in the relevant literature. A data platform for monitoring patterns of bus operation has been developed, which is primarily composed of data acquired from the ITS system in Beijing, China. A multilevel framework for transit performance analysis has been proposed considering several transit operational factors [29]. Based on automated vehicle location (AVL) data, a number of statistical parameters of travel times have been analyzed for evaluating the performance of bus routes with transit priority facilities, and these tests indicate that spatial and temporal characteristics are the most potent features [30]. A regression method (using LS-SVM) has been developed for exploring bunching patterns of buses halting at the stop area, and the headway irregularity pattern has been analyzed using transit smart card data [31]. A probabilistic method that considers the interference between buses using the loading areas has been established for predicting bus travel time from trajectory and ID card data, and it reflects the dwell time distribution pattern of buses well [32].

Although several findings of bus stop failure analysis using the failure rate appear in the relevant literature, the duration time of bus stop failure is rarely mentioned. Additionally, little research has been observed using AVL data to analyze the characteristics of bus stop failure. In this paper, failure duration time is utilized for evaluating bus stop failure, and a measure called failure duration rate is proposed for failure analysis utilizing collected transit automated vehicle location (AVL) data. The characteristics analysis of bus stop failure using AVL data can provide valuable information for transit operation optimization to the public transit authority.

The remainder of this paper is organized as follows. In Section 2, a characteristic index called the failure duration rate (FDR) is developed for evaluating bus stop failure characteristics. Section 3 explores the failure duration characteristics at different failure levels based on the AVL data collected from eight bus stops in Wujiang District of Suzhou, China. In Section 4, a correlation analysis between FDR and FR is carried out using a “Distribution Fitting and Cumulative Distribution Correlation (DF-CDC)” analysis approach. Section 5 concludes the paper.

2. Characteristic Analysis Indexes

2.1. Failure Rate

A bus stop failure occurs when a bus arrives at the loading areas but finds no available berth. The failure rate (FR), defined as the percentage of buses queuing to move into bus berths occupied by other dwelling buses [14], could be formulated in terms of the number of buses halted at a bus stop and the number of bus berths.

In general, transit vehicles docking at a bus stop (including the served and waiting buses) obey the first-in-first-out rule and are usually independent of each other. The probability of the case without adequate berths at a bus stop can then be calculated.

When the berths at a bus stop are all occupied by buses for passenger boarding and alighting, the number of active buses served at the bus stop is equal to the number of berths. The probability of this kind of bus arrival can be approximated by a ratio: the number of times this condition occurs during a given observation period divided by the total number of arriving buses during the same period.

Then, we have

To facilitate our presentation, we denote the right-hand side of equation (4) by FR.

In other words, FR stands for the failure rate of a bus stop, which reflects the level of failure (LOF) for bus loading areas.
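As a rough illustration of the failure rate described in this subsection, the sketch below counts the arrivals that find every berth occupied and divides by the total number of arrivals. The function name `failure_rate`, the record layout (arrival at the stop area and departure from the stop, both in seconds), and the sample values are assumptions made for illustration and are not taken from the AVL dataset.

```python
def failure_rate(visits, berths):
    """Share of arriving buses that find all berths occupied.

    visits: list of (arrival_s, departure_s) tuples, where arrival_s is the time
    the bus reaches the stop area and departure_s the time it leaves the stop.
    """
    visits = sorted(visits)
    failed = 0
    for i, (arrival, _) in enumerate(visits):
        # Buses that arrived earlier and are still at the stop (served or waiting).
        present = sum(1 for _, d in visits[:i] if d > arrival)
        # Under first-in-first-out, `berths` or more buses present means no free berth.
        if present >= berths:
            failed += 1
    return failed / len(visits) if visits else 0.0


# Hypothetical records (seconds from the start of the observation period).
sample_visits = [(0, 40), (30, 70), (100, 130), (120, 160)]
fr_one_berth = failure_rate(sample_visits, berths=1)
fr_two_berths = failure_rate(sample_visits, berths=2)
```

With these hypothetical records, two of the four arrivals at a 1-berth stop find the berth occupied, while none fail when a second berth is available.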

2.2. Failure Duration Rate

When the berths of a bus stop are occupied, the next arriving bus needs to queue in the street lane and blocks the movement of other vehicles along the same lane. As a result, traffic delay goes up, and travel time reliability is reduced [1]. It is an important and challenging task to analyze these adverse effects quantitatively. In general, the longer a bus stop failure lasts, the more traffic efficiency deteriorates. It is worth pointing out that the severity of traffic deterioration, in terms of traffic delay and road capacity reduction, depends strongly on the traffic blocking duration.

Thus, the failure duration rate (FDR), which incorporates failure duration time, is proposed for analyzing the LOF of bus stops. The failure duration time can be measured by the timespan (the waiting time of buses outside the stop) for all failed bus stopping services during a given time period. Specifically, it can be calculated by examining the arrival and departure times of buses using the loading areas of a bus stop. Then, the failure duration rate can be formulated in terms of three quantities: the failure duration time (sec), the total occupancy time of the bus stop (sec), and the duration time for which the bus stop is vacant (sec).

The FDR can be interpreted as the ratio of waiting and blocking for arriving buses during a given time period. For a specific bus stop, the average duration time per failure is the failure duration time divided by the number of failures in the period.

Then, the average duration rate per failure (the average FDR per failure) is obtained from the FDR in the same way.
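One reading of the quantities in this subsection that is consistent with the verbal definitions is sketched below. The symbols $T_f$ (failure duration time), $T_o$ (total occupancy time), $T_v$ (vacant time), and $N_f$ (number of failures in the period) are notation assumed here for illustration, and the exact form of the original equations may differ.

```latex
\begin{align}
\mathrm{FDR} &= \frac{T_f}{T_o + T_v} \times 100\%, \\
\bar{T}_f &= \frac{T_f}{N_f}, \\
\overline{\mathrm{FDR}} &= \frac{\mathrm{FDR}}{N_f} = \frac{\bar{T}_f}{T_o + T_v} \times 100\%.
\end{align}
```

Under this reading, $T_o + T_v$ is simply the length of the observation period (3600 s for an hourly value), so the FDR is the share of that period during which at least one bus is kept waiting outside the stop, provided the waiting spans do not overlap.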

3. Characteristic Analysis

3.1. Data Collection

In this study, bus dwelling data are based on the AVL data provided by Wujiang Transit Agency in Suzhou, China. The AVL data span 56 consecutive working days from October 22, 2018, to January 9, 2019. The dataset of each day covers a half-day bus dispatching time window, from 7:00 to 19:00, and includes nine bus routes and eight bay-type bus stops (see their geographic locations in Figure 1). These eight test bus stops are located at considerable distances from intersections (200 m on average), and thus the interaction between the bus bays and nearby intersections is considered negligible.

The details of the nine bus routes associated with each bus stop are given in Table 1, and the headways of these transit routes range from 8 to 15 minutes.

The dwell times of buses serving these 9 routes at the 8 bus stops are extracted from the collected AVL data. Some records are provided in Table 2. Based on the arrival and departure times of buses boarding and alighting passengers at the third bus stop, we can determine the bus failure characteristics. For example, the bus with the ID of SU-EU9353 serving Route 710 departed from the bus stop at 16:14:28, while the bus with the ID of SU-EU6029 serving Route 741 arrived at the same stop at 16:15:07, and the bus arriving later needed to wait outside the stop for 13 seconds (failure duration time) until the Route 710 bus left the stop.
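The way a failure duration is read off consecutive arrival and departure records can be sketched as follows for a 1-berth stop. The helper names `to_seconds` and `failure_durations` and the clock times are hypothetical and are not the records of Table 2. The sketch also assumes that the recorded arrival is the moment the bus reaches the stop area, and that an hourly share is obtained by dividing by 3600 s.

```python
from datetime import datetime


def to_seconds(hms):
    """Convert an 'HH:MM:SS' clock string to seconds since midnight."""
    t = datetime.strptime(hms, "%H:%M:%S")
    return t.hour * 3600 + t.minute * 60 + t.second


def failure_durations(records):
    """Waiting time (s) of each bus at a 1-berth stop.

    records: (arrival, departure) clock strings sorted by arrival, where
    departure is the time the bus leaves the berth.
    """
    waits, prev_departure = [], None
    for arrival, departure in records:
        a = to_seconds(arrival)
        # A bus waits only while the previous bus is still in the berth.
        waits.append(max(0, prev_departure - a) if prev_departure is not None else 0)
        prev_departure = to_seconds(departure)
    return waits


# Hypothetical records for one hour at a single-berth stop.
records = [("08:00:00", "08:00:45"), ("08:00:30", "08:01:20"), ("08:05:00", "08:05:40")]
waits = failure_durations(records)   # per-bus failure duration in seconds
total_wait = sum(waits)              # failure duration time of the hour
hourly_share = total_wait / 3600     # assumed hourly ratio, as a fraction
```

In this hypothetical hour only the second bus is blocked, for the 15 seconds between its arrival and the departure of the bus ahead of it.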

3.2. Temporal Characteristics Analysis

The number and duration of bus stop failures (per hour) at the bus stops are determined based on the AVL data. We here use equations (5) and (6) to calculate the hourly FR and FDR of the eight bus stops. Figure 2 plots the hourly time-varying characteristics of FR and FDR over 56 working days at the No. 1 bus stop (672 hourly FR data and FDR data in total). It can be observed that FR and FDR in the morning (7:00–9:00) and evening peak hours (17:00–19:00) are higher than those in nonpeak hours. Overall, the mean value of FR is larger than that of FDR most of the time.

We then look at the median values of the hourly FR and FDR. As we can see in Figure 3, FR and FDR show highly similar patterns during peak hours for all eight bus stops. The reason might be that travel demand from both passengers and vehicles is high at peak hours, and thus bus stop failures occur more frequently, especially because passengers take a long time to board and alight.

3.3. Failure Duration Analysis

Figure 4 shows the relationship between FR and FDR. For each bus stop, 224 hourly data during the morning and evening peak hours (7:00 to 9:00 and 17:00 to 19:00) are considered.

From Figure 4, the hourly FDR shows a weak positive correlation with the FR for all eight bus stops. Overall, the fitting parameters between FR and FDR vary remarkably among bus stop sites. For example, the R-square is lowest (0.31) for no.1 bus stop and highest (0.661) for no.4 bus stop. In addition, the statistical relationship between FR and FDR at the 1-berth stops (no.1 through no.6 bus stops) is weaker than that at the two 2-berth stops (no.7 and no.8 bus stops).

As mentioned above, it is difficult to determine a well-fitted failure duration rate function (for the 8 bus stops) using the failure rate directly because of the dispersion. Taking no.1 bus stop as an example, FDR is increased from 13.3% to 33.0% as FR changes from 24% to 25%. But FDR increases from 16.7% to 43.3% as FR increases from 24% to 25%. This indicates significant dispersion. To decrease the dispersion of the failure duration rate in the analysis, the failure rate of bus stops is divided into sections with an interval length of 5%. The failure rate level is defined as the 5% section of failure rate at a bus stop (e.g., from 5% to 10%). Then, the mean, standard deviation (S.D.), and coefficient of variation (C.V.) of the failure duration rate for these 8 bus stops are clustered and calculated at each failure rate level. The mean, S.D., and C.V. of failure duration rates at different failure rate levels of these eight test bus stops are compared in Table 3. When the frequency of bus stop failure is less than 5 at a level, the failure duration rate is not calculated and is identified as “not available (N/A)” in our analysis. Because the FR span of these 8 bus stops differs, each bus stop has some “not available (N/A)” entries at the corresponding failure rate levels.
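The grouping step described above can be sketched as follows. The function name `fdr_stats_by_fr_level`, the choice to label each 5% section by its lower bound, the use of the sample standard deviation, and the sample pairs are all assumptions for illustration; the text does not specify these details.

```python
from statistics import mean, stdev


def fdr_stats_by_fr_level(pairs, width=0.05, min_count=5):
    """Group (fr, fdr) fractions into FR sections of the given width.

    Returns {section_lower_bound: (mean, sd, cv)}; a section with fewer than
    min_count observations maps to None, which corresponds to an N/A entry.
    """
    sections = {}
    for fr, fdr in pairs:
        # Small epsilon guards against floating-point error at section boundaries.
        lower = round(int(fr / width + 1e-9) * width, 2)
        sections.setdefault(lower, []).append(fdr)
    stats = {}
    for lower, values in sorted(sections.items()):
        if len(values) < min_count:
            stats[lower] = None
            continue
        m, s = mean(values), stdev(values)
        stats[lower] = (m, s, s / m if m else float("nan"))
    return stats


# Hypothetical hourly (FR, FDR) observations, expressed as fractions.
sample_pairs = [(0.06, 0.04), (0.07, 0.05), (0.08, 0.06),
                (0.09, 0.05), (0.05, 0.05), (0.12, 0.10)]
level_stats = fdr_stats_by_fr_level(sample_pairs)
```

Here the 5% to 10% section holds five observations and receives a mean, S.D., and C.V., while the 10% to 15% section holds a single observation and is returned as `None`.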

Based on the mean, S.D., and C.V. of failure duration rates at different failure rate levels, we then analyze the failure duration characteristics. Figure 5 presents the average value (over all eight bus stops) of the mean, S.D., and C.V. of the failure duration rate at different failure rate levels. The mean and S.D. of FDR increase with the failure rate level, and the mean grows faster than the S.D. The mean of FDR increases from 5.2% to 27.5% as the failure rate level increases from 5% to 40%, and the S.D. of FDR increases from 3.1% to 5.5% over the same span of failure rate level. However, this increasing trend does not hold for the C.V. of FDR, as illustrated in Figure 5. The C.V. of FDR decreases from 59.6% to 19.9% as the failure rate level changes from 5% to 40%.

Figure 6 displays the average value of the mean, S.D., and C.V. of the failure duration rate for the six 1-berth stops and the two 2-berth stops. The average values of the mean, S.D., and C.V. of the failure duration rate for the single-berth stops are larger than those for the 2-berth stops.

Figure 7 displays the relationship between the average FDR per failure and the FR for the eight bus stops. It shows that the average FDR per failure is insensitive to FR. For the no.1 bus stop, Figure 7(a) reveals that the average FDR per failure increases from 1.5% to 5.4% and from 1.9% to 3.4% when the FR climbs from 21% to 22% and from 34% to 35%, respectively. Thus, the span of the average FDR per failure is large at low levels of failure rate, and its volatility becomes smaller as the failure rate increases. Besides, the relationship between the average FDR per failure and FR is influenced by the number of berths of the loading areas. Figure 7 shows that the average FDR per failure increases from 1.2% to 3.8% and from 1.5% to 5.4%, respectively, as the failure rate increases from 21% to 22% for no.7 bus stop (2-berth type) and no.1 bus stop (1-berth type). Therefore, compared with single-berth bus stops, the two 2-berth bus stops have a smaller average FDR per failure at the same level of failure rate.

For the different levels of failure rate (ranging from 5% to 40%), the values of the mean, S.D., and C.V. of the average FDR per failure at the bus stops are calculated and presented in Figure 8. The average values of the mean, S.D., and C.V. of the average FDR per failure reveal a significant negative correlation with the FR level. When the FR ranges from 30% to 40%, the three statistics of the average FDR per failure are not very sensitive to the FR level, and there are no obvious fluctuations. For the mean value of the average FDR per failure, the maximum variation is merely 0.1% when the FR level falls into the range of 30% to 40%. Therefore, the dispersion of the average FDR per failure decreases sharply with the FR level (especially when the FR is greater than 30%), which implies that the average FDR per failure is comparatively stable. In Figure 9, there is a similar trend of C.V.

For the different levels of failure rate (ranging from 5% to 40%), the values of the mean, S.D., and C.V. of the average FDR per failure for the six 1-berth stops and the two 2-berth stops are calculated and depicted in Figure 9. The three statistics of the average FDR per failure at different failure rate levels are larger for the single-berth stops than for the 2-berth stops, which shows higher stability at the 2-berth bus stops.

4. Correlation Analysis

As discussed in Section 3, it is not easy to establish a satisfactory relationship between the FR and the FDR, or the average FDR per failure, via linear regression models. In this section, we propose a “Distribution Fitting and Cumulative Distribution Correlation (DF-CDC)” method for an in-depth and more reasonable analysis of the correlation between FR and FDR.

4.1. Distribution Fitting

Distribution fitting analysis is regarded as a useful approach to mine the characteristics of transit operational parameters from the probabilistic perspective [33]. Using probability and statistical analysis methods, we look for a unified probability distribution that fits both FR and FDR well. That is, we aim to understand how well a candidate distribution with estimated parameters fits FR and FDR. Typically, chi-squared ($\chi^2$), Kolmogorov–Smirnov (K-S), and Anderson–Darling (A-D) tests could be used for assessing the goodness-of-fit of our analysis results. In this paper, the K-S test at a significance level of 0.05 is adopted for the goodness-of-fit test based on the data of the 8 bus stops in peak periods per workday (224 data per stop). The 36 probability distributions listed in Table 4 are chosen for hypothesis analysis. Table 4 shows the number of rejections for the 36 possible candidates. The results reveal that the Error, Gen. Extreme Value, Gen. Logistic, Logistic, and Normal distributions could be selected as the candidate distributions for the correlation analysis between FR and FDR.

The five well-fitted candidate probability distributions (Error distribution, Gen. Extreme Value distribution, Gen. Logistic distribution, Logistic distribution, and Normal distribution) are used for analyzing the goodness-of-fit of the fitted FR and FDR at the bus stops. After estimating the parameters of these distributions (using the probability density functions shown in Table 5), the K-S test results (P value) for the five candidate probability distributions are plotted in Figure 10. It can be seen that the goodness-of-fit measured by P values for the fitted FDR distribution is much better than that for the FR.

In Table 5, the means of the P value for FR and FDR at the eight bus stops for the five candidate distributions are also given. The Gen. Extreme Value distribution is the best one in terms of P value (0.68224 for FR and 0.87865 for FDR) in distribution fitting for the FR and FDR.
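To make the goodness-of-fit step more concrete, the sketch below computes the K-S statistic of a sample against a Gen. Extreme Value CDF. The shape/scale/location parameterization (with positive shape `k` giving a heavier upper tail), the function names, and the asymptotic critical value $1.358/\sqrt{n}$ at the 0.05 level are assumptions of this sketch; that critical value strictly applies when the distribution parameters are not estimated from the same sample, and the fitting software used in the study may apply a different convention or correction.

```python
import math


def gev_cdf(x, k, sigma, mu):
    """CDF of a Gen. Extreme Value distribution (shape k, scale sigma, location mu)."""
    z = (x - mu) / sigma
    if k == 0:
        return math.exp(-math.exp(-z))  # Gumbel limit
    t = 1 + k * z
    if t <= 0:
        # Outside the support: below the lower bound (k > 0) or above the upper bound (k < 0).
        return 0.0 if k > 0 else 1.0
    return math.exp(-t ** (-1 / k))


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of the sample and the given CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Asymptotic 5% critical value for the 224 peak-hour observations per stop.
critical_224 = 1.358 / math.sqrt(224)
```

A fitted distribution would not be rejected at the 0.05 level, under these assumptions, when `ks_statistic(sample, lambda x: gev_cdf(x, k, sigma, mu))` stays below `critical_224`.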

4.2. Cumulative Distribution Correlation

The probability density function of the Gen. Extreme Value distribution is utilized for fitting the hourly FR and FDR distributions for the eight bus stops. The results are provided in Table 6.

Based on the calculated parameters of the fitted Gen. Extreme Value distribution in Table 6, the cumulative distribution function (CDF) curves of FR and FDR for these test bus stops can be determined. The fitted value at a given CDF level can be read from the fitted CDF curve, and the actual value can be determined from the rank of the sorted 224 data collected at each test bus stop. To analyze the accuracy of the fitted CDF value, the relative error between the actual and fitted values of FR and FDR for the test bus stops at different CDF levels is examined. Furthermore, 17 critical CDF levels, ranging from 10% to 90% (with an interval length of 5%), are selected for verification. The relative errors between the actual and fitted values of FR and FDR at the critical CDF levels for the 8 test bus stops are presented in Tables 7 and 8.

Except for a small minority of critical CDF levels, the relative errors between the actual and fitted values of FR and FDR are less than 10% at these test bus stops, as shown in Tables 7 and 8. Therefore, the Gen. Extreme Value distribution performs well in fitting the CDF values of FR and FDR at these test bus stops, and the accuracy and reliability of the fitted values are convincing.

The distribution patterns of FR and FDR at the test bus stops can be well fitted by the Gen. Extreme Value distribution. Moreover, FR and FDR are positively correlated (as shown in Figure 4). Therefore, it can be considered that the fitted values of FR and FDR at the same CDF level perform equivalently in failure characteristics analysis. Figure 11 presents the fitted curves and critical level values of FR and FDR, fitted with the Gen. Extreme Value distribution, for no.1 bus stop. In Figure 11, the “star (with pink color)” and “triangle (with red color)” display the fitted values of FR and FDR at the 17 critical CDF levels, and a “star” and a “triangle” connected by a dotted line are defined as an equal correlation pair of FR and FDR for the corresponding critical CDF level.

Figure 12 shows the relationship for the 136 equal correlation pairs of FR and FDR, that is, the pairs for the 17 critical CDF levels at the 8 test bus stops. A strong linear regression (with an R-square of 0.98) can be observed for the fitted pairs of FR and FDR, which reflects the significant correlation between the fitted FR and FDR.
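The pairing of fitted values at equal CDF levels, followed by a linear regression, can be sketched as follows. The function names, the shape/scale/location parameterization of the Gen. Extreme Value quantile function, and in particular the parameter values are hypothetical; they are not the fitted parameters of Table 6, so the printed coefficients carry no empirical meaning.

```python
import math


def gev_quantile(p, k, sigma, mu):
    """Inverse CDF of a Gen. Extreme Value distribution (shape k, scale sigma, location mu)."""
    y = -math.log(p)
    if k == 0:
        return mu - sigma * math.log(y)  # Gumbel limit
    return mu + sigma * (y ** (-k) - 1) / k


def linear_fit(xs, ys):
    """Ordinary least squares y = a + b * x; returns (a, b, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b, (sxy * sxy) / (sxx * syy)


# The 17 critical CDF levels: 10%, 15%, ..., 90%.
levels = [round(0.10 + 0.05 * i, 2) for i in range(17)]

# Hypothetical (k, sigma, mu) values for one stop, NOT taken from Table 6.
fr_params = (-0.10, 0.08, 0.15)
fdr_params = (-0.05, 0.05, 0.08)

# One (FR, FDR) pair per CDF level: both values sit at the same cumulative probability.
pairs = [(gev_quantile(p, *fr_params), gev_quantile(p, *fdr_params)) for p in levels]
a, b, r2 = linear_fit([fr for fr, _ in pairs], [fdr for _, fdr in pairs])
print(f"FDR ~ {a:.4f} + {b:.4f} * FR, R^2 = {r2:.4f}")
```

Pooling such pairs over all eight stops would give the 136 points that the regression in this subsection is based on.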

4.3. Correlation Performance Evaluation

The cross-validation method [34] for prediction based on AVL data is adopted for analyzing the correlation between FR and FDR. A four-step procedure for predicting FDR at a certain cumulative distribution ranking level is as follows. Firstly, the observed FR and FDR in peak hours of 56 workdays at 75% of the 8 test bus stops (six stops) are selected randomly as the modeling dataset, and the data of the remaining two bus stops form the prediction dataset. Secondly, the probability density distributions of FR and FDR (for the six bus stops in the modeling dataset) are fitted using the Gen. Extreme Value distribution, and the fitted FR and FDR at the critical cumulative distribution levels are recorded based on their fitted distributions. Thirdly, a linear regression is developed based on these fitted FR and FDR values for the six bus stops (in the modeling dataset) at the critical cumulative distribution levels. Finally, the observed FR (for the two bus stops in the prediction dataset) at the corresponding critical cumulative distribution levels is determined, and the predicted FDR is calculated using the linear regression model formulated in the third step. The prediction accuracy and reliability of FDR can be determined by comparing the actual and predicted values.

There are 28 different combinations of modeling and prediction datasets under the splitting rule described in the first step. Each bus stop therefore has seven estimated values of FDR at each cumulative distribution level, calculated from these 28 division plans. Based on cross-validation, the predicted value and relative error of FDR for different cumulative distribution levels at the 8 test bus stops are calculated, and the results are shown in Figure 13.
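The counts of 28 division plans and seven predictions per stop follow from choosing 2 of the 8 stops as the prediction dataset. A minimal sketch, with the stop numbering and variable names as illustrative assumptions, enumerates the splits:

```python
from collections import Counter
from itertools import combinations

stops = list(range(1, 9))  # the eight test bus stops, numbered 1 to 8

# Each split holds out two stops for prediction and keeps six for modeling.
splits = [
    (tuple(s for s in stops if s not in held_out), held_out)
    for held_out in combinations(stops, 2)
]

# How many times each stop appears in a prediction dataset.
appearances = Counter(s for _, held_out in splits for s in held_out)

print(len(splits))     # number of division plans
print(appearances[1])  # number of predictions available for stop 1
```

Each stop is paired once with each of the other seven stops, which is why every stop receives seven predicted values.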

In Figure 13, the mean value of the predicted relative errors over the seven predictions at different cumulative distribution levels is represented by a solid blue line. According to the results, these test bus stops have more accurate predicted values (the relative error is less than 15%) at most cumulative distribution levels, except for low cumulative distribution levels (less than 15%). The prediction results of FDR for different cumulative distribution levels at no.1 bus stop perform well in general (the relative error is not more than 5%). Also, for most of the bus stops (no.2, no.4, no.5, no.6, no.7, and no.8 bus stops), the relative error of the predicted FDR diminishes gradually as the cumulative distribution sort decreases. No.3 and no.4 bus stops serve five bus routes, more than the other bus stops. In general, the more bus routes are served at a bus stop, the more complicated the bus arrival patterns are. Therefore, the average predicted relative errors of no.3 and no.4 bus stops tend to be larger than those of the other bus stops.

For ease of comparison, the average relative errors of FDR for the 1-berth and 2-berth bus stops are plotted in Figure 14. It can be observed that, for both types of stops, the average relative error shows a decreasing trend. In addition, the error fluctuates little when the cumulative distribution level is higher than 30%, and the average relative error is less than 8% in general. According to the results, the proposed “Distribution Fitting and Cumulative Distribution Correlation (DF-CDC)” approach can establish a significant correlation between the failure rate and the failure duration rate.

5. Conclusion

In order to analyze the characteristics of bus stop failure, we propose a new measurement called the FDR and compare it with the traditional index, the FR. Compared with the FR, the proposed FDR is capable of quantitatively assessing the impact of bus stop failure on traffic efficiency. Based on the collected AVL data associated with the eight bus stops in Wujiang District of Suzhou, we make an in-depth analysis of the characteristics and correlation of the FR and FDR. Some insightful findings are summarized as follows:
(i) The values of FR and FDR in morning and evening peak hours are greater than those during off-peak hours. The value of the FR is usually larger than that of the FDR across all eight bus stops.
(ii) There is a positive correlation between the FR and the FDR. However, the R-square values of the linear regressions fluctuate dramatically among different bus stops. The results also indicate that the FDR is more robust than the FR in describing transit system status and traffic characteristics.
(iii) The Gen. Extreme Value distribution fits both the FR and FDR well, and the proposed “Distribution Fitting and Cumulative Distribution Correlation (DF-CDC)” method works well in determining the fitted values of the FR and FDR at the critical CDF levels, which reflect a significant correlation between the FR and FDR.

Future work could be extended in two directions. First, more AVL data and data from other sources in other cities could be collected and used for analyzing the failure characteristics of bus stops. Second, based on the data-driven analysis of bus stop failure characteristics, scenario analyses could be carried out to identify the most important factors, such as the number of bus stop berths, the number of lanes, or the passenger demand.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Authors’ Contributions

The authors confirm contribution to the paper as follows. R. Li and H. Wang conceptualized and designed the study; X. Xue carried out data collection; R. Li, X. Xue, and H. Wang carried out analysis and evaluation; R. Li and X. Xue prepared the draft manuscript. All authors reviewed the results and approved the final version of the manuscript.

Acknowledgments

This research was supported by the National Key R&D Program in China (Grant no. 2018YFB1600600), Natural Science Foundation of Jiangsu Province (Grant no. BK20181307), Fundamental Research Funds for the Central Universities of China (Grant no. B200202088), National Natural Science Foundation of China (Grant no. 51508161), and Postdoctoral Science Foundation of China (Grant no. 2018M630505).