Introduction

An Automated Taxi (AT) can be considered an on-demand mobility service, i.e. a service that provides travellers with origin–destination transport in a highly or fully Autonomous Vehicle (AV). Despite being a relatively new concept, there is already a great deal of literature on automated on-demand services, but mostly as ride-sharing services (synchronous or dynamic sharing). The majority of this literature focuses on the potential economic, environmental and social benefits of these shared systems, which have been estimated at up to a 49% reduction in pollutant emissions (Fagnant and Kockelman 2014), a 90% reduction in parking space (Zhang et al. 2015) and a 40% reduction in commuting travel cost (Lu et al. 2018). Lokhandwala and Cai (2018) indicated that, maintaining the same level of service, a ride-sharing AT fleet can potentially halve the fleet size of traditional taxis (about 13,500) in New York City. However, in their agent-based model the rider preference is measured in terms of “Deviation Tolerance”, i.e. the fraction of distance that a rider group considers as an acceptable deviation. The inclusion of the time value of money to better model the sharing decision-making process is acknowledged as a further research direction.

There is also a relatively vast literature on the factors influencing the choice of owning and/or using general AVs (see the research reviews in Becker and Axhausen 2017; Gkartzonikas and Gkritza 2019; Nordhoff et al. 2019), but few papers have dealt with ride-shared AVs. Krueger et al. (2016) studied Australian preferences for dynamic Shared Autonomous Vehicles (SAVs). They used a Stated Choice (SC) experiment with three alternatives (two AVs with and without ride-sharing and the current public transit) and three attributes (travel time, travel cost and waiting time). Respondents were informed that SAVs could be imagined as driverless taxi services. Bansal and Daziano (2018) studied willingness to share a ride with strangers using a SC experiment with three alternatives (two Uber modes with and without ride-sharing and the current travel mode), where a dummy variable was used to indicate whether the Uber was with or without a driver. Yap et al. (2016) investigated the choice of the egress modes of train trips, where two of the available options were cybercars (driving yourself and automatic driving). Alternatives were described in terms of travel cost, waiting time, travel time and walking time to the destination, plus a dummy variable to indicate whether the cybercar was a shared vehicle or not.

Unlike a SAV and a traditional taxi, the AT service ought to be designed to satisfy various taxi passenger requests (such as adjusting the heating or air conditioning, changing the destination and adding a stopover), which in a normal taxi (NT) are dealt with through direct communication between the passenger and the driver (Kim et al. 2020). In-vehicle features might strongly influence the potential choice of ATs, and this knowledge is very valuable for AT manufacturers and operators when developing ATs. Nonetheless, there is almost no evidence about the impact of in-vehicle features on the choice of using ATs. Nordhoff et al. (2020) investigated users’ perception with respect to the possibility of manually steering an automated shuttle and of having a button inside the automated shuttle which they can press to stop it, while Paddeu et al. (2020) used a naturalistic experiment where respondents were asked to rate the impact of the direction of the seat (backwards/forwards) on comfort and trust in a shared automated shuttle.

By nature, ATs are also highly innovative transport systems, and it is known that social influence plays a critical role in explaining individual choices for innovations. Several papers have studied the impact of subjective norms on the acceptance of AVs, as part of psychological constructs (Theory of Planned Behaviour or Technology Acceptance Model and its extensions). However, these papers mostly focus on owning AVs or using different levels of automation. Tussyadiah et al. (2017) specifically studied attitudes toward self-driving taxis, but did not include social influence. Nordhoff et al. (2018) tested the impact of social influence in the context of an automated shuttle, asking respondents “whether they would like to have their friends or family or other important people to them adopt the automated shuttle before they themselves do”, and “whether people who are important to them would like it if the respondent used an automated shuttle”.

Papers focusing specifically on the impact of social influence can be found in the Electric Vehicles (EVs) literature. Within this literature, a few papers tested social influence as an attribute in the SC experiments, mostly in terms of social adoption (Kuwano et al. 2012; Rasouli and Timmermans 2013; Araghi et al. 2014; Kormos et al. 2015; Cherchi 2017). Cherchi (2017) accounted for social conformity (a type of social influence) using different measures: social adoption, self-signalling, injunctive norms (this as a latent effect) and informational conformity. With informational conformity she measured the impact on the choice of EV of a positive or negative experience reported by a person close to the respondent who has recently bought an EV. She found that information about a negative experience had a significant impact in reducing the probability of buying an EV, while positive information was less significant. Rasouli and Timmermans (2013) tested different adoption rates for different groups (friends, relatives, colleagues and general peers) and also included an attribute to measure the impact of public reviews, defined on 4 levels: only positive, mainly positive, mainly negative, only negative. They found that negative reviews were not significant, while positive reviews (with no significant difference between “only positive” and “mainly positive” reviews) had a significant positive impact on the intention to purchase an EV. Evidence from marketing suggests that reviews are critical factors for customer decision making (Vermeulen and Seegers 2009; Mudambi and Schuff 2010; Liu and Park 2015). Customer reviews are an important cue that helps consumers evaluate the quality of products and reduce the level of perceived uncertainty before experiencing or purchasing a product (Ye et al. 2011). Zhao et al. (2015) found a significant negative relationship between negative online reviews and hotel booking intentions, while the impact of positive reviews was not statistically significant. Zhu and Huberman (2012) measured how often respondents’ choices changed due to others’ recommendations. They found that “other people’s opinions significantly sway people’s own choices” and that the influence was stronger when facing a moderate, as opposed to large, number of opposing opinions. Customer reviews (representing a general public opinion or a form of word-of-mouth) are increasingly used in reality as a form of social influence, but have rarely been studied in transport.

This paper contributes to the limited empirical evidence on ATs by analysing in depth the in-vehicle features that the AT ought to have to be competitive with a NT. The paper also contributes to the body of literature on the impact of social influence on the adoption of innovation, and in particular on the choice of ATs, by testing the impact of consumer reviews in addition to the more common measures of adoption and injunctive norms. A SC experiment is built to test empirically the impact of in-vehicle features and social influences. Particular attention has been paid to building the design in a realistic way, to reduce the hypothetical bias typical of SC experiments, which is even more dangerous when testing innovations. This includes the layout of the SC experiment, the definition of the attributes and the pre-information provided with respect to the AT. Hybrid choice models are estimated, which allow measuring the impact of injunctive norms, in addition to users' preferences and willingness to trade off different AT in-vehicle features, as well as the standard values of travel and waiting time. T-tests and confidence intervals for the willingness to pay (WTP) are computed using simulation. Resampling methods are used to test model sensitivity. The study is conducted in China among current NT passengers (i.e. passengers who use the current taxi service that is operated with a driver). Finally, we note that in this research we focus exclusively on car-sharing AT services (i.e. asynchronous sharing), not ride-sharing ones, as ride-sharing AT services involve sharing with strangers, which might introduce other human-related factors influencing customers' preferences that are not specifically considered in this research.

The remainder of this paper is structured as follows: the second section discusses the attributes tested in the SC experiment, while the third section discusses the pre-information provided to the respondents in particular about ATs. The fourth section describes the questionnaire and the data collection process. The fifth section presents the model specification, and the sixth section discusses the estimation results and the WTP results. Finally, the last section concludes the paper.

Attributes in the stated choice experiment

The SC experiment built in this study consists of a binary choice between an AT and a NT. The experiment includes 7 attributes, selected after an extensive literature review, 3 Focus Groups (conducted for the same study but in the UK) and 5 pilot tests conducted in China. Three attributes refer to standard level-of-service attributes (Waiting Time, Travel Time and Fixed Journey Fare); the other 4 attributes have been specifically designed to test specific features available inside the AT (Chat with an operator and Change Destination) and to measure the impact of social influence (Number of Customers and Customer Rating). The 3 level-of-service attributes are based on the literature review and represent the attributes most used in SC experiments involving AVs. It is worth mentioning that, to avoid uncertainty in the monetary cost, we specified that the taxi journey fare was fixed, i.e. the fare displayed in each scenario does not depend on the actual travel time experienced on board, even if respondents decide to change destination as allowed in the experiment. We therefore limit the radius within which respondents are allowed to change their destination. For example, in a 10 km trip, participants can change their destinations only within a 10 km radius from the origin.

Figure 1 reports an example of the task presented. Significant effort has been devoted to designing the layout of the task, as we wanted to present it in a way that looked as realistic as possible. Different definitions of the attributes and several different layouts were also tested, in particular for the presentation of the in-vehicle features and customer rating. In the pilot tests, respondents were asked to evaluate each element of the task in terms of clarity of the description of the attributes and the pre-information.

Fig. 1 Example of choice task presented (translated from Chinese)

In-vehicle AT features

Several in-vehicle features defining the safety/communication options available inside the AT were discussed and tested extensively, first in the Focus Groups (FGs) and then in pilot tests. In particular, we tested (1) in-vehicle communication forms with the AT operator, (2) car conditions (cleanliness, age, model, brand), and (3) social interaction with the driver/AT operator (to communicate the destination, to get the price, to simply chat and to pay). Regarding the communication forms, during the FGs participants were presented with the options in Fig. 2 and were asked to indicate which form of communication they preferred; a discussion was then opened on the reasons for their choice. The presence of a button was generally considered relevant, while, surprisingly, participants expressed concerns about relying only on the app, as well as about the reliability of voice control (e.g. recognising different accents etc.). Several participants asked why all three options could not be used, which seems to reflect some anxiety about being in a car without a driver.

Fig. 2 In-vehicle communication forms with the AT operator

Car conditions were considered relevant but not top of the list. Moreover, most of the car conditions are not specific features of the AT, hence less relevant for this study. The model of the AT was discussed extensively during the FGs, because autonomous cars can look like normal cars or have a distinctive model. The FGs highlighted a preference for normal models. This attribute was therefore not included in the SC experiment. Finally, the discussion in the FGs also confirmed that the lack of human interaction, i.e. of the possibility to interact with the driver, was a relevant factor. In particular, elderly members of the FGs highlighted that in the AT they would miss even just chatting with the driver during the trip, even if they had no specific request for the taxi driver (“I think one disadvantage of it, when you get a taxi, normally they say, ‘have you had a good day?’ … You can talk to them.”). This is relevant information for the development of the AT service. It is important to know whether communication-related devices and an operator are needed, and to what extent the lack of this in-vehicle feature might affect the use of AT services. The second in-vehicle feature tested is the possibility to change destination. While ‘changing trip destination’ (even simply being dropped off a bit earlier or at a particular point close to the destination) is very common and easy to do in a NT, the lack of a driver can potentially limit flexibility during the trip. This is an important feature to test. As illustrated in Fig. 1, the two attributes describing in-vehicle features were presented as on/off options. A green button with YES indicates that the feature is available for that specific taxi; a red button with NO indicates that the feature is not available. In the example illustrated in Fig. 1, in the AT option respondents would have the possibility to chat with an operator but would not have the possibility to change the trip destination after they have made their choice of taxi.

Social influence attributes

As mentioned earlier, two attributes were included in the SC experiment to measure social influence: the number of customers and customer rating. A third measure of social influence (injunctive norms) was elicited outside the SC experiment using a Likert scale. The use of the adoption rate follows the literature on EVs. This descriptive norm, which measures whether individual behaviour is affected by what other people do, is in fact the most typical measure of social influence used inside SC experiments. A thorough discussion and several tests were performed to identify the best period of reference for the adoption (namely the number of customers who have used the taxi), as this is quite different in the context of ATs compared to EVs. Given the nature of the service, i.e. that a taxi can be used every day and more than once a day, a short time reference (today or shorter) was considered more realistic, because the number of customers changes among scenarios, and each scenario represents another day on which the respondent takes the taxi. Given these considerations, “the last hour” was chosen as the period of reference.

As discussed in the introduction, the marketing literature provides significant evidence that reviews are a critical factor in customers' decision making. Indeed, customer reviews have also become very popular on websites. However, how customer rating is actually presented is also critical. Cosley et al. (2003) investigated users’ satisfaction, rating consistency, and recommendation accuracy when rating movies under three different scales: a binary scale, a ±3 scale with no zero, and a five-star rating scale with half-star increments. They found that users like the five-star scale best, and they found evidence suggesting that as scale granularity increases, recommendation accuracy increases. Sparling and Sen (2011) evaluated four types of rating scales and concluded that users prefer the five-star scale overall, although the thumbs scales come in as a relatively close second choice for product reviews. Chen (2017), comparing different rating systems, found that the five-star rating system allows cognitive fit (a match between task, problem representation and individual problem-solving skills), which increases perceived information quality and decreases cognitive decision effort. Finally, the study of Pang and Lee (2005) shows that, within a rating scale of four or five stars and with a separation of one and a half stars, 100% of the users are capable of discerning the relative difference. Based on this evidence, and with the aim of increasing realism, we decided to present the customer reviews using the 5-star system, with no extreme evaluations (we used 2 stars for bad reviews and 4 and a half stars for good reviews). We also specified that these reviews refer to “yesterday” customers, to make it realistic for respondents to see different customer ratings in different scenarios, as each scenario represents another day.

Pre-information about ATs

Before presenting the SC tasks, respondents were provided with two types of information: (1) general information about AT safety, privacy and routing, and (2) specific information about how the AT operates once on board. During the FGs, respondents were asked to rank a list of AV-related aspects, and safety always came out as the most important aspect, followed by control of the vehicle and then lack of human contact. The importance of safety is not surprising and is confirmed in the AV literature. However, people do not have the same knowledge about AV safety, and this might affect the results of the SC experiment. It is therefore important to ensure that all respondents are exposed to the same information before answering the SC scenarios.

Significant effort has been devoted to defining the type of information to present and its format. We were conscious that it was important and necessary to provide information to give a common background, but at the same time we strove to present the information as objectively as possible, to avoid affecting individual preferences positively (or negatively), and as close as possible to how respondents would get the information in real life. Several options were tested, asking respondents' opinions about (1) media channel (i.e. who reported this information and where it was reported), (2) source of information (i.e. which institutions or organisations investigate and report taxi safety information) and (3) type of safety measure (crash rate, fatalities, injuries by type, total versus relative numbers). The vast majority of respondents indicated the national-level media as the most likely channel through which to get safety information about ATs and the national-level government agency as the most trustworthy source of safety information. We thus decided to present the message as news on CCTV (China Central Television, equivalent to the UK BBC) reporting results provided by the Ministry of Transport of the People’s Republic of China, equivalent to the UK Department for Transport (DfT). To define the safety measure, we reviewed the literature on AVs (as discussed in the introduction), the literature on WTP to reduce crashes (e.g. Rizzi and Ortúzar 2003, 2006), information from Vehicle Safety Reports for current AV tests (e.g. Tesla Vehicle Safety Report https://www.tesla.com/VehicleSafetyReport), as well as the results of our FGs and tests. It was finally decided to present the information in terms of “crashes recorded per miles travelled by ATs compared to the NTs”. We did not find the Chinese crash rate of NTs. The taxi crash rate was therefore computed taking as reference the UK’s taxi crash data and travel mileage by taxis (Footnote 1), and adjusting it based on the Chinese crash rate for normal cars. The crash rate for ATs was assumed to be half the crash rate of normal taxis. This value is meant to reflect the scenario of full adoption of AVs, but it also aims to make respondents feel safe using ATs. To increase realism, we also carefully designed the layout of the safety information to be the same as the official CCTV news layout (Footnote 2). Figure 3 reports the information provided.

Fig. 3 Information provided before the SC experiment: Safety and Privacy

An equally thorough analysis was conducted to identify the specific information about how an AT operates and how to present it to respondents. Figure 4a shows the necessary information about how to use ATs, while Fig. 4b shows the information presented about the specific features that respondents would find inside the AT.

Fig. 4 Information provided before the SC experiment: How ATs operate

The information reported in Fig. 4b is meant to inform respondents in advance about some features that are present in all ATs (a 24 h security surveillance camera and an ‘SOS’ button, which respondents in the FGs and pilot tests considered necessary) plus some optional features that are present in some of the ATs but not all (this reflects what respondents will find in the different SC scenarios).

The majority of the SC experiments involving AVs have opted for detailed descriptions of what it is possible to do within an AV, often with images (including virtual reality images) or videos featuring how vehicles can be used (e.g. Krueger et al. 2016; Kolarova et al. 2019). Providing information is particularly important in the case of AVs, as these are relatively unknown to the majority of the population, who certainly have no experience with them. In our case, however, we decided not to include this information, nor any image or video, for two main reasons. First, our experiment features a choice only between taxis, so there is no difference in the type of activities that can be performed within a normal (i.e. with driver) and an automated (without driver) taxi. Second, even though images and videos help respondents familiarise themselves with the innovation, they are rarely neutral and are likely to have a priming effect on individual preferences.

Questionnaire and data collection process


The questionnaire set up to collect the data included four sections:

Section 1: Screen out, knowledge of AVs and recent trip information. The survey was intended for taxi users and the screen-out question was set up to include only respondents who have used a taxi at least once in the last year. Some questions were also included to measure the level of familiarity with AVs in general and in particular with the AT system operating in China. On this occasion, respondents were also given a “standard” description of what an AV is and of the different levels of automation. Finally, respondents were asked to describe the last trip made by taxi, including origin and destination of the trip, purpose, time of the day, where they took the taxi, etc. This information was used later to customise the choice task experiment.


Section 2: A customised SC experiment. This consisted of 6 scenarios where respondents were asked to imagine that they have to make the same trip described in the first section and that they can choose between a NT and an AT. Before the SC scenarios, respondents were given four pieces of information on ATs (as described in the third section, in Figs. 3 and 4). The SC experiment includes 7 attributes, 4 with 3 levels (waiting time, travel time, fixed journey fare and number of customers in the last hour) and 3 with 2 levels (change the destination, chat with an operator and customer reviews yesterday). A heterogeneous Bayesian efficient design was built using Ngene (ChoiceMetrics 2012). The design was customised based on the travel time of the last trip by taxi described by the respondent. Three segments were defined: 5 km trips (for short trips between 2.5 km and 7.5 km), 10 km trips (for medium trips between 7.5 km and 12.5 km) and 15 km trips (for long trips between 12.5 km and 17.5 km). Table 1 reports the attributes, their definition, and the levels used. The weight factors for the three segments (0.75:0.2:0.05) were computed based on the real travel distance distribution of taxi trips (Footnote 3). Bayesian D-efficient designs allow accounting for uncertainty about the true parameters. For this, a Bayesian prior parameter distribution needs to be defined. A uniform distribution was used for all parameters to avoid extreme parameter values. Priors were drawn from models estimated in several pilot tests based on orthogonal designs. Twelve choice tasks were generated in each segment and randomly assigned to 2 blocks of 6 choice tasks each.
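The segmentation and blocking just described can be illustrated with a minimal Python sketch. It assumes that the trip distance in km is available for each respondent, that the lower bound of each segment is inclusive and the upper bound exclusive, and that blocking is a simple random split; the function names `assign_segment` and `split_into_blocks` are hypothetical and are not part of Ngene or of the survey software.

```python
import random

# Segment bounds (km) and reference design distance (km), as described in the text.
SEGMENTS = [
    (2.5, 7.5, 5),     # short trips  -> 5 km design
    (7.5, 12.5, 10),   # medium trips -> 10 km design
    (12.5, 17.5, 15),  # long trips   -> 15 km design
]

# Segment weights reported in the text (5 km : 10 km : 15 km).
SEGMENT_WEIGHTS = {5: 0.75, 10: 0.20, 15: 0.05}


def assign_segment(distance_km):
    """Return the reference distance of the matching segment, or None if out of range.

    Assumption: lower bound inclusive, upper bound exclusive.
    """
    for lower, upper, reference in SEGMENTS:
        if lower <= distance_km < upper:
            return reference
    return None


def split_into_blocks(n_tasks=12, n_blocks=2, seed=0):
    """Randomly assign task numbers 1..n_tasks to equally sized blocks."""
    rng = random.Random(seed)
    tasks = list(range(1, n_tasks + 1))
    rng.shuffle(tasks)
    size = n_tasks // n_blocks
    return [sorted(tasks[i * size:(i + 1) * size]) for i in range(n_blocks)]
```

For instance, `assign_segment(9.0)` returns `10`, and `split_into_blocks()` returns two lists of six task numbers that together cover tasks 1 to 12.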

Table 1 Attributes and attribute levels

Section 3: Demographic characteristics and general information about taxi usage. This section includes several socio-demographic characteristics of the respondents such as gender, age, level of education, employment status, as well as information such as frequency of using taxi, frequency of talking with taxi driver, whether respondents enjoy talking with the taxi driver, whether respondents need help to enter the taxi etc.


Section 4: Injunctive norms statements. Three statements adapted from Cherchi (2017) were used and presented in random order to respondents. For all these three statements, a 7-point Likert response scale was used, ranging from ‘Strongly Disagree’ to ‘Strongly Agree’:

People who are important to me (friends, family) would approve of me using a fully automated taxi

People who are important to me (friends, family) would think that using a fully automated taxi is not appropriate

People who are important to me (friends, family) would think that more people should use fully automated taxis

The survey was administered in major Chinese cities (Shanghai, Guangzhou, Changsha, Chongqing, Wuhan etc.) to current normal taxi users. Data were collected between March and April 2021. The final sample was largely recruited using a panel provider (SurveyEngine) and other channels (e.g. WeChat), and it consists of 450 respondents who satisfy the requirements of (1) being more than 18 years old (Footnote 4); (2) having used a normal taxi in the last year; (3) never having used ATs before. A pilot test including 48 individuals was conducted in Sept. 2020 to ensure that respondents understood the choice tasks and the information provided in the questionnaire.

Table 2 reports a summary of the sample characteristics and the information collected. The cities where the sample was collected have a population of approximately 74.6 million people. Our sample does not aim to be representative of this population, but we note that the distribution of age and gender in our sample is similar to the distribution of the population of these five Chinese cities (the p-values of the chi-square tests between sample and population for gender and age are 0.18 and 0.75 respectively, indicating no statistically significant difference) (Footnote 5). The share of people with high education in the sample is instead higher than in the population (the p-value of the chi-square test is 0.00) (Footnote 6). The frequency of travelling by taxi in the sample is also higher than the Chinese average (Footnote 7) for two reasons: because having taken a taxi in the last year was a screening condition, and because our sample was collected in big cities, where the usage frequency of taxis is higher than the national average. Finally, regarding the distribution of income, we note that the Chinese per capita disposable income in 2020 (Footnote 8) was 4,220 Euros (32,200 yuan) on average nationwide, and 5,740 Euros (43,800 yuan) on average for urban residents. This means an average per-capita monthly income of approximately 350 Euros nationwide and 480 Euros at urban level. Using a taxi in China costs approximately 5 Euros for a trip of 20 min (travel time), and the percentage of disposable income spent travelling by taxi in our sample is less than 2%, with only 1% of the sample spending on average 42% of their disposable income on taxis, 4% of the sample spending 14% and 95% of the sample spending less than 8% of their disposable income travelling by taxi.
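The monthly income figures quoted above follow directly from the annual values. As a quick check, the short Python sketch below reproduces the arithmetic; the variable names are illustrative, and the implied exchange rate is derived only from the Euro and yuan amounts reported in the text.

```python
# Annual per-capita disposable income in 2020, as reported in the text.
annual_income_eur = {"nationwide": 4220, "urban": 5740}
annual_income_yuan = {"nationwide": 32200, "urban": 43800}

# Monthly income in Euros: annual value divided by 12.
monthly_income_eur = {k: v / 12 for k, v in annual_income_eur.items()}

# Exchange rate implied by the paired figures (yuan per Euro).
implied_rate = {k: annual_income_yuan[k] / annual_income_eur[k]
                for k in annual_income_eur}

for area in annual_income_eur:
    print(area, round(monthly_income_eur[area], 1), round(implied_rate[area], 2))
```

The nationwide and urban monthly values come out at roughly 352 and 478 Euros, consistent with the rounded figures of 350 and 480 Euros, and both pairs imply the same exchange rate of about 7.63 yuan per Euro.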

Table 2 Sample characteristics

As can be seen in Table 2, the majority of the sample uses taxis quite frequently (42.9% at least once a week), but 95% of these have a disposable income higher than 394 Euros. The majority of respondents in our sample talk only occasionally with the taxi driver while travelling, and rarely enjoy doing so. No one in the sample needs help to take the taxi, but more than half of the sample like it if the taxi driver helps them to carry luggage or heavy bags. Regarding knowledge of AVs, most of the respondents had heard of AVs before joining this survey, but only approximately half of them had heard of ATs operating in China. Among the latter, approximately 1/3 heard from someone who had tried an AT (i.e. who had direct experience), and 2/3 heard from someone who had heard of ATs operating in China. Finally, 42% of the trips by taxi are for leisure purposes (e.g. shopping or pub), while 38% are commuting or business trips.

Model specification

A hybrid choice model is used in this study, where the discrete choice part is a mixed logit that accounts for panel effects, while the latent variable part accounts for the impact of the latent injunctive norm. The utility \(U_{jqt}\) that individual q assigns to alternative j = [NT, AT] in choice task t = [1, 2, …, 6] takes the expression:

$$U_{jqt} = ASC_{j} + \beta_{j}^{alt} X_{jqt} + \beta_{j}^{ind} SE_{{\text{q}}} + \beta_{j}^{{alt{\text{*ind}}}} X_{jqt} SE_{q} + \beta_{j}^{LV} LV_{q} + \eta_{jq} + \varepsilon_{jqt}$$
(1)

where \(ASC_{j}\) is the alternative specific constant for alternative j, whose dummy takes the value 1 if j = AT and 0 otherwise; \(X_{jqt}\) is a vector containing all attributes used for defining the alternative in the SC experiment; \(SE_{q}\) is a vector of socio-economic characteristics; \(LV_{q}\) is the latent variable that measures the injunctive norm, while \(\beta_{j}^{alt}\), \(\beta_{j}^{ind}\), \(\beta_{j}^{{alt{\text{*ind}}}}\), \(\beta_{j}^{LV}\) are the vectors of coefficients that measure the marginal effect of the attributes included in the SC tasks, the socio-economic characteristics, the interactions between \(SE_{q}\) and \(X_{jqt}\), and the latent variable, respectively; \(\eta_{jq}\) is the error term distributed Normal (0, \(\sigma_{\eta }\)), accounting for the correlation among choice tasks for the same individual, and \(\varepsilon_{jqt}\) is the iid EV1 error term. Since, in this study, there are only 2 alternatives (AT and NT), all the terms that are alternative specific are included only in one alternative (the AT). This means that \(ASC_{j}\), \(\beta_{j}^{ind}\), \(\beta_{j}^{LV}\) and \(\eta_{jq}\) are equal to zero for j = NT.
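To make the structure of Eq. (1) concrete, the Python sketch below computes a simplified systematic utility and the resulting binary logit probability of choosing the AT, conditional on given values of the latent variable and of the panel error term. It is only an illustration under stated assumptions: the interaction terms are omitted, the attribute coefficients are taken as generic across alternatives, and all coefficient values and attribute levels are hypothetical rather than estimates from this study.

```python
import math


def systematic_utility(beta, x, asc=0.0, extra=0.0):
    """Linear-in-parameters utility: asc + sum_k beta_k * x_k + extra.

    `extra` collects the terms entering only the AT utility in the text
    (socio-economic effects, latent variable term, panel error term).
    """
    return asc + sum(beta[name] * x[name] for name in beta) + extra


def prob_at(v_at, v_nt):
    """Binary logit probability of choosing AT under iid EV1 errors."""
    return 1.0 / (1.0 + math.exp(v_nt - v_at))


# Hypothetical coefficients and attribute levels, for illustration only.
beta = {"wait_min": -0.05, "travel_min": -0.04, "fare_yuan": -0.03}
x_at = {"wait_min": 5, "travel_min": 20, "fare_yuan": 30}
x_nt = {"wait_min": 5, "travel_min": 20, "fare_yuan": 30}

v_nt = systematic_utility(beta, x_nt)
v_at = systematic_utility(beta, x_at, asc=0.0, extra=0.0)
p_equal = prob_at(v_at, v_nt)  # identical alternatives and zero AT-specific terms
```

With identical attribute levels and all AT-specific terms set to zero, the two utilities coincide and the choice probability is 0.5; any positive AT-specific term raises it above 0.5.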

The latent variable ‘injunctive norm’ in the structural equation is defined as:

$$LV_{q} = \alpha + \lambda SE_{q} + \omega_{q}$$
(2)

where \(\alpha\) is the constant; \(SE_{q}\) is a vector of socio-economic characteristics that can be different from the vector in Eq. (1); \(\lambda\) is the vector of corresponding coefficients; and \(\omega_{q}\) is the normally distributed error term with mean zero and standard deviation \(\sigma_{w}\).

The measurement equation is defined as:

$$IND_{qr} = \delta_{r} + \theta_{r} LV_{q} + \upsilon_{qr} \quad r = 1,2,3$$
(3)

where \(IND_{qr}\) is the r-th indicator of the injunctive norm for individual q; \(\delta_{r}\) is the constant for indicator r; \(\theta_{r}\) is the coefficient associated with the latent variable; \(\upsilon_{qr}\) is the normally distributed error term with mean zero and standard deviation \(\sigma_{\upsilon_r}\).
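The structural and measurement equations (2) and (3) can be read as a data-generating process, which the following Python sketch simulates for one individual. It is a sketch under assumptions: the parameter values in the usage note are hypothetical, the helper names are illustrative, and the indicators are treated as continuous, in line with the normal error terms in Eq. (3), rather than as 7-point ordinal responses.

```python
import random


def draw_latent_variable(se, alpha, lam, sigma_omega, rng):
    """Structural equation (2): LV_q = alpha + lambda . SE_q + omega_q."""
    omega = rng.gauss(0.0, sigma_omega)
    return alpha + sum(l * s for l, s in zip(lam, se)) + omega


def draw_indicators(lv, delta, theta, sigma_upsilon, rng):
    """Measurement equation (3): IND_qr = delta_r + theta_r * LV_q + upsilon_qr."""
    return [d + t * lv + rng.gauss(0.0, s)
            for d, t, s in zip(delta, theta, sigma_upsilon)]
```

As a usage example, with `rng = random.Random(1)`, a call such as `draw_latent_variable([1.0, 0.0], 0.5, [0.2, 0.3], 1.0, rng)` draws one value of the latent variable, which can then be passed to `draw_indicators` with `delta = [0.0, 0.1, -0.2]` and `theta = [1.0, 0.8, -0.5]`, where the first indicator follows the normalisation \(\delta_{1} = 0\), \(\theta_{1} = 1\) used in this paper.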

The distribution of the latent variable and that of the indicators are:

$$f_{{LV_{q} }} = \frac{1}{{\sigma_{w} }}\phi \left( {\frac{{LV_{q} - (\alpha + \lambda SE_{q} )}}{{\sigma_{w} }}} \right);\quad f_{{IND_{qr} }} = \frac{1}{{\sigma_{{\upsilon_{r} }} }}\phi \left( {\frac{{IND_{qr} - (\delta_{r} + \theta_{r} LV_{q} )}}{{\sigma_{{\upsilon_{r} }} }}} \right)$$
(4)
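To make the roles of Eqs. (2) to (4) concrete, the following minimal sketch draws one latent variable and its three indicators and evaluates the normal density used in Eq. (4). All numerical values are hypothetical and chosen only for illustration; the function names are ours, and the first indicator is normalised with \(\delta_{1} = 0\) and \(\theta_{1} = 1\).

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density (1/sd) * phi((x - mean) / sd), the form used in Eq. (4)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def simulate_latent_and_indicators(se, alpha, lam, sigma_w,
                                   delta, theta, sigma_v, rng):
    """Draw LV_q from Eq. (2) and the indicators IND_qr from Eq. (3)."""
    # Structural equation: LV_q = alpha + lambda * SE_q + omega_q
    lv = alpha + sum(l * x for l, x in zip(lam, se)) + rng.gauss(0.0, sigma_w)
    # Measurement equations: IND_qr = delta_r + theta_r * LV_q + upsilon_qr
    ind = [d + t * lv + rng.gauss(0.0, s)
           for d, t, s in zip(delta, theta, sigma_v)]
    return lv, ind


# Hypothetical parameter values (not estimates from this study).
rng = random.Random(0)
lv, ind = simulate_latent_and_indicators(
    se=[1.0, 0.0], alpha=0.2, lam=[-0.3, 0.5], sigma_w=1.0,
    delta=[0.0, 0.1, -0.2], theta=[1.0, 0.8, 1.2],
    sigma_v=[0.5, 0.5, 0.5], rng=rng)

# Density of the first indicator given the drawn latent variable, Eq. (4).
f_ind_1 = normal_pdf(ind[0], 0.0 + 1.0 * lv, 0.5)
```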

\(\phi\) is the standard normal density function. For theoretical identification, we set \(\delta_{1} = 0\) and \(\theta_{1} = 1\) for the first indicator. The unconditional probability is then the integral of the SC conditional probability over the distributions of \(\omega\) and \(\eta\):

$$P_{jq} = \int_{\omega } \left[ \int_{\eta } \prod\limits_{t = 1}^{T} P_{jqt} \left( LV_{q} (\omega_{q} ),\eta_{jq} \right) f(\eta )\,d\eta \right] \prod\limits_{r = 1}^{R} f_{{IND_{qr} }} \left( LV_{q} (\omega_{q} ) \right) f(\omega )\,d\omega$$
(5)

where \(P_{jqt}\) is the conditional probability of individual q choosing alternative j in choice task t, and takes the form of a multinomial logit model (MNL) conditional on the realisation of \(\omega\) and \(\eta\).
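As an illustration of how the integral in Eq. (5) can be approximated by simulation, the sketch below averages, over random draws of \(\omega\) and \(\eta\), the product of the binary logit probabilities of the observed choices and the indicator densities. It is a plain Monte Carlo sketch for the two-alternative case (AT vs. NT), not the routine used by PythonBiogeme. The function and parameter names are hypothetical, and `lv_mean` stands for \(\alpha + \lambda SE_{q}\).

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density (1/sd) * phi((x - mean) / sd)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def simulated_likelihood(choices, v_at, v_nt, indicators, params,
                         n_draws=500, seed=0):
    """Approximate Eq. (5) for one individual.

    choices[t] is 1 if AT was chosen in task t and 0 if NT was chosen;
    v_at[t] and v_nt[t] are the systematic utilities excluding the
    latent variable and the panel error term.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        omega = rng.gauss(0.0, params["sigma_w"])   # structural error
        eta = rng.gauss(0.0, params["sigma_eta"])   # panel error (AT only)
        lv = params["lv_mean"] + omega              # LV_q(omega_q)
        # Product over choice tasks of the conditional logit probabilities.
        p_seq = 1.0
        for y, va, vn in zip(choices, v_at, v_nt):
            u_at = va + params["beta_lv"] * lv + eta
            p_at = 1.0 / (1.0 + math.exp(vn - u_at))
            p_seq *= p_at if y == 1 else 1.0 - p_at
        # Product over indicators of the measurement densities.
        dens = 1.0
        for ind, d, th, sv in zip(indicators, params["delta"],
                                  params["theta"], params["sigma_v"]):
            dens *= normal_pdf(ind, d + th * lv, sv)
        total += p_seq * dens
    return total / n_draws


# Hypothetical inputs for one respondent with three choice tasks.
params = {"lv_mean": 0.1, "sigma_w": 1.0, "sigma_eta": 0.8, "beta_lv": 0.5,
          "delta": [0.0, 0.1, -0.2], "theta": [1.0, 0.8, 1.2],
          "sigma_v": [0.5, 0.5, 0.5]}
lik = simulated_likelihood(choices=[1, 0, 1], v_at=[0.3, -0.2, 0.5],
                           v_nt=[0.0, 0.0, 0.0],
                           indicators=[0.4, 0.2, 0.1], params=params)
```

Taking the logarithm of this quantity for each individual and summing gives the simulated counterpart of the log-likelihood.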

The log-likelihood function is given by the logarithm of the product of the unconditional probabilities:

$$LL = \sum\nolimits_{q} {\sum\nolimits_{j} {\ln (P_{jq} )} }$$
(6)

Models are estimated by maximum simulated likelihood estimation, using PythonBiogeme (Bierlaire and Fetiarison 2009).

Model results

Table 3 reports the results of the best hybrid choice model (HCM), estimated on a final sample of 2700 observations. A corresponding ML is also reported for comparison, bearing in mind that the ML and the HCM have different scale parameters, so their coefficients cannot be compared directly. The last two columns report the means and standard deviations of the estimates over 50 repetitions, each using a randomly generated subsample of approximately 75% of the full sample. The resampling means are very close to the values estimated on the full sample, and the standard deviation of each coefficient across the 50 repetitions is quite small. In all 50 repetitions the coefficients estimated on the subsample were not statistically different from those estimated on the full sample, indicating that the results are stable and not sensitive to the sample gathered. This also holds for the latent injunctive norm, unlike in Cherchi (2017).
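The resampling check can be sketched as follows. This is only an outline of the procedure: the `estimate` argument is a placeholder for the model estimation step, and in the toy usage it is replaced by a simple sample mean over synthetic data, which is an assumption made purely so that the example runs.

```python
import random
import statistics


def resample_estimates(data, estimate, n_reps=50, share=0.75, seed=0):
    """Re-estimate on random subsamples drawn without replacement."""
    rng = random.Random(seed)
    k = round(share * len(data))
    return [estimate(rng.sample(data, k)) for _ in range(n_reps)]


# Toy usage: synthetic data and a sample mean stand in for the survey
# data and the choice model, respectively.
data_rng = random.Random(42)
data = [data_rng.gauss(1.0, 0.5) for _ in range(300)]

full_estimate = statistics.fmean(data)
reps = resample_estimates(data, statistics.fmean)

resampling_mean = statistics.fmean(reps)   # compare with full_estimate
resampling_sd = statistics.stdev(reps)     # spread across repetitions
```

A small `resampling_sd` and a `resampling_mean` close to `full_estimate` correspond to the stability pattern described above.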

Table 3 Model estimation results

In the estimation, we also tested for systematic heterogeneity in the preference for the AT alternative and in the preferences for level of service attributes, in-vehicle attributes and social conformity attributes, as well as in the latent effect of injunctive norms. We also tested for an income effect, but found no evidence of it.

Looking at the model results, we note first that all the attributes tested are significant at the 95% level or higher and all the marginal utilities have the expected sign: negative for the level of service attributes and positive for the in-vehicle features and the attributes measuring social influence. Among the several interaction effects tested, we found that people aged 60 or older care less about waiting time (WT_60). This makes sense because older people might have more time and might be less constrained by fixed schedules or activities. Both in-vehicle features tested are highly significant and positive, indicating, as expected, that the possibility to change destination (CD) and to ‘chat with an operator during the trip’ (CO) increases the probability of choosing an AT over a NT. The request to ‘chat with an operator’ was mentioned in the FGs by some elderly participants and by those who said they enjoy talking, and frequently talk, with a taxi driver. Based on this information, we tested whether the marginal utility of chatting with an operator differed for people who enjoy talking with the driver and for those who frequently do so. However, few people in our sample reported that they really enjoy talking with a driver or that they frequently do so and, consistent with this, none of these interaction effects was significant.

Among the attributes measuring social influence, the descriptive norm, number of customers in the last hour (NC), was not significant for the entire sample. This is in line with the EV literature, where this attribute has always been problematic when tested within a SC experiment. We believe this is related to the limited level of realism that can be achieved with online surveys. Indeed, Yin and Cherchi (2022) report that the number of customers in the last hour is highly significant when SC data are collected in an experiment embedded in a virtual reality environment, and suggest that this probably reflects the importance of realism for capturing the impact of normative conformity. We found, however, that this descriptive norm was highly significant for respondents who heard of ATs from people who have used them (NC_HU). This is a plausible result, as there is a natural link between the number of customers and the AT users from whom the respondent heard about ATs. We also tested for systematic heterogeneity in the preference for the descriptive norm as a function of demographic characteristics, trip characteristics and knowledge of AVs or ATs. None of these effects was significant.

Confirming the results from the marketing literature, we found that good reviews (GR), measured by a high rating from yesterday’s customers, have a significant positive impact on the choice of the type of taxi. Interestingly, good reviews have a higher impact for long trips, i.e. trips of 30 min or more (GR_LD). In other words, those who use taxis for long trips are more sensitive to good reviews. Again, this result makes sense because the longer the time spent in a taxi, the more respondents wish to be reassured about the overall quality of the service.

As expected, the latent variable injunctive norm (IN) has a positive and significant impact on the choice of ATs. In our sample, respondents aged below 30 (18_29) are less likely to be affected by what other people think is right to do, probably because young people are more informed and more assertive about using innovative products and less prone to be influenced by others. An important point is that Age18_29 has a negative direct impact on the choice of ATs when included in the AT alternative in the ML. This is counter-intuitive, because young people are typically more likely to accept innovative modes (e.g. Haboucha et al. 2017). In our case, however, the direct impact of Age18_29 in the ML is spurious: it becomes non-significant in the HCM, where the attribute also enters the IN (the direct effect was removed from the HCM reported in Table 3). In our data, the correct impact of Age18_29 is an indirect effect via the IN, which the HCM reveals correctly.

Another interesting result is the impact of knowledge about ATs. First, we note that what affects the choice of AT is not generic knowledge of AVs but specific knowledge about ATs operating in China (HO). This has both a direct and an indirect impact, as it also affects the injunctive norm (IN). For the IN, the experience of the person from whom the information is obtained is also relevant. If that person had direct experience with an AT (HU), the impact on the injunctive norm is more than twice as large as when the person had no direct experience (HNU). It is likely that these respondents are more interested in ATs, sought information from those who have tried the system, and are therefore more likely to follow other people’s suggestions about ATs. In terms of the direct impact on the choice of ATs, it only matters that the respondent has heard about ATs operating in China (HO); it does not matter from whom they heard about them.

Respondents who frequently use normal taxis (at least once a week) (FU) are also more likely to choose ATs. This result is less intuitive and might arise because they have had poor experiences with NTs (e.g. unnecessary detours are a common problem in China), which increases their willingness to switch to ATs. It could also reflect curiosity about AT services, which are still uncommon in China: none of the respondents in this survey had experienced an AT.

Table 4 reports the WTP estimated for all the attributes tested in this study. The t-tests and confidence intervals are computed by Monte Carlo simulation with 5000 draws from a multivariate truncated Normal distribution. We first note that all WTP values are highly significant (t-test > 1.96) and have narrow confidence intervals, with the sole exception of the WTP for waiting time for people aged 60 or older.
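The simulation of a WTP distribution can be sketched as below for a single attribute. The coefficient values, standard errors and correlation are hypothetical (they are not taken from Table 3), the bivariate draw is a simplification of the multivariate case, and the truncation rule (discarding draws with a non-negative cost coefficient) is our assumption about how truncation might be applied.

```python
import math
import random
import statistics


def draw_wtp(b_attr, b_cost, se_attr, se_cost, corr, n_draws=5000, seed=0):
    """Draw WTP = b_attr / b_cost from a truncated bivariate normal."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n_draws:
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        attr = b_attr + se_attr * z1
        # Correlated draw for the cost coefficient (Cholesky factor of a
        # 2x2 correlation matrix).
        cost = b_cost + se_cost * (corr * z1 + math.sqrt(1.0 - corr ** 2) * z2)
        if cost >= 0.0:
            continue  # assumed truncation: keep the cost coefficient negative
        draws.append(attr / cost)
    return draws


def summarise(draws):
    """Mean, t-ratio and 95% percentile interval of the simulated WTP."""
    ordered = sorted(draws)
    n = len(ordered)
    mean = statistics.fmean(ordered)
    sd = statistics.stdev(ordered)
    lower = ordered[int(0.025 * n)]
    upper = ordered[min(n - 1, int(0.975 * n))]
    return {"mean": mean, "t": mean / sd, "ci95": (lower, upper)}


# Hypothetical travel time (per minute) and cost (per Euro) coefficients.
wtp_draws = draw_wtp(b_attr=-0.06, b_cost=-1.0,
                     se_attr=0.005, se_cost=0.05, corr=0.2)
wtp_summary = summarise(wtp_draws)
```

With these inputs, `wtp_summary["mean"]` is a WTP in Euros per minute; multiplying by 60 gives the hourly value.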

Table 4 Willingness to pay for ATs characteristics

The estimates show that Chinese respondents are willing to pay on average 3.61 Euros to save 1 hour of travel time. This is comparable to the value of 5.50 Euros per hour found in the Netherlands (Correia et al. 2019) for travel time saved in an AV with an office interior (8.17 Euros per hour for an AV with a leisure interior). Other studies in different contexts found higher WTP for travel time, e.g. 12 Euros for egress travel time by fully automated vehicles in the Netherlands (Yap et al. 2016) and 9.79 Euros (11.6 USD) for travel time in New York (Bansal and Daziano 2018).

Regarding waiting time, we found that those aged below 60 are willing to pay on average 7.20 Euros to save 1 hour of waiting time, about twice the amount that those aged 60 or above are willing to pay (3.71 Euros per hour). Our results also show that Chinese respondents are willing to pay on average 0.35 Euros to have the option to change the destination, which is equivalent to the amount they are willing to pay to save 5.8 min of travel time, or 2.9 min (< 60 years) and 5.7 min (≥ 60 years) of waiting time. The WTP to chat with an operator inside ATs (0.78 Euros) is more than twice the WTP to change destination and almost 13 times higher than the WTP to save one minute of travel time.
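The time equivalences quoted above follow directly from dividing each WTP by the corresponding per-minute value of time. The short check below reproduces them from the reported figures; the helper name `minutes_equivalent` is ours.

```python
# Values of time reported in the text, in Euros per hour.
VOT_TRAVEL = 3.61
VOT_WAIT_UNDER_60 = 7.20
VOT_WAIT_60_PLUS = 3.71

# WTP for the in-vehicle features, in Euros.
WTP_CHANGE_DESTINATION = 0.35
WTP_CHAT_OPERATOR = 0.78


def minutes_equivalent(wtp_euro, value_per_hour):
    """Minutes of time saving worth the same as a given WTP."""
    return wtp_euro / (value_per_hour / 60.0)


cd_travel = minutes_equivalent(WTP_CHANGE_DESTINATION, VOT_TRAVEL)
cd_wait_young = minutes_equivalent(WTP_CHANGE_DESTINATION, VOT_WAIT_UNDER_60)
cd_wait_old = minutes_equivalent(WTP_CHANGE_DESTINATION, VOT_WAIT_60_PLUS)
co_travel = minutes_equivalent(WTP_CHAT_OPERATOR, VOT_TRAVEL)

print(round(cd_travel, 1), round(cd_wait_young, 1),
      round(cd_wait_old, 1), round(co_travel, 1))
# prints: 5.8 2.9 5.7 13.0
```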

Among the social influence attributes, reviews from previous customers proved to be the most effective measure: users are willing to pay 1.58 Euros more to use a taxi with good reviews for long trips and 0.57 Euros more for short/medium trips. As mentioned previously, this is a reasonable result, as the longer the time spent in the AT, the more respondents wish to be reassured about the quality of the service.

Conclusion

While there is some literature on the factors influencing the use of ride-shared autonomous vehicles, no studies specifically discuss the impact of the in-vehicle features that an AT ought to have in order to satisfy the typical passenger requests that, in a NT, are handled through direct communication between the passenger and the driver. This paper aimed to fill this gap. Our results confirm that in-vehicle features are indeed highly important for customers, who are willing to pay 5.8 times more to have the possibility to change the destination (and almost 13 times more to have the possibility to chat with an operator) than to save one minute of travel time. Our results suggest that more attention should be given to the design of direct communication within ATs. When designing and developing ATs, manufacturers should therefore consider providing a direct communication channel with an operator and an option for passengers to change the destination, as these in-vehicle features are key to attracting demand to ATs. The high impact of the request to communicate with an operator confirms the importance of ensuring some “human” connection inside ATs, which is in line with broader concerns that technology eradicates the innate human tendency to seek connection with others. In terms of the type of in-vehicle communication equipment, the recommendation would be to install an interactive ‘screen’, or even a ‘simple button’ to open the communication, rather than to rely on a ‘phone app’. Despite the diffusion of ‘phone apps’, respondents seem not to trust them when it comes to communicating with an operator inside the AT. If possible, it is recommended to install more than one form of communication with an operator, as there seems to be some anxiety about being in a car without a driver. In line with this result, it is also highly recommended to install CCTV cameras in all vehicles.

Another interesting recommendation for AT manufacturers concerns the car models. Almost all AV advertisements show fancy cars, very different from current models, probably also to highlight their potential (e.g. the activities that can be performed while riding). However, the FGs showed that respondents prefer normal models, like the cars they use every day.

For taxi operators, the FG results show that the condition of the vehicle (cleanliness, age, model or brand) is, of course, as important for an automated taxi as for a normal one, but for an AT these are not top priorities for potential customers.

This paper also sheds light on the impact of social influence on the choice of AT. In line with previous work on electric vehicles, our results confirm the difficulty of capturing the effect of the adoption rate on the choice of an innovation within online screen-based SC experiments. On the other hand, and in line with the marketing literature, we found that reviews from other customers have a strong impact on the choice of ATs. The use of a 5-star system also proved to be an effective way to report consumer reviews. The 5-star system is by far the most common format and one that consumers are very familiar with; from a methodological point of view, this confirms the importance of using realistic SC scenarios. From a practical point of view, the suggestion for AT operators is first to pay special attention to maintaining a good reputation among customers and then to use customer ratings to advertise the system, as this proves to be particularly relevant for boosting demand, as widely demonstrated in online shopping and hotel booking.

Our results also highlight the importance of knowledge about ATs in the adoption of this innovation. This effect has been explored for other innovations, such as EVs, but not yet for ATs. Our results show that it is not generic knowledge about AVs that matters but specific knowledge about ATs operating in China. This stresses the importance of tailoring the information provided, for example in a marketing campaign. Finally, we also note the impact of injunctive norms on the adoption of ATs, which depends, among other factors, on the experience of the person from whom the information about ATs is obtained. This suggests that a word-of-mouth marketing campaign, such as organising activities that encourage interactions among customers, would be an effective approach for attracting potential AT users. Moreover, consistent with the test on conveying safety information, the content and form of the information delivered to the public also matter. This implies that core mainstream media and the transport sector play a non-negligible role in delivering key information about ATs to the public. It is therefore advisable to pay particular attention to the type of information provided, using objective information, preferably from national-level government agencies, as these are regarded as the most trustworthy source of safety information, and to make use of core media channels to inform the public and increase their knowledge of ATs.