Elsevier

Journal of Choice Modelling

Volume 30, March 2019, Pages 1-16
On consistency of the MACML approach to discrete choice modelling

https://doi.org/10.1016/j.jocm.2018.10.001

Abstract

In this paper the asymptotic properties of the maximum approximate composite marginal likelihood (MACML) approach to the estimation of multinomial probit (MNP) models proposed by Chandra Bhat and coworkers are investigated. It is shown that, if the choice proportions are normalized to sum to one, a variant of the method provides consistent estimates of the choice proportions for a number of approximation methods.

Furthermore it is argued that each approximation method leads to a particular mapping of regressors to choice proportions which is close - but not identical - to the map induced by the probit model. If the data are in fact generated according to this mapping then standard asymptotics, that is consistency and asymptotic normality of the estimators, hold. If the data are, however, generated by a probit model and the approximations are used for estimation, then the corresponding estimators are not guaranteed to be consistent.

Different approximation methods are subsequently analyzed with respect to their asymptotic biases and additionally with respect to finite sample properties. It is shown that normalization of the choice proportions is essential for obtaining consistent estimates of the choice proportions. Normalization also decreases biases in parameter estimates.

Introduction

Discrete choice models are routinely used for modelling mode choice, destination choice, choice of travel time, route choice, vehicle purchase decision, activity choice and many other areas (see e.g. Cascetta (2009); Train (2009)). They are used for example for the evaluation of new mode options, the design of pricing schemes for public transport, the adoption of new vehicle technologies, the impact of the provision of travel time information as well as in the simulation of transportation systems. Discrete choice models have been used in cross sectional data sets based on revealed preferences as well as panel data sets combining revealed and stated preference data (see Hess and Daly (2011)).

Most commonly discrete choice models are formulated using the random utility model (RUM) paradigm based either on the multinomial logit (MNL) or the multinomial probit (MNP) model. While different variants of the MNL model (including mixed models) have enjoyed strong popularity, MNP models suffer from high computational costs. A major reason for the high computational costs is that the probit likelihood by definition involves the evaluation of the multivariate normal cumulative distribution function (MVNCDF), which is analytically intractable. Hence it is necessary to rely on approximation methods in order to evaluate the likelihood. For standard quadrature methods the computational complexity grows exponentially with the dimension of the integral, which renders those methods too time consuming for all but the smallest choice sets.

Therefore, the integral is usually approximated by Monte Carlo simulation, which, depending on the number of repetitions used, can lead to less accurate estimators than alternative methods in some situations (see Bhat (2018) and Fu and Juan (2017)). When combined with maximum likelihood estimation, those methods are known as the maximum simulated likelihood (MSL) approach. The most widely used method is the Geweke-Hajivassiliou-Keane (GHK) algorithm (see e.g. Train (2009, p. 115) and Bolduc (1999)). The assessment of integrals by simulation is computationally appealing because the computational complexity of the simulation is an approximately linear function of the dimension of the integral (see Hajivassliliou, 2000, p. 88f). Like Monte Carlo methods in general, MSL is justified asymptotically. This induces the need to rely on many simulation runs and leads to biased estimates whenever the number of Monte Carlo (MC) replications is not increased fast enough as a function of the sample size (see Train, 2009, p. 250ff). Moreover, the guidelines for choosing the number of replications derived from asymptotic results are only of limited help for a given sample size, as they only restrict the rate of increase of the number of replications.
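To illustrate how simulation noise depends on the number of replications, the following minimal sketch approximates MNP choice probabilities with a crude frequency simulator. This is deliberately simpler than the GHK algorithm mentioned above and is not the method used in the paper; the utilities `v`, the error covariance `cov` and the numbers of draws are illustrative assumptions.

```python
import numpy as np


def simulate_choice_probs(v, cov, n_draws, rng):
    """Crude frequency simulator for MNP choice probabilities.

    Draws utilities U = v + eps with eps ~ N(0, cov) and returns the share
    of draws in which each alternative attains the maximum utility.
    """
    eps = rng.multivariate_normal(np.zeros(len(v)), cov, size=n_draws)
    chosen = np.argmax(v + eps, axis=1)
    return np.bincount(chosen, minlength=len(v)) / n_draws


# Illustrative systematic utilities for four alternatives (assumed values).
v = np.array([0.5, 0.0, -0.2, 0.1])
# Assumed equicorrelated error covariance: unit variances, correlation 0.5.
cov = np.full((4, 4), 0.5) + 0.5 * np.eye(4)

rng = np.random.default_rng(0)
probs_small = simulate_choice_probs(v, cov, n_draws=100, rng=rng)
probs_large = simulate_choice_probs(v, cov, n_draws=200_000, rng=rng)

print("100 draws:    ", probs_small)
print("200000 draws: ", probs_large)
```

Both vectors sum to one by construction, but the estimate based on 100 draws fluctuates noticeably from seed to seed, which is the source of the simulation bias in MSL when the number of replications is too small relative to the sample size.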

As an alternative, several authors have suggested that analytic approximations might have the potential to offer a faster way to estimate MNP models. In fact, the use of analytic approximations – namely the Clark approximation – to estimate MNP models predated the introduction of MSL (Daganzo et al., 1977). However, with the advent of MSL this approximation was deemed to be too imprecise and, therefore, its importance vanished (Horowitz et al., 1982). In general analytic approximations of the MVNCDF are known to be less accurate than quadrature and simulation methods (Joe, 1995) but they share neither the infeasibility of quadrature methods nor the long computation times of MSL for large choice sets.

Utilizing an analytic approximation, Bhat (2011) introduced the maximum approximate composite marginal likelihood (MACML) approach for simulation-free estimation of MNP models, combining the Solow-Joe (SJ)-approximation for the MVNCDF and composite marginal likelihood (CML) estimation with the specific aim of speeding up the estimation of complex MNP models. In a first simulation study MACML is reported to be up to 350 times faster than MSL estimation while the accuracy of parameter recovery was at least on par with the latter (Bhat and Sidhartan, 2011). A later simulation study revealed that the difference in computation times shrinks once MSL estimation is also performed within the CML framework, but that there is still a reasonable performance gain which is then fully attributable to the analytic approximation (Patil et al., 2017). Furthermore, Patil et al. (2017) report that MACML in some settings is faster and more accurate than several competing estimation procedures including two GHK variants as well as Bayesian Markov Chain Monte Carlo (MCMC) estimation. Similar findings are also reported in Fu and Juan (2017).

However, the theory underlying the MACML approach still lacks an important piece as we are not aware of any results addressing the consistency properties of the estimator. While the consistency of CML methods is rather well understood (see Varin et al., 2011), it is unclear how the MVNCDF approximation interferes with estimation.

The MACML method as proposed in Bhat (2011) is based on the SJ-approximation, but simulations in Connors et al. (2014) provide evidence that the Mendell-Elston (ME)-approximation is superior to the SJ-approximation with regard to accuracy of MVNCDF approximation and with respect to computation time. Furthermore, Trinh and Genz (2015) and Bhat (2018) have recently presented improved variants of the ME-approximation, which are also considered in this paper.

In order to enhance the understanding of the properties of MACML estimation we study the large sample properties of various MACML estimators using different approximations to the MVNCDF. We demonstrate using a simple cross sectional model that, by normalizing approximated probabilities to sum to one, the MACML estimators of the choice proportions for a given value of the regressors are consistent and asymptotically normal and hence follow standard asymptotics. In the example we also show that the estimators for the corresponding underlying parameters are potentially biased. Additionally, there is no guarantee that the unnormalized choice proportion estimators are consistent.
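The normalization step can be sketched as follows. The raw values in `p_raw` are hypothetical stand-ins for the output of an analytic MVNCDF approximation (they are not taken from the paper), and the helper names are illustrative. The second part shows the standard multinomial argument, not the paper's proof: among probability vectors that sum to one, the log-likelihood is maximized at the observed choice shares.

```python
import numpy as np


def normalize_probs(p_approx):
    """Rescale approximated choice probabilities so that they sum to one."""
    p = np.asarray(p_approx, dtype=float)
    if np.any(p <= 0.0):
        raise ValueError("approximated probabilities must be positive")
    return p / p.sum()


def multinomial_loglik(counts, probs):
    """Multinomial log-likelihood up to an additive constant."""
    return float(np.sum(np.asarray(counts) * np.log(probs)))


# Hypothetical approximated probabilities that sum to 0.97 instead of 1.
p_raw = np.array([0.42, 0.30, 0.25])
p_norm = normalize_probs(p_raw)

# Hypothetical observed choice counts for the same three alternatives.
counts = np.array([40, 35, 25])
shares = counts / counts.sum()

ll_at_shares = multinomial_loglik(counts, shares)
ll_at_norm = multinomial_loglik(counts, p_norm)

print("normalized probabilities:", p_norm, "sum =", p_norm.sum())
print("log-likelihood at observed shares:", ll_at_shares)
print("log-likelihood at normalized approximation:", ll_at_norm)
```

Without the constraint that the probabilities sum to one, this maximization argument no longer applies, which is one way to see why unnormalized approximations offer no such guarantee.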

The example is subsequently extended to investigate the estimation of the map linking regressor values to choice proportions. It is demonstrated that each approximation concept implements one particular such mapping that is close but not identical to the one implied by the MNP. The MACML estimators based on normalized proportions are shown to be consistent estimators of the mapping implied by the approximation concept used, while they are potentially biased estimators of the mapping implied by the MNP model. By way of contrast, a nonparametric estimator of the mapping will lead to consistent estimation of the MNP model choice proportions while still leading to asymptotically biased estimators of the corresponding parameters.

Building on those insights, all previously analyzed approximation concepts including MSL are compared with respect to their finite sample properties and their asymptotic biases. The evidence leads to the conclusion that, at least in the investigated examples, methods using normalized proportions provide superior estimators. Among the various approximation concepts no method dominates all others, although some methods are clearly worse than others. Even though we specifically focus on the MACML method, our results have important implications whenever an analytic approximation is used in the context of MNP estimation (see e.g. Ochi and Prentice (1984), Kamakura (1989), Waddington and Thompson (2004) and Martinetti and Geniaux (2017)).

The outline of the paper is as follows: In section 2 the model used for demonstration purposes is introduced and the main underlying estimation and specification results are provided. Section 3 describes the various approximation concepts used and derives the properties of the corresponding maps connecting regressors to choice probabilities. Section 4 then presents and discusses our simulation results and section 5 concludes the paper.

Consistent estimation of proportions in the MNP model

In this section the asymptotic properties of the MACML estimation method are investigated. It is demonstrated that in a certain sense standard inference including consistency and asymptotic normality can be obtained under general assumptions on the approximation concept used. In this respect it will prove vital to differentiate between the estimation of choice proportions, Pj(X) say, conditional on a given regressor vector X and estimation of the underlying parameters. It will be seen that for

Methods for the approximation of choice-probabilities

This section discusses several approximation methods for the MVNCDF $\Phi_J(b; 0, R)$. The approximation will be described for $J = 4$, which leads to a three-dimensional MVNCDF, for simplicity of arguments.

Since $\Phi_3(Db; 0, DRD) = \Phi_3(b; 0, R)$ holds for diagonal transformation matrices $D$ with positive diagonal entries, we will in the following assume without loss of generality that the coordinates have been scaled such that $R = [\rho_{ij}] \in \mathbb{R}^{3 \times 3}$ is a correlation matrix.
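The scaling invariance can be checked numerically. The sketch below uses SciPy's `multivariate_normal` distribution to evaluate the MVNCDF; the covariance matrix `sigma` and the limits `b` are illustrative assumptions, and `sigma` plays the role of the unscaled matrix that is turned into a correlation matrix by the diagonal scaling `D`.

```python
import numpy as np
from scipy.stats import multivariate_normal


def mvn_cdf(b, cov):
    """P(Z <= b) for Z ~ N(0, cov), evaluated numerically by SciPy."""
    dist = multivariate_normal(mean=np.zeros(len(b)), cov=cov)
    return float(dist.cdf(b))


# Illustrative covariance matrix and upper integration limits (assumed values).
sigma = np.array([[4.0, 1.2, 0.6],
                  [1.2, 1.0, 0.3],
                  [0.6, 0.3, 2.25]])
b = np.array([1.0, 0.5, -0.3])

# D rescales each coordinate by its inverse standard deviation,
# so that R = D sigma D has a unit diagonal, i.e. is a correlation matrix.
D = np.diag(1.0 / np.sqrt(np.diag(sigma)))
R = D @ sigma @ D

p_original = mvn_cdf(b, sigma)
p_scaled = mvn_cdf(D @ b, R)

print("correlation matrix R:\n", R)
print("unscaled probability:", p_original)
print("scaled probability:  ", p_scaled)
```

The two probabilities agree up to the numerical integration tolerance of the SciPy routine, so working with a correlation matrix entails no loss of generality.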

All approximation methods are based on

Numerical examples

In this section the properties of estimators obtained by maximizing the approximated composite marginal likelihood are discussed. Here estimation is performed using observations $\omega_N \in \Omega_N$ of sample size $N$, where $\omega_N$ accounts for all explained and explanatory variables. In the motivating example $\omega_N = [y_n]_{n=1,\dots,N}$.

The scaled logarithm of the composite likelihood is denoted as $ll_N(\theta; \omega_N)$ where $\theta \in \Theta_M$

Discussion and conclusion

In this paper the consistency of MACML methods based on a number of approximations is investigated. We show that standard asymptotic properties hold if the data generating process and the model coincide. Otherwise consistency can be achieved by using a flexible model obtained for example by suitable localization. We argue that approximations to the MVNCDF – when properly normalized – can be interpreted as defining a particular mapping of regressor vectors onto conditional choice proportions.

Conflicts of interest

None.

Acknowledgments

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. The authors would like to thank Daniel Rodenburger for pointing the authors to the idea of unifying the presentation of the ME and SJ approximation as well as for carrying out some preliminary simulations.

References

  • R. Connors et al., Analytic approximation for computing probit choice probabilities, Transportmetrica (2014)
  • C.F. Daganzo et al., Multinomial probit and qualitative choice: a computationally efficient algorithm, Transport. Sci. (1977)
  • X. Fu et al., Estimation of multinomial probit-kernel integrated choice and latent variable model: comparison on one sequential and two simultaneous approaches, Transportation (2017)
  • A. Genz, MVNXPB, a MATLAB/Octave function for the approximation of multivariate Normal probabilities
  • L. Györfi et al., A Distribution-free Theory of Nonparametric Regression (2002)
  • V.A. Hajivassliliou, Practical issues in maximum simulated likelihood
  • S. Hess et al., Handbook of Choice Modelling (2011)
  • J.L. Horowitz et al., An investigation of the accuracy of the Clark approximation for the multinomial probit model, Transport. Sci. (1982)