Bayesian and classical estimation of mixed logit: An application to genetic testing

https://doi.org/10.1016/j.jhealeco.2008.11.003

Abstract

Discrete choice experiments (DCEs) in health economics have recently used the mixed logit (MXL) model to incorporate preference heterogeneity. These studies typically use a classical approach to estimation or have specified normal distributions for the attributes. Specifying normal distributions can lead to erroneous interpretation; non-normal distributions may cause problems with convergence to the global maximum of the simulated log-likelihood function. Hierarchical Bayes (HB) of MXL is an alternative estimation approach that may alleviate problems of convergence. We investigated Bayesian and classical approaches to MXL estimation using a DCE that elicited preferences for a genetic technology. The classical approach produced unrealistic results in one of the econometric specifications, which led to an erroneous willingness to pay estimate. The HB procedure produced reasonable results for both specifications and helped ascertain that the classical procedures were converging at a local maximum.

Introduction

Demand models for health technologies aim to estimate the value that individuals place on different factors of a good that affect their choice. Under a discrete choice experiment (DCE) framework, respondents choose between alternatives that differ on several attributes. Including a cost attribute allows for an empirical measure of willingness to pay (WTP) for incremental changes between the attributes of the novel technology and the current standard. The multinomial logit and nested logit models are well-established behavioural specifications used to model DCE data. These models can be limiting because they assume (1) the parameter coefficient is the same for each person (or the same for each person within a group); (2) proportional substitution patterns; and (3) independence over choices of individuals’ unobserved factors (McFadden, 1974).

The mixed logit (MXL) behavioural model is increasingly used in health economics because it allows for realistic substitution patterns and correlation of unobserved factors across choice questions or over time. The MXL can also incorporate random taste variation that permits the estimation of individual partworths, identification of outliers, and calculation of more accurate choice probabilities (Train and Sonnier, 2005). McFadden and Train (2000) show that any discrete choice random utility model can be approximated to any degree of accuracy by an MXL.
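For reference, the MXL choice probability is commonly written as a logit probability integrated over the distribution of the partworths. The statement below follows the standard textbook form (Train, 2003); the notation is an assumption introduced here for illustration and is not taken from this paper: respondent n, choice question t, chosen alternative i_t, attribute vector x, partworths β, and mixing density f with population parameters θ.

```latex
P_n(\theta) \;=\; \int \prod_{t=1}^{T}
  \frac{\exp\!\left(\beta' x_{n i_t t}\right)}
       {\sum_{j=1}^{J} \exp\!\left(\beta' x_{n j t}\right)}
  \, f(\beta \mid \theta)\, d\beta
```

Because the same β enters all T choices of a respondent, unobserved tastes are correlated across that respondent's choice questions.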

In health economics, the MXL has typically been estimated using a classical, or frequentist, approach and with attribute partworths following normal distributions or specified as fixed (Hall et al., 2006, Johnson et al., 2000, King et al., 2007, Lancsar et al., 2007). Classical estimation of MXL uses maximum simulated likelihood estimation (MSL). MSL combines the maximum likelihood estimates of a random attribute's mean and standard deviation with choices from each respondent (Huber and Train, 2001). MSL works well when normal distributions are specified for the partworths, but reaching the maximum of the simulated likelihood function can be difficult if the starting values on the parameters are far from their maximum or when bounded distributions (e.g., log-normal) are specified.
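As a rough illustration of what MSL maximizes, the sketch below computes a simulated log-likelihood for an MXL with independent normal partworths. It is a minimal sketch under stated assumptions, not the estimation code used in this study: the function name, array layout, and number of draws are all hypothetical.

```python
import numpy as np

def simulated_log_likelihood(mean, sd, X, chosen, n_draws=200, seed=0):
    """Simulated log-likelihood of a mixed logit with independent normal partworths.

    Assumed (hypothetical) data layout:
    X:      float array (N, T, J, K) of attribute levels for
            N respondents, T choice questions, J alternatives, K attributes
    chosen: int array (N, T), index of the alternative picked in each question
    """
    rng = np.random.default_rng(seed)  # fixed seed: same draws at every evaluation
    N, T, J, K = X.shape
    # One set of partworth draws per respondent, reused across that
    # respondent's T questions.
    z = rng.standard_normal((N, n_draws, K))
    beta = mean + sd * z                                # (N, R, K)
    utility = np.einsum("ntjk,nrk->nrtj", X, beta)      # (N, R, T, J)
    utility -= utility.max(axis=-1, keepdims=True)      # numerical stability
    prob = np.exp(utility)
    prob /= prob.sum(axis=-1, keepdims=True)            # logit probabilities
    idx = np.broadcast_to(chosen[:, None, :, None], (N, n_draws, T, 1))
    p_chosen = np.take_along_axis(prob, idx, axis=-1)[..., 0]  # (N, R, T)
    seq_prob = p_chosen.prod(axis=-1)                   # probability of the observed sequence
    sim_prob = seq_prob.mean(axis=1)                    # average over draws
    return np.log(sim_prob).sum()
```

A numerical optimizer would then search over `mean` and `sd` for the values that maximize this function. The difficulty described above arises because this surface need not be well behaved, so the search can stall or stop at a local maximum when starting values are poor or when bounded distributions replace the normal.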

Bounded distributions can produce non-quadratic likelihood functions that create problems with estimation (Train, 2003), but are often necessary to ensure the distribution of a random parameter is consistent with economic theory. For example, in an MXL application published in this journal (Hall et al., 2006), the cost attributes (cost of genetic carrier testing for Tay-Sachs disease or cystic fibrosis) were specified to follow normal distributions. A closer look at the distribution on the cost parameter for cystic fibrosis revealed that just over 1% of the draws were positive, implying that a percentage of individuals prefer higher prices to lower prices. This result has implications if the investigators aim to estimate WTP (the ratio of the attribute's coefficient to the negative of the price coefficient). Namely, a percentage of the WTP estimates will be problematic, either because the cost parameter has the wrong sign, which prevents the calculation of WTP, or because some of the coefficients on price are extremely small or approach zero, which leads to WTP estimates that are undefined or untenably large.
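The sign problem can be reproduced with a small simulation. The sketch below is illustrative only: every parameter value is hypothetical and chosen so that roughly 1% of normal cost draws are positive. None of the values are estimates from Hall et al. (2006) or from this study.

```python
import numpy as np

rng = np.random.default_rng(42)
n_draws = 100_000

# Hypothetical values for illustration only (not estimates from any study).
attr_coef = 0.5                   # partworth for a non-cost attribute
cost_mean, cost_sd = -1.0, 0.43   # normal cost coefficient

# Normal cost coefficient: unbounded, so some draws are positive.
cost_normal = rng.normal(cost_mean, cost_sd, n_draws)
share_wrong_sign = (cost_normal > 0).mean()

# WTP = attribute coefficient / (negative of the cost coefficient)
wtp_normal = attr_coef / -cost_normal

# Bounded alternative: the negative of a log-normal draw is strictly negative.
cost_lognormal = -rng.lognormal(mean=0.0, sigma=0.4, size=n_draws)
wtp_lognormal = attr_coef / -cost_lognormal

print(f"share of positive cost draws (normal): {share_wrong_sign:.3%}")
print(f"WTP range (normal):     {wtp_normal.min():.1f} to {wtp_normal.max():.1f}")
print(f"WTP range (log-normal): {wtp_lognormal.min():.2f} to {wtp_lognormal.max():.2f}")
```

Under the normal specification, draws near zero produce very large WTP values in absolute terms and positive draws flip the sign of WTP. The log-normal specification keeps every WTP draw positive, at the cost of the harder likelihood surface noted above.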

An alternative estimation approach to MXL has been developed in the Bayesian tradition. Hierarchical Bayes (HB) of MXL uses Markov Chain Monte Carlo (MCMC) techniques to obtain the joint posterior distribution of parameters (Train, 2003). HB combines each individual's choices with population parameters estimated in the mixing distribution to derive person-specific, conditional posterior estimates (Huber and Train, 2001). The posterior embodies everything that is known about the sample and therefore contains the relevant information for finite sample inference at any sample size (Allenby and Rossi, 1999). Unlike MSL, HB does not require the maximization of a likelihood function, which alleviates problems of convergence due to poor starting values or the inclusion of bounded distributions (Train, 2003). Bayesian procedures can also be complicated to estimate. To simulate the relevant statistics, HB estimation uses an iterative process that, given a sufficient number of iterations, converges to draws from the posterior. Whether the MCMC algorithm is actually drawing from the posterior distribution is not easily ascertained (Kass et al., 1998).
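One widely used check on whether MCMC chains have reached the posterior is the potential scale reduction factor of Gelman and Rubin (1992), which compares between-chain and within-chain variance across several independently started chains. The sketch below is a minimal version of that statistic for a single scalar parameter. It is offered as an illustration and is not a description of the diagnostics applied in this paper; the function name and array layout are assumptions.

```python
import numpy as np

def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for one scalar parameter.

    chains: array (m, n) holding m independent chains of n post-burn-in draws
            (hypothetical layout). Values near 1 suggest the chains have
            mixed; values well above 1 suggest they have not.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)           # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()     # average within-chain variance
    var_hat = (n - 1) / n * W + B / n         # pooled variance estimate
    return np.sqrt(var_hat / W)
```

A value close to 1 is necessary but not sufficient evidence of convergence, which is consistent with the caution expressed by Kass et al. (1998).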

Despite coming from different estimation algorithms and interpretive philosophies of probability, HB and MSL procedures are related and for large samples the estimates from the two procedures may converge asymptotically (Huber and Train, 2001). The two procedures may provide different results in small samples because they differ in their estimation algorithms and perspective on probability (Huber and Train, 2001). With a view to informing the relative advantages of the HB and MSL estimation techniques applied to the MXL, this paper applies these approaches to a DCE eliciting societal preferences for a novel genetic technology to identify genetic causes of developmental delay/mental retardation (MR). A parallel objective of this paper is to specify heterogeneity distributions that a priori follow economic intuition and to examine the distribution of preferences for the attributes and its effect on WTP.

The remainder of this paper is divided into six sections. Section 2 provides a brief overview of MR, its effect on children and families, and a review of testing options to identify genetic causes of MR. In Section 3, the questionnaire design, recruitment process and administration are discussed. Section 4 reviews the MXL behavioural model and provides an overview of the classical and Bayesian approaches to estimating the MXL. In Section 5, the econometric modeling strategies are discussed. Section 6 presents the results of the DCE under both approaches, and Section 7 concludes with a discussion of the results and relative merits of the two procedures.

Section snippets

DCE application: genetic testing

MR affects approximately 1% of the population and impacts not only the person with MR, but also his/her family and society (Crow and Tolmie, 1998). Individuals with MR learn and develop more slowly than a typical peer of the same age (Shevell et al., 2003). Family members of children with MR may experience emotional distress, and parents often choose not to have future children unless the recurrence risk of MR is low or prenatal testing is made available (Rosenthal et al., 2001). Establishing

Identification of attributes, questionnaire design, and administration

The attributes and levels for the DCE were informed through expert interviews with geneticists and genetic counselors from the University of British Columbia. After two pilot studies eliciting preferences from geneticists in the first instance and families in the second, three attributes were included in the DCE design: (1) number of children tested whose genetic condition is identified with this test (levels: 10 in 100 tested, 14 in 100 tested, 20 in 100 tested, 25 in 100 tested); (2) time

The mixed logit model

The following section describes the MXL behavioural model and the classical and Bayesian approaches to estimating the MXL; this section draws heavily from Train (2003) and Train and Sonnier (2005).

Econometric models

Two econometric models were examined using both the classical and Bayesian approaches: model 1 (M1) was an all parameters random specification and model 2 (M2) specified coefficients that were both fixed and random. In both models, partworths were estimated for number of children tested whose genetic condition is identified with this test, time waiting for results, cost to you, and the neither test alternative. The cost attribute was scaled to be in hundreds of Canadian dollars. The neither

Results

Ethics approval for the DCE was granted from the Behavioural Research Ethics Board, University of British Columbia. 510 individuals completed all 16 choice questions; nine respondents were excluded because of evidence they ‘clicked-through’ the survey. Table 1 provides an overview of the sociodemographic variables. The average age of respondents was 49 years. Income was defined as total family income; in our sample, 18, 46 and 26% of individuals were in the low (income  $20,000), middle (income > 

Discussion

This paper applied the HB and MSL procedures to estimate an MXL with parameter distributions that followed economic intuition or were required because of concerns with numerical identification. Published studies in health economics have either used the classical approach to estimation with normal distributions (including fixed parameters) specified for the attributes, or have used the HB approach with normal distributions (Negrin et al., 2008). Specifying normal distributions for each

Acknowledgements

This study was supported by funding from Genome Canada, Genome British Columbia, and the Canadian Foundation for Innovation. This work was carried out whilst Dean Regier was completing his PhD at the University of Aberdeen, which was supported by a Canadian Institutes of Health Research Doctoral Research Award (Institute of Genetics) and by a University of Aberdeen 6th Century Studentship. Professor Mandy Ryan is supported by a Personal Chair in Health Economics by the University of Aberdeen.

References (40)

  • A. Gelman et al. Inference from iterative simulation using multiple sequences. Statistical Science (1992)
  • W.M. Hanemann. Applied welfare analysis with discrete choice models
  • D. Hensher et al. The mixed logit model: the state of practice. Transportation (2003)
  • J. Huber et al. On the similarity of classical and Bayesian estimates of individual mean partworths. Marketing Letters (2001)
  • F.R. Johnson et al. Willingness to pay for improved respiratory and cardiovascular health: a multiple-format, stated-preference approach. Health Economics (2000)
  • R.E. Kass et al. Markov Chain Monte Carlo in practice: a roundtable discussion. The American Statistician (1998)
  • M.T. King et al. Patient preferences for managing asthma: results from a discrete choice experiment. Health Economics (2007)
  • E.J. Lancsar et al. Using discrete choice experiments to investigate subject preferences for preventive asthma medication. Respirology (2007)
  • D. McFadden. Conditional logit analysis of qualitative choice behavior
  • D. McFadden et al. Mixed MNL models for discrete response. Journal of Applied Econometrics (2000)