Normalized power prior Bayesian analysis

https://doi.org/10.1016/j.jspi.2021.05.005

Highlights

  • A normalized power prior is investigated.

  • Optimality in the sense of minimizing the Kullback–Leibler divergences is established.

  • Posterior behaviors are studied analytically and numerically.

Abstract

The elicitation of power priors, based on the availability of historical data, is realized by raising the likelihood function of the historical data to a fractional power δ, which quantifies the degree of discounting of the historical information in making inference with the current data. When δ is not pre-specified and is treated as random, it can be estimated from the data using the Bayesian updating paradigm. However, in the original form of the joint power prior Bayesian approach, the likelihood of the historical data may be multiplied by different positive constants depending on which sufficient statistics are employed. Different constants yield different power priors, and hence the likelihood principle is violated.

In this article, we investigate a normalized power prior approach which obeys the likelihood principle and is a modified form of the joint power prior. The optimality properties of the normalized power prior in the sense of minimizing the weighted Kullback–Leibler divergence are investigated. By examining the posteriors of several commonly used distributions, we show that the discrepancy between the historical and the current data can be well quantified by the power parameter under the normalized power prior setting. Efficient algorithms to compute the scale factor are also proposed. In addition, we illustrate the use of the normalized power prior Bayesian analysis with three data examples, and provide an implementation in the R package NPP.

Introduction

In applying statistics to real experiments, it is common that the sample size in the current study is inadequate to provide enough precision for parameter estimation, while plenty of historical data or data from similar research settings are available. For example, when designing a clinical study, historical data on the standard of care might be available from other clinical studies or a patient registry. Due to the nature of sequential information updating, it is natural to use a Bayesian approach with an informative prior on the model parameters to incorporate these historical data. Though the current and historical data are usually assumed to follow distributions from the same family, the population parameters may change somewhat over time and/or across experimental settings. How to adaptively incorporate the historical data while accounting for this heterogeneity becomes a major concern for informative prior elicitation.

To address this issue, Ibrahim and Chen (1998), and thereafter Chen et al. (2000), Ibrahim and Chen (2000), and Ibrahim et al. (2003) proposed the concept of power priors, based on the availability of historical data. The basic idea is to raise the likelihood function based on the historical data to a power parameter δ (0 ≤ δ ≤ 1) that controls the influence of the historical data. Its relationship with hierarchical models is also shown by Chen and Ibrahim (2006). For a comprehensive review of the power prior, we refer the readers to the seminal article (Ibrahim et al., 2015). The power parameter δ can be fixed in advance according to external information. It is also possible to search for a reasonable level of information borrowing from the prior-data conflict via sensitivity analysis according to certain criteria. For example, Ibrahim et al. (2012a) suggested the use of the deviance information criterion (Spiegelhalter et al., 2002) or the logarithm of the pseudo-marginal likelihood. The choice of δ would depend on the criterion of interest.
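To make the role of a fixed δ concrete, the following is a minimal sketch for a Bernoulli success probability, a model chosen here only for illustration. It assumes a Beta(a, b) initial prior, in which case the power prior and the resulting posterior stay in the Beta family. The function names and the example counts are hypothetical and are not taken from the article or from the NPP package.

```python
def power_prior_posterior(y, n, y0, n0, delta, a=1.0, b=1.0):
    """Beta posterior parameters under a power prior with fixed delta.

    Assumed setup (illustrative only):
      initial prior     theta ~ Beta(a, b)
      historical data   y0 successes in n0 Bernoulli trials
      current data      y successes in n Bernoulli trials
    Raising the historical likelihood to the power delta multiplies the
    historical counts by delta before they are added to the prior.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    alpha = a + delta * y0 + y
    beta = b + delta * (n0 - y0) + (n - y)
    return alpha, beta


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


if __name__ == "__main__":
    # delta = 0 ignores the historical data, delta = 1 pools it fully.
    for delta in (0.0, 0.5, 1.0):
        alpha, beta = power_prior_posterior(y=6, n=20, y0=60, n0=100, delta=delta)
        print(f"delta={delta:.1f}  posterior mean={beta_mean(alpha, beta):.3f}")
```

In this sketch δ acts as a weight on the historical sample size: the historical data contribute δ·n0 pseudo-observations to the posterior.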

Ibrahim and Chen (2000) and Chen et al. (2000) generalized the power prior with a fixed δ to a random δ by introducing the joint power priors. They specified a joint prior distribution directly for both δ and θ, the parameters in consideration, in which an independent proper prior for δ was considered in addition to the original form of the power prior. Hypothetically, when the initial prior for δ is vague, the magnitude of borrowing would be mostly determined by the heterogeneity between the historical and the current data. However, under the joint power priors, the posterior distributions vary with the constants before the historical likelihood functions, which violates the likelihood principle (Birnbaum, 1962). It raises a critical question regarding which likelihood function should be used in practice. For example, the likelihood function based on the raw data and the likelihood function based on the sufficient statistics could differ by a multiplicative constant. This would likely yield different posteriors. Therefore, the joint power prior may not be appropriate (Neuenschwander et al., 2009). Furthermore, the power parameter has a tendency to be close to zero empirically, which suggests that much of the historical data may not be used in decision making (Neelon and O’Malley, 2010).
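The dependence on the multiplicative constant can be checked numerically. The sketch below is an illustration under stated assumptions, not the article's own computation: it uses Bernoulli data with a Beta(a, b) initial prior for θ and a uniform prior for δ, and approximates the marginal posterior of δ on a grid. Integrating θ out of the joint power prior posterior leaves a weight proportional to c^δ · B(a + δ·y0 + y, b + δ·(n0 − y0) + n − y), where c is the constant in front of the historical likelihood (c = 1 for the product of Bernoulli densities, c equal to the binomial coefficient for the binomial form). All function names and counts are hypothetical.

```python
import math


def log_beta(p, q):
    """Logarithm of the Beta function B(p, q)."""
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)


def log_binom_coef(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def delta_posterior_mean(y, n, y0, n0, log_c, a=1.0, b=1.0, grid=2001):
    """Grid approximation to E[delta | D0, D] under the joint power prior.

    log_c is the log of the constant multiplying the historical likelihood;
    the prior on delta is assumed uniform on [0, 1].
    """
    deltas = [i / (grid - 1) for i in range(grid)]
    log_w = [
        d * log_c + log_beta(a + d * y0 + y, b + d * (n0 - y0) + n - y)
        for d in deltas
    ]
    top = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - top) for v in log_w]
    return sum(d * wi for d, wi in zip(deltas, w)) / sum(w)


if __name__ == "__main__":
    y, n, y0, n0 = 10, 20, 50, 100  # identical success rates in both data sets
    bern = delta_posterior_mean(y, n, y0, n0, log_c=0.0)
    binom = delta_posterior_mean(y, n, y0, n0, log_c=log_binom_coef(n0, y0))
    print(f"Bernoulli form: E[delta] ~ {bern:.3f}")
    print(f"Binomial form:  E[delta] ~ {binom:.3f}")
```

Both calls use the same data and the same model; only the constant differs, yet the posterior for δ shifts, which is the behavior described in the paragraph above.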

In this article, we investigate a modified power prior which was initially proposed by Duan et al. (2006) for a random δ. It is called the normalized power prior because it includes a scale factor. The normalized power prior obeys the likelihood principle. As a result, the posteriors can quantify the compatibility between the current and historical data automatically, and hence control the influence of historical data on the current study in a more sensible way.
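A short derivation may help show why the scale factor restores the likelihood principle. The display below uses the form of the normalized power prior usually attributed to Duan et al. (2006); the symbols π0(δ) for the initial prior on δ, Θ for the parameter space, and c for an arbitrary positive constant are notation assumed here for illustration.

$$
\pi(\theta,\delta \mid D_0) \;\propto\; \frac{L(\theta \mid D_0)^{\delta}\,\pi_0(\theta)}{\int_{\Theta} L(\theta \mid D_0)^{\delta}\,\pi_0(\theta)\,d\theta}\;\pi_0(\delta)
$$

Replacing L(θ|D0) by cL(θ|D0) leaves the ratio unchanged, because the factor c^δ appears in both the numerator and the denominator:

$$
\frac{\left[c\,L(\theta \mid D_0)\right]^{\delta}\pi_0(\theta)}{\int_{\Theta}\left[c\,L(\theta \mid D_0)\right]^{\delta}\pi_0(\theta)\,d\theta}
= \frac{c^{\delta}\,L(\theta \mid D_0)^{\delta}\,\pi_0(\theta)}{c^{\delta}\int_{\Theta} L(\theta \mid D_0)^{\delta}\,\pi_0(\theta)\,d\theta}
= \frac{L(\theta \mid D_0)^{\delta}\,\pi_0(\theta)}{\int_{\Theta} L(\theta \mid D_0)^{\delta}\,\pi_0(\theta)\,d\theta}
$$

Without the denominator, the factor c^δ would remain and would act as an extra, data-independent weight on δ.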

The goals of this work are threefold. First, we review the joint power prior and the normalized power prior that have been proposed in the literature. We aim to show that the joint power prior may not be appropriate for a random δ. Second, we carry out a comprehensive study of the properties of the normalized power prior both theoretically and numerically, shedding light on the posterior behavior in response to the data compatibility. Finally, we design efficient computational algorithms and provide practical implementations along with three data examples.

Section snippets

The normalized power prior

Suppose that θ is the parameter (vector or scalar) of interest and L(θ|D0) is the likelihood function of θ based on the historical data D0. In this article, we assume that the historical data D0 and current data D are independent random samples. Furthermore, denote by π0(θ) the initial prior for θ. Given the power parameter δ, Ibrahim and Chen (2000) defined the power prior of θ for the current study as $\pi(\theta \mid D_0, \delta) \propto L(\theta \mid D_0)^{\delta}\,\pi_0(\theta)$. The power parameter δ, a scalar in [0,1], measures the influence of

Optimality properties of the normalized power prior

In investigating the optimality properties of the normalized power priors, we use the idea of minimizing the weighted Kullback–Leibler (KL) divergence (Kullback and Leibler, 1951) that is similar to, but not the same as in Ibrahim et al. (2003).

Recall the definition of the KL divergence, $K(g,f)=\int_{\Theta}\log\left(\frac{g(\theta)}{f(\theta)}\right)g(\theta)\,d\theta$, where g and f are two densities with respect to Lebesgue measure. In Ibrahim et al. (2003), a loss function related to a target density g, denoted by Kg, is defined as the convex
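As a numerical illustration of the KL divergence K(g, f) recalled above, the following sketch evaluates the integral with a trapezoidal rule for two univariate normal densities and compares it with the well-known closed form for normals. The choice of densities, the integration range, and all function names are assumptions made for this example only.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def kl_numeric(g, f, lo, hi, steps=20000):
    """Trapezoidal approximation of K(g, f) = int log(g/f) g dtheta on [lo, hi].

    The range must be wide enough to hold almost all of g's mass and
    narrow enough that f does not underflow to zero inside it.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        gx, fx = g(x), f(x)
        term = gx * math.log(gx / fx) if gx > 0.0 else 0.0
        total += term * (0.5 if i in (0, steps) else 1.0)
    return total * h


def kl_normal_closed_form(mu_g, sd_g, mu_f, sd_f):
    """Closed-form K(g, f) for g = N(mu_g, sd_g^2) and f = N(mu_f, sd_f^2)."""
    return (
        math.log(sd_f / sd_g)
        + (sd_g ** 2 + (mu_g - mu_f) ** 2) / (2.0 * sd_f ** 2)
        - 0.5
    )


if __name__ == "__main__":
    g = lambda x: normal_pdf(x, 0.0, 1.0)
    f = lambda x: normal_pdf(x, 1.0, 1.5)
    print("numeric K(g, f):    ", kl_numeric(g, f, -12.0, 12.0))
    print("closed-form K(g, f):", kl_normal_closed_form(0.0, 1.0, 1.0, 1.5))
    print("numeric K(f, g):    ", kl_numeric(f, g, -12.0, 12.0))
```

The last line is included because K(g, f) is not symmetric in its arguments, so the order of g and f in any weighted loss matters.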

Posterior behavior of the normalized power prior

In this section we investigate the posteriors of both θ and δ under different settings of the observed statistics. We show that by using the normalized power prior, the resulting posteriors respond to the compatibility between D0 and D in the expected way. Under the joint power priors, however, the posteriors are sensitive to the form of the likelihood even when the data and model are the same.

Behavior of the square root of mean square error under the normalized power prior

We now investigate the influence of borrowing historical data in parameter estimation using the square root of the mean square error (rMSE) as the criterion. Several different approaches are compared, including the full borrowing (pooling), no borrowing, normalized power prior, and joint power prior. Two different likelihood forms are used for D0 in the joint power priors, with the same notation as in Section 4. The rMSE obtained by the Monte Carlo method, defined as $\sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(\hat{\theta}^{(i)}-\theta\right)^{2}}$, is used
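The Monte Carlo rMSE described above can be sketched for the two simplest approaches in the comparison, no borrowing and full borrowing (pooling). The example assumes a normal mean with known standard deviation and uses sample means as the estimators; both the current and the historical samples are redrawn in every replication. These choices, the function names, and the parameter values are assumptions for illustration; the normalized and joint power priors are not implemented here.

```python
import math
import random


def rmse(estimates, truth):
    """Square root of the Monte Carlo mean squared error."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


def simulate(theta, theta0, n, n0, sigma=1.0, m=2000, seed=1):
    """Return (rMSE without borrowing, rMSE with full pooling).

    theta   true mean of the current data (the estimation target)
    theta0  mean of the historical data; theta0 != theta models conflict
    n, n0   current and historical sample sizes
    m       number of Monte Carlo replications
    """
    rng = random.Random(seed)
    no_borrow, pooled = [], []
    for _ in range(m):
        cur = [rng.gauss(theta, sigma) for _ in range(n)]
        hist = [rng.gauss(theta0, sigma) for _ in range(n0)]
        no_borrow.append(sum(cur) / n)                      # current data only
        pooled.append((sum(cur) + sum(hist)) / (n + n0))    # all data pooled
    return rmse(no_borrow, theta), rmse(pooled, theta)


if __name__ == "__main__":
    for theta0 in (0.0, 0.5, 1.0):
        nb, po = simulate(theta=0.0, theta0=theta0, n=20, n0=40)
        print(f"theta0={theta0:.1f}  no borrowing={nb:.3f}  pooling={po:.3f}")
```

Pooling helps when the historical mean matches the current one and hurts when it does not, which is the trade-off an adaptive choice of δ is meant to balance.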

Water-quality assessment

In this example, we use measurements of pH to evaluate impairment of four sites in Virginia individually. pH data collected over a two-year or three-year period are treated as the current data, while pH data collected over the previous nine years represent a single historical data set. Of interest is the determination of whether the pH values at a site indicate that the site violates a (lower) standard of 6.0 more than 10% of the time. For each site, a larger sample size is associated with the

Summary and discussion

As a general class of informative priors for Bayesian inference, the power prior provides a framework to incorporate data from alternative sources, whose influence on statistical inference can be adjusted according to their availability and their discrepancy from the current data. It is semi-automatic, in the sense that it takes the form of raising the likelihood function based on the historical data to a fractional power regardless of the specific form of heterogeneity. As a consequence of

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

We warmly thank the anonymous referees and the associate editor for helpful comments and suggestions that led to an improved article. This work is partially supported by “the Fundamental Research Funds for the Central Universities, China” in UIBE (CXTD11-05) and a research grant by College of Business at University of Texas at San Antonio, United States.

Disclaimer

This article represents the views of the authors and should not be construed to represent FDA’s views or policies.

References (36)

  • Chen M.-H. et al. Power prior distributions for generalized linear models. J. Statist. Plann. Inference (2000)
  • Ibrahim J.G. et al. Prior distributions and Bayesian computation for proportional hazards models. Sankhya: Indian J. Statist. Ser. B (1998)
  • Ibrahim J.G. et al. Power prior distributions for regression models. Statist. Sci. (2000)
  • Ibrahim J.G. et al. On optimality properties of the power prior. J. Amer. Statist. Assoc. (2003)
  • Chen M.-H. et al. The relationship between the power prior and hierarchical models. Bayesian Anal. (2006)
  • Ibrahim J.G. et al. The power prior: Theory and applications. Stat. Med. (2015)
  • Ibrahim J.G. et al. Bayesian methods in clinical trials: a Bayesian analysis of ECOG trials E1684 and E1690. BMC Med. Res. Methodol. (2012)
  • Spiegelhalter D.J. et al. Bayesian measures of model complexity and fit. J. R. Stat. Soc. Ser. B Stat. Methodol. (2002)
  • Birnbaum A. On the foundations of statistical inference. J. Amer. Statist. Assoc. (1962)
  • Neuenschwander B. et al. A note on the power prior. Stat. Med. (2009)
  • Neelon B. et al. Bayesian analysis using power priors with application to pediatric quality of care. J. Biometr. Biostatist. (2010)
  • Duan Y. et al. Evaluating water quality using power priors to incorporate historical information. Environmetrics (2006)
  • Gamalo M.A. et al. Bayesian approach to the design and analysis of non-inferiority trials for anti-infective products. Pharm. Statist. (2014)
  • Gravestock I. et al. Power priors based on multiple historical studies for binary outcomes. Biom. J. (2019)
  • Banbeta A. et al. Modified power prior with multiple historical trials for binary endpoints. Stat. Med. (2019)
  • Ibrahim J.G. et al. Bayesian meta-experimental design: Evaluating cardiovascular risk in new antidiabetic therapies to treat type 2 diabetes. Biometrics (2012)
  • Chen M.-H. et al. Bayesian design of superiority clinical trials for recurrent events data with applications to bleeding and transfusion events in myelodyplastic syndrome. Biometrics (2014)
  • Chen M.-H. et al. Bayesian sequential meta-analysis design in evaluating cardiovascular risk in a new antidiabetic drug development program. Stat. Med. (2014)