Normalized power prior Bayesian analysis
Introduction
In applying statistics to real experiments, it is common that the sample size in the current study is too small to estimate parameters with adequate precision, while plentiful historical data, or data from similar research settings, are available. For example, when designing a clinical study, historical data on the standard care might be available from other clinical studies or a patient registry. Because Bayesian inference updates information sequentially, it is natural to use a Bayesian approach with an informative prior on the model parameters to incorporate these historical data. Though the current and historical data are usually assumed to follow distributions from the same family, the population parameters may change somewhat over time and/or across experimental settings. How to adaptively incorporate the historical data while accounting for this heterogeneity becomes a major concern in informative prior elicitation.
To address this issue, Ibrahim and Chen (1998), and thereafter Chen et al. (2000), Ibrahim and Chen (2000), and Ibrahim et al. (2003) proposed the concept of power priors, based on the availability of historical data. The basic idea is to raise the likelihood function based on the historical data to a power parameter $\delta$ that controls the influence of the historical data. The relationship between the power prior and hierarchical models is shown by Chen and Ibrahim (2006). For a comprehensive review of the power prior, we refer readers to the seminal article by Ibrahim et al. (2015). The power parameter $\delta$ can be fixed in advance according to external information. It is also possible to search for a reasonable level of information borrowing, in light of the prior-data conflict, via a sensitivity analysis based on certain criteria. For example, Ibrahim et al. (2012a) suggested the use of the deviance information criterion (Spiegelhalter et al., 2002) or the logarithm of the pseudo-marginal likelihood. The choice of $\delta$ would depend on the criterion of interest.
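To make the role of a fixed power parameter concrete, here is a minimal sketch of a sensitivity analysis over $\delta$. It assumes a Bernoulli model with a Beta(a, b) initial prior, in which case the power prior is conjugate; the function names and the data values are hypothetical and are not taken from the article.

```python
def power_prior_posterior(y0, n0, y, n, delta, a=1.0, b=1.0):
    """Beta posterior parameters for a Bernoulli success probability theta.

    Assumes a Beta(a, b) initial prior and a fixed power parameter delta in
    [0, 1] applied to the historical likelihood theta^y0 * (1 - theta)^(n0 - y0).
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    # Historical counts are discounted by delta; current counts enter in full.
    alpha = a + delta * y0 + y
    beta = b + delta * (n0 - y0) + (n - y)
    return alpha, beta


def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: 60/100 historical successes, 6/20 current successes.
y0, n0 = 60, 100
y, n = 6, 20

# Sensitivity analysis: how the posterior mean moves as delta grows.
for delta in (0.0, 0.25, 0.5, 1.0):
    alpha, beta = power_prior_posterior(y0, n0, y, n, delta)
    print(f"delta={delta:.2f}  posterior mean={posterior_mean(alpha, beta):.3f}")
```

With $\delta = 0$ the historical data are ignored, and with $\delta = 1$ they are pooled with the current data; intermediate values interpolate between the two.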
Ibrahim and Chen (2000) and Chen et al. (2000) generalized the power prior with a fixed $\delta$ to a random $\delta$ by introducing the joint power priors. They specified a joint prior distribution directly for both $\delta$ and $\theta$, the parameters under consideration, in which an independent proper prior for $\delta$ was considered in addition to the original form of the power prior. Hypothetically, when the initial prior for $\delta$ is vague, the magnitude of borrowing would be mostly determined by the heterogeneity between the historical and the current data. However, under the joint power priors, the posterior distributions vary with the constants multiplying the historical likelihood functions, which violates the likelihood principle (Birnbaum, 1962). This raises a critical question regarding which likelihood function should be used in practice. For example, the likelihood function based on the raw data and the likelihood function based on the sufficient statistics could differ by a multiplicative constant, and this would likely yield different posteriors. Therefore, the joint power prior may not be appropriate (Neuenschwander et al., 2009). Furthermore, the power parameter tends empirically to be close to zero, which suggests that much of the historical data may not be used in decision making (Neelon and O’Malley, 2010).
In this article, we investigate a modified power prior which was initially proposed by Duan et al. (2006) for a random $\delta$. It is called the normalized power prior since it includes a scale factor. The normalized power prior obeys the likelihood principle. As a result, the posteriors can quantify the compatibility between the current and historical data automatically, and hence control the influence of historical data on the current study in a more sensible way.
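The contrast between the two priors can be sketched in formulas. The notation below is an assumption, since this excerpt does not display the article's equations: $L(\theta \mid D_0)$ is the historical likelihood, $\pi_0(\theta)$ and $\pi_0(\delta)$ are the initial priors, and the subscripts "joint" and "norm" are labels introduced here for illustration.

```latex
% Joint power prior: a constant c multiplying L contributes c^{\delta},
% which depends on \delta and therefore does not cancel.
\pi_{\mathrm{joint}}(\theta, \delta \mid D_0)
  \propto L(\theta \mid D_0)^{\delta}\, \pi_0(\theta)\, \pi_0(\delta)

% Normalized power prior: the scale factor C(\delta) absorbs c^{\delta}.
\pi_{\mathrm{norm}}(\theta, \delta \mid D_0)
  \propto \frac{L(\theta \mid D_0)^{\delta}\, \pi_0(\theta)}{C(\delta)}\, \pi_0(\delta),
\qquad
C(\delta) = \int_{\Theta} L(\theta \mid D_0)^{\delta}\, \pi_0(\theta)\, d\theta
```

Replacing $L$ by $cL$ multiplies both the numerator and $C(\delta)$ by $c^{\delta}$, which is why the normalized form is unaffected by the choice of likelihood constant, provided $C(\delta)$ is finite.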
The goals of this work are threefold. First, we review the joint power prior and the normalized power prior that have been proposed in the literature. We aim to show that the joint power prior may not be appropriate for a random $\delta$. Second, we carry out a comprehensive study of the properties of the normalized power prior, both theoretically and numerically, shedding light on the posterior behavior in response to the data compatibility. Finally, we design efficient computational algorithms and provide practical implementations along with three data examples.
The normalized power prior
Suppose that $\theta$ is the parameter (vector or scalar) of interest and $L(\theta \mid D_0)$ is the likelihood function of $\theta$ based on the historical data $D_0$. In this article, we assume that the historical data $D_0$ and current data $D$ are independent random samples. Furthermore, denote by $\pi_0(\theta)$ the initial prior for $\theta$. Given the power parameter $\delta$, Ibrahim and Chen (2000) defined the power prior of $\theta$ for the current study as $\pi(\theta \mid D_0, \delta) \propto L(\theta \mid D_0)^{\delta}\, \pi_0(\theta)$. The power parameter $\delta$, a scalar in $[0, 1]$, measures the influence of
Optimality properties of the normalized power prior
In investigating the optimality properties of the normalized power priors, we use the idea of minimizing the weighted Kullback–Leibler (KL) divergence (Kullback and Leibler, 1951), which is similar to, but not the same as, that in Ibrahim et al. (2003).
Recall the definition of the KL divergence, $K(g, f) = \int \log\{g(\theta)/f(\theta)\}\, g(\theta)\, d\theta$, where $g$ and $f$ are two densities with respect to Lebesgue measure. In Ibrahim et al. (2003), a loss function related to a target density $g$, denoted by $K_g$, is defined as the convex
Posterior behavior of the normalized power prior
In this section we investigate the posteriors of both $\theta$ and $\delta$ under different settings of the observed statistics. We show that, by using the normalized power prior, the resulting posteriors respond to the compatibility between $D_0$ and $D$ in the expected way. Under the joint power priors, however, the posteriors are sensitive to the form of the likelihood, even for the same data and model.
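As a rough illustration of this response to compatibility, the following sketch computes the posterior mean of $\delta$ under a normalized power prior for a Bernoulli model. The flat Beta(1, 1) initial prior for $\theta$, the uniform prior for $\delta$, the grid approximation, and the data values are all assumptions made for this example.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def npp_delta_posterior_mean(y0, n0, y, n, grid_size=1001):
    """Posterior mean of delta under a normalized power prior (grid approximation).

    Bernoulli model, Beta(1, 1) initial prior for theta, uniform prior for delta.
    The marginal posterior of delta is proportional to
    B(d*y0 + y + 1, d*(n0 - y0) + n - y + 1) / B(d*y0 + 1, d*(n0 - y0) + 1).
    """
    deltas = [i / (grid_size - 1) for i in range(grid_size)]
    log_w = [
        log_beta(d * y0 + y + 1, d * (n0 - y0) + (n - y) + 1)
        - log_beta(d * y0 + 1, d * (n0 - y0) + 1)
        for d in deltas
    ]
    top = max(log_w)  # subtract the maximum to avoid underflow
    weights = [math.exp(v - top) for v in log_w]
    return sum(d * w for d, w in zip(deltas, weights)) / sum(weights)


# Hypothetical historical data: 30 successes in 100 trials.
compatible = npp_delta_posterior_mean(30, 100, y=6, n=20)    # current rate 0.30
conflicting = npp_delta_posterior_mean(30, 100, y=14, n=20)  # current rate 0.70

print(f"compatible data:  E[delta | D0, D] = {compatible:.3f}")
print(f"conflicting data: E[delta | D0, D] = {conflicting:.3f}")
```

When the current success rate matches the historical one, the posterior of $\delta$ shifts toward 1 (more borrowing); when the two rates conflict, it shifts toward 0.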
Behavior of the square root of mean square error under the normalized power prior
We now investigate the influence of borrowing historical data in parameter estimation using the square root of the mean square error (rMSE) as the criterion. Several different approaches are compared, including full borrowing (pooling), no borrowing, the normalized power prior, and the joint power prior. Two different likelihood forms are used in the joint power priors, with the same notation as in Section 4. The rMSE, obtained by the Monte Carlo method, is used
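A Monte Carlo estimate of the rMSE can be sketched as follows. This is a simplified stand-in rather than the article's study: it uses a Bernoulli model with a Beta(1, 1) initial prior, redraws the historical data in every replicate, and compares only no borrowing, pooling, and a fixed $\delta$; the random-$\delta$ normalized and joint power priors are not implemented here. The rMSE is taken to be the square root of the average squared difference between the posterior-mean estimates and the true parameter, and all names and settings are hypothetical.

```python
import math
import random


def rmse(estimates, truth):
    """Square root of the mean squared error of the estimates around truth."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


def draw_successes(rng, size, prob):
    """Number of successes in `size` independent Bernoulli(prob) trials."""
    return sum(rng.random() < prob for _ in range(size))


def simulate_rmse(theta, theta0, n=20, n0=100, delta=0.5, reps=2000, seed=1):
    """Monte Carlo rMSE of the posterior mean of theta under three borrowing rules.

    theta is the current success probability, theta0 the historical one.
    """
    rng = random.Random(seed)
    weights = {"no borrowing": 0.0, "fixed delta": delta, "pooling": 1.0}
    estimates = {name: [] for name in weights}
    for _ in range(reps):
        y0 = draw_successes(rng, n0, theta0)  # historical data
        y = draw_successes(rng, n, theta)     # current data
        for name, d in weights.items():
            # Posterior mean under a Beta(1, 1) initial prior and power d.
            estimates[name].append((1 + d * y0 + y) / (2 + d * n0 + n))
    return {name: rmse(vals, theta) for name, vals in estimates.items()}


compatible = simulate_rmse(theta=0.3, theta0=0.3)
conflicting = simulate_rmse(theta=0.3, theta0=0.6)
print("compatible: ", compatible)
print("conflicting:", conflicting)
```

Under these assumptions pooling has the smallest rMSE when the two data sources share the same parameter and the largest when they conflict, which is the trade-off an adaptive choice of $\delta$ is meant to balance.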
Water-quality assessment
In this example, we use measurements of pH to evaluate the impairment of four sites in Virginia individually. pH data collected over a two-year or three-year period are treated as the current data, while pH data collected over the previous nine years are treated as a single historical data set. Of interest is whether the pH values at a site indicate that the site violates a (lower) standard of 6.0 more than 10% of the time. For each site, a larger sample size is associated with the
Summary and discussion
As a general class of informative priors for Bayesian inference, the power prior provides a framework to incorporate data from alternative sources, whose influence on statistical inference can be adjusted according to their availability and their discrepancy from the current data. It is semi-automatic, in the sense that it takes the form of raising the likelihood function based on the historical data to a fractional power regardless of the specific form of heterogeneity. As a consequence of
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
We warmly thank the anonymous referees and the associate editor for helpful comments and suggestions that led to an improved article. This work is partially supported by “the Fundamental Research Funds for the Central Universities, China” in UIBE (CXTD11-05) and a research grant by the College of Business at the University of Texas at San Antonio, United States.
Disclaimer
This article represents the views of the authors and should not be construed to represent FDA’s views or policies.
References
- et al. Power prior distributions for generalized linear models. J. Statist. Plann. Inference (2000)
- et al. Prior distributions and Bayesian computation for proportional hazards models. Sankhya: Indian J. Statist. Ser. B (1998)
- et al. Power prior distributions for regression models. Statist. Sci. (2000)
- et al. On optimality properties of the power prior. J. Amer. Statist. Assoc. (2003)
- et al. The relationship between the power prior and hierarchical models. Bayesian Anal. (2006)
- et al. The power prior: Theory and applications. Stat. Med. (2015)
- et al. Bayesian methods in clinical trials: a Bayesian analysis of ECOG trials E1684 and E1690. BMC Med. Res. Methodol. (2012)
- et al. Bayesian measures of model complexity and fit. J. R. Stat. Soc. Ser. B Stat. Methodol. (2002)
- On the foundations of statistical inference. J. Amer. Statist. Assoc. (1962)
- et al. A note on the power prior. Stat. Med. (2009)