Synthesizing external aggregated information in the presence of population heterogeneity: A penalized empirical likelihood approach
Biometrics (IF 1.4), Pub Date: 2021-02-02, DOI: 10.1111/biom.13429
Ying Sheng, Yifei Sun, Chiung-Yu Huang, Mi-Ok Kim

With the increasing availability of data in the public domain, there has been a growing interest in exploiting information from external sources to improve the analysis of smaller-scale studies. An emerging challenge in the era of big data is that the subject-level data are high-dimensional, but the external information is at an aggregate level and of a lower dimension. Moreover, heterogeneity and uncertainty in the auxiliary information are often not accounted for in information synthesis. In this paper, we propose a unified framework to summarize various forms of aggregated information via estimating equations and develop a penalized empirical likelihood approach to incorporate such information in logistic regression. When the homogeneity assumption is violated, we extend the method to account for population heterogeneity among different sources of information. When the uncertainty in the external information is not negligible, we propose a variance estimator adjusting for the uncertainty. The proposed estimators are asymptotically more efficient than the conventional penalized maximum likelihood estimator and enjoy the oracle property even with a diverging number of predictors. Simulation studies show that the proposed approaches yield higher accuracy in variable selection compared with competitors. We illustrate the proposed methodologies with a pediatric kidney transplant study.
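
To make the phrase "summarize aggregated information via estimating equations" concrete, here is a minimal sketch. It is not the authors' implementation. It assumes one common form of external information: an external study reports only the coefficients `theta_ext` of a reduced logistic model that uses a subset of the covariates. Under homogeneity, the full-model coefficients `beta` then satisfy E[X_A (expit(X'beta) - expit(X_A'theta_ext))] = 0, where X_A is the covariate subset. All names (`simulate`, `fit_logistic`, `aggregate_constraint`) and the simulated data are hypothetical, and the sketch omits the penalty and the empirical likelihood step entirely.

```python
import numpy as np

rng = np.random.default_rng(0)

def expit(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def simulate(n, beta):
    """Draw (X, y) from a logistic model; column 0 of X is an intercept."""
    X = np.column_stack([np.ones(n), rng.standard_normal((n, len(beta) - 1))])
    y = rng.binomial(1, expit(X @ beta))
    return X, y

def fit_logistic(X, y, n_iter=25):
    """Unpenalized logistic regression fitted by Newton-Raphson."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ theta)
        grad = X.T @ (y - p)                       # score vector
        hess = (X * (p * (1 - p))[:, None]).T @ X  # observed information
        theta = theta + np.linalg.solve(hess, grad)
    return theta

def aggregate_constraint(X, beta, theta_ext, cols):
    """Sample mean of X_A * (expit(X beta) - expit(X_A theta_ext))."""
    X_a = X[:, cols]
    resid = expit(X @ beta) - expit(X_a @ theta_ext)
    return (X_a * resid[:, None]).mean(axis=0)

# Assumed setup: five coefficients in the full model, while the external
# source only used the intercept and the first covariate.
beta_true = np.array([-0.5, 1.0, 0.5, 0.0, -1.0])
cols = [0, 1]

# "External" study: large sample, but only the reduced-model fit is shared.
X_ext, y_ext = simulate(200_000, beta_true)
theta_ext = fit_logistic(X_ext[:, cols], y_ext)

# "Internal" study: subject-level covariates are available.
X_int, _ = simulate(20_000, beta_true)
g_true = aggregate_constraint(X_int, beta_true, theta_ext, cols)
g_null = aggregate_constraint(X_int, np.zeros_like(beta_true), theta_ext, cols)

print("constraint at true beta:", g_true)
print("constraint at beta = 0: ", g_null)
```

In this simulated setting the averaged estimating function should be close to zero at the true full-model coefficients and clearly away from zero at a wrong value, which is what lets such an equation act as a constraint on `beta`.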

Updated: 2021-02-02