Large-Scale Hypothesis Testing for Causal Mediation Effects with Applications in Genome-wide Epigenetic Studies
Journal of the American Statistical Association (IF 3.0) Pub Date: 2021-05-19, DOI: 10.1080/01621459.2021.1914634
Zhonghua Liu, Jincheng Shen, Richard Barfield, Joel Schwartz, Andrea A Baccarelli, Xihong Lin

Abstract

In genome-wide epigenetic studies, it is of great scientific interest to assess whether the effect of an exposure on a clinical outcome is mediated through DNA methylation. However, statistical inference for causal mediation effects is challenging because one needs to test a large number of composite null hypotheses across the whole epigenome. Two popular tests, the Wald-type Sobel’s test and the joint significance test using the traditional null distribution, are underpowered and thus can miss important scientific discoveries. In this article, we show that, under the composite null of no mediation effect, the null distribution of Sobel’s test is not the standard normal distribution and the null distribution of the joint significance test is not uniform, especially in finite samples and in the singular point null case in which the exposure has no effect on the mediator and the mediator has no effect on the outcome. Our results explain why these two tests are underpowered and, more importantly, motivate us to develop a more powerful divide-aggregate composite-null test (DACT) for the composite null hypothesis of no mediation effect by leveraging epigenome-wide data. We adopted Efron’s empirical null framework for assessing the statistical significance of the DACT. We showed analytically that the proposed DACT method had improved power and controlled the Type I error rate well. Our extensive simulation studies showed that, in finite samples, the DACT method properly controlled the Type I error rate and outperformed Sobel’s test and the joint significance test for detecting mediation effects. We applied the DACT method to the U.S. Department of Veterans Affairs Normative Aging Study, an ongoing prospective cohort study that included men who were aged 21 to 80 years at entry. We identified multiple DNA methylation CpG sites that might mediate the effect of smoking on lung function, with effect sizes ranging from -0.18 to -0.79 and the false discovery rate controlled at the 0.05 level, including CpG sites in the genes AHRR and F2RL3. Our sensitivity analysis found small residual correlations (less than 0.01) between the error terms of the outcome and mediator regressions, suggesting that our results are robust to unmeasured confounding factors. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
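The conservativeness of the two classical tests under the singular point null can be illustrated with a small simulation. The sketch below is not the authors' code and does not implement the DACT. It assumes that the exposure-mediator and mediator-outcome z-statistics (called `z_a` and `z_b` here, names chosen for illustration) are independent standard normal draws, that Sobel's statistic is written in terms of those z-statistics as z_a z_b / sqrt(z_a^2 + z_b^2), and that the joint significance test uses the larger of the two two-sided p-values.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal reference distribution


def two_sided_p(z):
    """Two-sided p-value of a z-statistic under N(0, 1)."""
    return 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))


def sobel_statistic(z_a, z_b):
    """Sobel's Wald-type statistic expressed through the two z-statistics.

    With a = z_a * se_a and b = z_b * se_b, the usual form
    a*b / sqrt(a^2 * se_b^2 + b^2 * se_a^2) reduces to the expression below.
    """
    denom = math.sqrt(z_a ** 2 + z_b ** 2)
    return 0.0 if denom == 0.0 else z_a * z_b / denom


def joint_significance_p(z_a, z_b):
    """Joint significance (max-p) test: both paths must be significant."""
    return max(two_sided_p(z_a), two_sided_p(z_b))


def rejection_rates(n_tests=200_000, alpha=0.05, seed=0):
    """Empirical Type I error of both tests under the singular point null.

    Assumption: z_a and z_b are independent N(0, 1), i.e. neither the
    exposure-mediator nor the mediator-outcome effect is present.
    """
    rng = random.Random(seed)
    sobel_hits = 0
    joint_hits = 0
    for _ in range(n_tests):
        z_a = rng.gauss(0.0, 1.0)
        z_b = rng.gauss(0.0, 1.0)
        # Sobel's test compared against the traditional N(0, 1) reference.
        if two_sided_p(sobel_statistic(z_a, z_b)) <= alpha:
            sobel_hits += 1
        # Max-p compared against the traditional uniform reference.
        if joint_significance_p(z_a, z_b) <= alpha:
            joint_hits += 1
    return sobel_hits / n_tests, joint_hits / n_tests


if __name__ == "__main__":
    sobel_rate, joint_rate = rejection_rates()
    print(f"Sobel rejection rate at alpha=0.05: {sobel_rate:.5f}")
    print(f"Joint significance rejection rate at alpha=0.05: {joint_rate:.5f}")
```

Under these assumptions, both empirical rejection rates should fall far below the nominal 0.05. The max-p test rejects only when both independent p-values are below the threshold, so its rate is close to 0.05 squared. Sobel's statistic is never larger in absolute value than the smaller of the two z-statistics, so it rejects even less often. This is the sense in which neither traditional null distribution is correct in the singular case.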




Updated: 2021-05-19