Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures
Journal of the American Statistical Association (IF 3.7), Pub Date: 2019-04-25, DOI: 10.1080/01621459.2018.1554485
Yaowu Liu, Jun Xie

Abstract: Combining individual p-values to aggregate multiple small effects is of long-standing interest in statistics, dating back to the classic Fisher’s combination test. In modern large-scale data analysis, correlation and sparsity are common features, and efficient computation is a necessary requirement for dealing with massive data. To overcome these challenges, we propose a new test that takes advantage of the Cauchy distribution. Our test statistic has a simple form and is defined as a weighted sum of Cauchy transformations of individual p-values. We prove a nonasymptotic result that the tail of the null distribution of our proposed test statistic can be well approximated by a Cauchy distribution under arbitrary dependency structures. Based on this theoretical result, the p-value calculation of our proposed test is not only accurate but also as simple as that of the classic z-test or t-test, making our test well suited for analyzing massive data. We further show that the power of the proposed test is asymptotically optimal in a strong sparsity setting. Extensive simulations demonstrate that the proposed test has both strong power against sparse alternatives and good accuracy in p-value calculations, especially for very small p-values. The proposed test has also been applied to a genome-wide association study of Crohn’s disease and compared with several existing tests. Supplementary materials for this article are available online.
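
The abstract describes the statistic only as a weighted sum of Cauchy transformations of the individual p-values, with the combined p-value read off a Cauchy tail. A minimal Python sketch of that idea follows. It assumes the transformation is tan((0.5 - p)π), that the weights are nonnegative and normalized to sum to one, and that the reference distribution is the standard Cauchy. The function name `cauchy_combination` and its signature are illustrative and not taken from the source.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with a Cauchy-transform statistic (illustrative sketch).

    Assumptions, not stated in the abstract itself:
      * each p-value is mapped to tan((0.5 - p) * pi);
      * weights are nonnegative and are normalized here to sum to one;
      * the combined p-value is the upper tail of a standard Cauchy.
    Returns (statistic, combined_p_value).
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("at least one p-value is required")
    if weights is None:
        weights = [1.0 / n] * n  # equal weights by default
    if len(weights) != n:
        raise ValueError("weights and p_values must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must not all be zero")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")

    # Weighted sum of the Cauchy-transformed p-values.
    statistic = sum(
        (w / total) * math.tan((0.5 - p) * math.pi)
        for w, p in zip(weights, p_values)
    )
    # Upper-tail probability of a standard Cauchy at the statistic.
    combined_p = 0.5 - math.atan(statistic) / math.pi
    return statistic, combined_p


if __name__ == "__main__":
    stat, p = cauchy_combination([0.001, 0.2, 0.5, 0.8])
    print(f"statistic = {stat:.4f}, combined p-value = {p:.6f}")
```

Under these assumptions the whole calculation is one pass over the p-values plus a single arctangent, with no resampling or correlation estimate, which is the sense in which the abstract compares its cost to a z-test or t-test. Note that p-values extremely close to 0 or 1 produce very large transformed values and may need special handling in floating point.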

Updated: 2019-04-25