Estimation of the false discovery proportion with unknown dependence.
The Journal of the Royal Statistical Society, Series B (Statistical Methodology) (IF 5.8), Pub Date: 2017-10-24, DOI: 10.1111/rssb.12204
Jianqing Fan, Xu Han

Large-scale multiple testing with correlated test statistics arises frequently in many areas of scientific research. Incorporating correlation information when approximating the false discovery proportion (FDP) has attracted increasing attention in recent years. When the covariance matrix of the test statistics is known, Fan, Han & Gu (2012) provided an accurate approximation of the FDP under an arbitrary dependence structure and a sparsity assumption. In many applications, however, the covariance matrix is unknown, and the dependence information has to be estimated before the FDP can be approximated. The accuracy of that estimate can greatly affect the FDP approximation. In the current paper, we study theoretically the impact of unknown dependence on the testing procedure and establish a general framework under which the FDP can be well approximated. Unknown dependence affects the approximation of the FDP in two major ways: through the estimation of eigenvalues and eigenvectors, and through the estimation of marginal variances. To address the challenges in these two aspects, we first develop general requirements on estimates of eigenvalues and eigenvectors for a good approximation of the FDP. We then give conditions on the structures of covariance matrices that satisfy these requirements. Such dependence structures include banded or sparse covariance matrices and (conditional) sparse precision matrices. Within this framework, we also consider a special example to illustrate our method, in which data are sampled from an approximate factor model, which encompasses most practical situations. We provide a good approximation of the FDP by exploiting this specific dependence structure. The results are further generalized to the situation where the multivariate normality assumption is relaxed. Our results are demonstrated by simulation studies and some real data applications.
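
To make the quantity in the abstract concrete, the following is a minimal simulation sketch, not the authors' estimator. It assumes a toy one-factor model with equal loadings (a much simpler setting than the approximate factor model in the paper), and it computes the realized FDP using the true null labels, which are available only in simulation. The function names, parameter names, and default values are all hypothetical choices made for illustration.

```python
from statistics import NormalDist

import numpy as np


def realized_fdp(reject, is_null):
    """Realized FDP = V / max(R, 1), the share of rejections that are true nulls."""
    reject = np.asarray(reject, dtype=bool)
    is_null = np.asarray(is_null, dtype=bool)
    n_reject = int(reject.sum())               # R: total number of rejections
    n_false = int((reject & is_null).sum())    # V: rejections of true nulls
    return n_false / max(n_reject, 1)          # convention: FDP = 0 when R = 0


def simulate_one_factor(p=2000, n_signal=50, signal=3.0,
                        loading=0.5, t=0.01, seed=0):
    """Draw one set of correlated z-statistics and return the realized FDP.

    Assumed toy model (illustration only):
        z_i = mu_i + loading * W + sqrt(1 - loading**2) * eps_i,
    so each z_i has unit variance and pairwise correlation loading**2.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(p)
    mu[:n_signal] = signal                     # first n_signal nulls are false
    factor = rng.standard_normal()             # common factor W
    noise = rng.standard_normal(p)             # idiosyncratic errors eps_i
    z = mu + loading * factor + np.sqrt(1.0 - loading**2) * noise

    # Rejecting two-sided p-values <= t is the same as |z| >= this critical value.
    crit = NormalDist().inv_cdf(1.0 - t / 2.0)
    reject = np.abs(z) > crit
    is_null = mu == 0.0
    return realized_fdp(reject, is_null)


if __name__ == "__main__":
    for seed in range(5):
        print(seed, round(simulate_one_factor(seed=seed), 3))
```

Running the loop with different seeds gives a rough sense of how much the realized FDP can move from one draw to the next when the statistics share a common factor, which is the kind of dependence the paper's approximation is designed to account for.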

Updated: 2019-11-01