False discovery rate control for high dimensional networks of quantile associations conditioning on covariates.
The Journal of the Royal Statistical Society, Series B (Statistical Methodology) (IF 3.1), Pub Date: 2019-05-07, DOI: 10.1111/rssb.12288
Jichun Xie, Ruosha Li

Motivated by gene coexpression pattern analysis, we propose a novel sample quantile contingency (SQUAC) statistic to infer quantile associations conditioning on covariates. It offers enhanced flexibility in handling variables with arbitrary distributions and complex association patterns conditioning on covariates. We first derive its asymptotic null distribution and then develop a multiple-testing procedure based on the SQUAC statistic that simultaneously tests, for all p(p-1)/2 pairs of variables, the independence of each pair conditioning on covariates. Here, p is the length of the outcome vector and may exceed the sample size. The testing procedure does not require resampling or perturbation and is therefore computationally efficient. We show, through theory and numerical experiments, that this testing method asymptotically controls the false discovery rate. It outperforms all alternative methods when complex association patterns exist. Applied to a gastric cancer data set, the method successfully inferred the gene coexpression networks of early and late stage patients and identified more changes in the networks that are associated with cancer survival. We extend our method to the case where both the length of the outcomes and the length of the covariates exceed the sample size, and show that the asymptotic theory still holds.
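
The overall workflow described in the abstract (adjust each variable for covariates, test every pair for association at a quantile level, then control the false discovery rate) can be illustrated with a rough sketch. This is not the SQUAC statistic or the authors' testing procedure, whose definitions are not given here. As stand-ins, the sketch assumes a linear location-shift model fitted by least squares in place of a conditional quantile fit, a simple z-score on centered quantile indicators, a standard normal null, and the Benjamini-Hochberg procedure. All function names and the parameters `tau` and `alpha` are hypothetical.

```python
import numpy as np
from math import erfc, sqrt


def quantile_indicators(Y, X, tau):
    """Indicator that each outcome lies at or below its covariate-adjusted tau-quantile.

    Assumption: a least-squares fit plus the tau-quantile of the residuals
    stands in for a proper conditional quantile model.
    """
    n = Y.shape[0]
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    resid = Y - design @ coef
    cutoffs = np.quantile(resid, tau, axis=0)  # one cutoff per outcome column
    return (resid <= cutoffs).astype(float)


def pairwise_z(A, tau):
    """z-scores for all pairs; under independence each centered product
    (A_j - tau)(A_k - tau) has mean 0 and variance (tau * (1 - tau))**2."""
    n = A.shape[0]
    centered = A - tau
    return (centered.T @ centered) / (sqrt(n) * tau * (1 - tau))


def benjamini_hochberg(pvals, alpha):
    """Boolean rejection mask from the Benjamini-Hochberg step-up rule."""
    m = pvals.size
    order = np.argsort(pvals)
    passed = pvals[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest sorted index passing the rule
        reject[order[:k + 1]] = True
    return reject


def infer_edges(Y, X, tau=0.5, alpha=0.05):
    """Return the (j, k) pairs, j < k, declared associated at quantile level tau."""
    A = quantile_indicators(Y, X, tau)
    Z = pairwise_z(A, tau)
    rows, cols = np.triu_indices(Y.shape[1], k=1)  # all p(p-1)/2 pairs
    # Two-sided p-values from a standard normal reference distribution.
    pvals = np.array([erfc(abs(z) / sqrt(2)) for z in Z[rows, cols]])
    reject = benjamini_hochberg(pvals, alpha)
    return list(zip(rows[reject].tolist(), cols[reject].tolist()))
```

Here `Y` is an n-by-p array of outcomes and `X` an n-by-q array of covariates, and `infer_edges(Y, X)` returns the selected pairs as network edges. Like the procedure in the abstract, the sketch needs no resampling, but it makes no claim to the paper's theoretical guarantees and does not cover the case where the number of covariates exceeds the sample size.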

Updated: 2019-11-01