EigenPrism: inference for high dimensional signal-to-noise ratios.
The Journal of the Royal Statistical Society, Series B (Statistical Methodology) (IF 5.8), Pub Date: 2017-11-07, DOI: 10.1111/rssb.12203
Lucas Janson, Rina Foygel Barber, Emmanuel Candès

Consider the following three important problems in statistical inference, namely, constructing confidence intervals for (1) the error of a high-dimensional (p > n) regression estimator, (2) the linear regression noise level, and (3) the genetic signal-to-noise ratio of a continuous-valued trait (related to the heritability). All three problems turn out to be closely related to the little-studied problem of performing inference on the ℓ2-norm of the signal in high-dimensional linear regression. We derive a novel procedure for this, which is asymptotically correct when the covariates are multivariate Gaussian and produces valid confidence intervals in finite samples as well. The procedure, called EigenPrism, is computationally fast and makes no assumptions on coefficient sparsity or knowledge of the noise level. We investigate the width of the EigenPrism confidence intervals, including a comparison with a Bayesian setting in which our interval is just 5% wider than the Bayes credible interval. We are then able to unify the three aforementioned problems by showing that the EigenPrism procedure with only minor modifications is able to make important contributions to all three. We also investigate the robustness of coverage and find that the method applies in practice and in finite samples much more widely than just the case of multivariate Gaussian covariates. Finally, we apply EigenPrism to a genetic dataset to estimate the genetic signal-to-noise ratio for a number of continuous phenotypes.
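
To make the central quantity concrete, here is a minimal Python sketch of how the squared norm of the signal can be estimated without sparsity assumptions or a known noise level. It is not the authors' procedure: the published method selects its weights and builds confidence intervals in ways this abstract does not describe. The sketch assumes covariates drawn i.i.d. from N(0, 1) (identity covariance) with p > n, and relies only on the moment identity E[z_i^2 | d] = (d_i^2 / p) * ||beta||^2 + sigma^2, where d_i are the singular values of X and z = U^T y. The function names `unbiased_weights` and `signal_norm_sq_estimate` are hypothetical, and the weights are one simple choice that satisfies the two unbiasedness constraints.

```python
import numpy as np


def unbiased_weights(lam):
    """Return weights w with sum(w) == 0 and sum(w * lam) == 1.

    Illustrative choice only (a centered, rescaled copy of lam);
    it is not tuned for variance.
    """
    centered = lam - lam.mean()
    return centered / np.sum(centered ** 2)


def signal_norm_sq_estimate(X, y):
    """Point estimate of ||beta||^2, assuming rows of X are i.i.d. N(0, I_p)."""
    n, p = X.shape
    # Thin SVD: U is n x n and d has length n when p > n.
    U, d, _ = np.linalg.svd(X, full_matrices=False)
    lam = d ** 2 / p            # E[z_i^2 | d] = lam_i * ||beta||^2 + sigma^2
    z = U.T @ y
    w = unbiased_weights(lam)
    # sum(w) == 0 cancels sigma^2; sum(w * lam) == 1 keeps ||beta||^2.
    return float(np.sum(w * z ** 2))


# Small simulated example (all values here are made up for illustration).
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p) / np.sqrt(p)   # dense signal, no sparsity
y = X @ beta + rng.standard_normal(n)        # noise level sigma = 1
estimate = signal_norm_sq_estimate(X, y)
```

A single run gives a noisy estimate, and it can even be negative. The sketch is only meant to show why a weighted sum of squared projections can target the signal norm while cancelling the unknown noise level.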

Updated: 2019-11-01