Covariate Information Number for Feature Screening in Ultrahigh-Dimensional Supervised Problems
Journal of the American Statistical Association (IF 3.7), Pub Date: 2021-02-10, DOI: 10.1080/01621459.2020.1864380
Debmalya Nandy, Francesca Chiaromonte, Runze Li

Abstract

Contemporary high-throughput experimental and surveying techniques give rise to ultrahigh-dimensional supervised problems with sparse signals; that is, a limited number of observations (n), each with a very large number of covariates (p_n), only a small share of which is truly associated with the response. In these settings, major concerns on computational burden, algorithmic stability, and statistical accuracy call for substantially reducing the feature space by eliminating redundant covariates before the use of any sophisticated statistical analysis. Along the lines of Pearson's correlation coefficient-based sure independence screening and other model- and correlation-based feature screening methods, we propose a model-free procedure called covariate information number-sure independence screening (CIS). CIS uses a marginal utility connected to the notion of the traditional Fisher information, possesses the sure screening property, and is applicable to any type of response (features) with continuous features (response). Simulations and an application to transcriptomic data on rats reveal the comparative strengths of CIS over some popular feature screening methods. Supplementary materials for this article are available online.
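
For orientation, the Pearson correlation-based sure independence screening that the abstract cites as a starting point can be sketched as follows. This is not the CIS procedure itself: the abstract does not define the CIS marginal utility, so the sketch ranks covariates by absolute marginal correlation instead. The function name `sis_screen`, the simulated data, and the retained-set size d = floor(n / log n) (a conventional choice in the screening literature) are all assumptions made for illustration.

```python
import numpy as np


def sis_screen(X, y, d):
    """Return indices of the d covariates with the largest |Pearson correlation| with y.

    Hypothetical helper for illustration; assumes no column of X is constant.
    """
    Xc = X - X.mean(axis=0)          # center each covariate
    yc = y - y.mean()                # center the response
    # Column-wise Pearson correlation between each covariate and y.
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    corr = (Xc.T @ yc) / denom
    # Sort by decreasing marginal utility |corr| and keep the top d.
    order = np.argsort(-np.abs(corr))
    return order[:d]


# Simulated sparse-signal setting: n observations, p >> n covariates,
# only three of which are truly associated with the response.
rng = np.random.default_rng(0)
n, p = 100, 2000
X = rng.standard_normal((n, p))
active = [0, 1, 2]
y = X[:, active] @ np.array([3.0, -3.0, 3.0]) + rng.standard_normal(n)

d = int(n / np.log(n))               # assumed screening size
selected = sis_screen(X, y, d)
```

The screened index set `selected` would then be passed to a more sophisticated method (for example, a penalized regression) fitted on the reduced feature space. A model-free procedure such as CIS would swap the correlation in `sis_screen` for its own marginal utility while keeping the same rank-and-threshold structure.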




Updated: 2021-02-10