Partition-based ultrahigh-dimensional variable screening, Biometrika


Biometrika (IF 2.4), Pub Date: 2017-10-09, DOI: 10.1093/biomet/asx052
Jian Kang₁, Hyokyoung G Hong₂, Yi Li₁


Traditional variable selection methods are compromised by overlooking useful information on covariates with similar functionality or spatial proximity, and by treating each covariate independently. Leveraging prior grouping information on covariates, we propose partition-based screening methods for ultrahigh-dimensional variables in the framework of generalized linear models. We show that partition-based screening exhibits the sure screening property with a vanishing false selection rate. For settings where prior knowledge on covariate grouping is unavailable or unreliable, we propose a data-driven partition screening framework and investigate its theoretical properties. We consider two special cases: correlation-guided partitioning and spatial location-guided partitioning. In the absence of a single partition, we propose a theoretically justified strategy for combining statistics from various partitioning methods. The utility of the proposed methods is demonstrated via simulation and analysis of functional neuroimaging data.
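To make the idea of screening within a partition concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian linear model (the identity-link special case of a generalized linear model), a partition supplied by the user, and a ranking rule that keeps the covariates with the largest jointly fitted within-group coefficients. The names `partition_screen`, `demo`, `groups`, and `top_k`, as well as the simulated data, are hypothetical and chosen only for illustration.

```python
import numpy as np


def partition_screen(X, y, groups, top_k):
    """Rank covariates by |coefficient| from a joint fit within each group.

    Assumption: Gaussian linear model, so each within-group fit is ordinary
    least squares. Columns are standardized so coefficients are comparable.
    """
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    score = np.zeros(p)
    for idx in groups:
        idx = np.asarray(idx)
        # Joint least-squares fit of the response on this group only.
        coef, *_ = np.linalg.lstsq(Xs[:, idx], yc, rcond=None)
        score[idx] = np.abs(coef)
    # Keep the top_k covariates with the largest screening statistic.
    selected = np.argsort(score)[::-1][:top_k]
    return np.sort(selected), score


def demo():
    """Simulate 1000 covariates, 3 of them active, in groups of 10."""
    rng = np.random.default_rng(0)
    n, p, group_size = 200, 1000, 10
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[[0, 1, 2]] = [3.0, -3.0, 3.0]
    y = X @ beta + rng.standard_normal(n)
    # Hypothetical prior partition: consecutive blocks of covariates.
    groups = [range(g, g + group_size) for g in range(0, p, group_size)]
    selected, _ = partition_screen(X, y, groups, top_k=20)
    return selected


if __name__ == "__main__":
    print(demo())
```

Fitting each group jointly, rather than one covariate at a time, is what lets the screening statistic use information shared by covariates in the same group. The group size and the cutoff `top_k` are arbitrary choices in this sketch; the abstract does not specify how they are set.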


Updated: 2017-10-09
