Partition-based feature screening for categorical data via RKHS embeddings
Computational Statistics & Data Analysis (IF 1.5), Pub Date: 2021-01-14, DOI: 10.1016/j.csda.2021.107176
Jun Lu, Lu Lin, WenWu Wang

This paper proposes a new screening procedure for ultrahigh-dimensional data with a categorical response. By exploiting the group structure among predictors, a partition-based screening approach is developed via reproducing kernel Hilbert space (RKHS) embeddings in the maximum mean discrepancy framework. Consequently, the new method can identify influential groups of predictors that marginal screening methods may overlook. Moreover, because it uses RKHS embeddings, the new ranking index has a very simple form and can be evaluated easily. The method is also model-free: it does not require specifying any relationship between the predictors and the response. The sure screening property of the proposed method is proved, and its effectiveness is illustrated via numerical studies and a real data analysis.
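
The abstract does not state the paper's ranking index, so the following is only a rough sketch of the general idea, not the authors' method. It scores each predictor group by a class-weighted sum of pairwise squared maximum mean discrepancies (biased V-statistic estimate, Gaussian kernel) between the class-conditional samples of that group. The function names, the weighting by class proportions, the kernel choice, and the bandwidth are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel between two equal-length vectors (assumed kernel choice)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


def group_scores(rows, labels, groups, bandwidth=1.0):
    """Hypothetical group-level index: for each group of column indices,
    sum p_r * p_s * MMD^2 over all pairs of response classes (r, s)."""
    n = len(rows)
    classes = sorted(set(labels))
    scores = []
    for cols in groups:
        # Split the group's sub-vectors by response class.
        by_class = {
            c: [[row[j] for j in cols]
                for row, lab in zip(rows, labels) if lab == c]
            for c in classes
        }
        score = 0.0
        for r, s in combinations(classes, 2):
            weight = (len(by_class[r]) / n) * (len(by_class[s]) / n)
            score += weight * mmd2(by_class[r], by_class[s], bandwidth)
        scores.append(score)
    return scores
```

Groups would then be ranked by score and the top-ranked ones retained. A response that depends only on the sign of the product of two predictors illustrates why group-level scoring can matter: each predictor alone has the same distribution in both classes, while the pair jointly does not.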




Updated: 2021-01-25