On statistical classification with incomplete covariates via filtering, Journal of Statistical Computation and Simulation


On statistical classification with incomplete covariates via filtering
Journal of Statistical Computation and Simulation (IF 1.1), Pub Date: 2020-12-08
Majid Mojirsheibani, My-Nhi Nguyen

This article deals with the problem of classification when some of the covariates may have missing parts. Both the training sample and the new unclassified observation are allowed to have missing parts in their covariates. In fact, Remark 3.3 shows that, in classification, reconstructing or imputing the missing part of a new observation that is to be classified can be counter-productive in terms of the error rates. Furthermore, unlike many results in the literature, where covariate fragments are usually assumed to be missing completely at random, we do not impose such assumptions here. Given the observed parts of the covariates, we construct a kernel-type classifier that is straightforward to implement. The proposed classifier is based on $d$-dimensional covariate vectors obtained from the original covariates (by moving from the space $L^{2}$ to $\ell_{2}$), where $d$ $(< \infty)$ is itself a parameter that has to be estimated. To estimate the various parameters, we employ an easy-to-implement data-splitting approach.
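The ingredients named in the abstract (a finite-dimensional representation of each covariate, a kernel-type vote, and data splitting to pick the tuning parameters) can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes fully observed curves on a regular grid, so the paper's handling of missing fragments is not reproduced. The cosine basis, the Gaussian kernel, the synthetic data, and all function names (`cosine_basis`, `coefficients`, `kernel_classify`, `select_d_and_h`) are assumptions made for illustration.

```python
import numpy as np

def cosine_basis(grid, d):
    """First d functions of the orthonormal cosine basis on [0, 1] (assumed basis)."""
    cols = [np.ones_like(grid)]
    for j in range(1, d):
        cols.append(np.sqrt(2.0) * np.cos(j * np.pi * grid))
    return np.stack(cols, axis=1)  # shape (len(grid), d)

def coefficients(curves, grid, d):
    """Map each curve (an L^2 element) to its first d basis coefficients,
    approximating the inner products by a Riemann sum."""
    dt = grid[1] - grid[0]
    return curves @ cosine_basis(grid, d) * dt  # shape (n, d)

def kernel_classify(train_x, train_y, new_x, h):
    """Kernel-weighted vote with a Gaussian kernel; labels are 0 or 1."""
    dist2 = ((new_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-dist2 / (2.0 * h ** 2))
    score1 = w @ train_y          # total weight of class-1 neighbours
    score0 = w @ (1 - train_y)    # total weight of class-0 neighbours
    return (score1 > score0).astype(int)

def select_d_and_h(curves, labels, grid, d_values, h_values, n_fit):
    """Data splitting: build the classifier on the first n_fit curves and
    keep the (d, h) pair with the smallest error on the remaining curves."""
    best = None
    for d in d_values:
        z = coefficients(curves, grid, d)
        for h in h_values:
            pred = kernel_classify(z[:n_fit], labels[:n_fit], z[n_fit:], h)
            err = float(np.mean(pred != labels[n_fit:]))
            if best is None or err < best[0]:
                best = (err, d, h)
    return best

# Synthetic example: the two classes differ only in the second coefficient.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 101)
n = 200
labels = rng.integers(0, 2, size=n)
true_coefs = rng.normal(size=(n, 5))
true_coefs[:, 1] += 2.0 * labels
curves = true_coefs @ cosine_basis(grid, 5).T

err, d_hat, h_hat = select_d_and_h(
    curves, labels, grid,
    d_values=(1, 2, 3, 4, 5), h_values=(0.1, 0.3, 1.0), n_fit=100,
)
print(f"selected d={d_hat}, h={h_hat}, hold-out error={err:.2f}")
```

In this toy setting the split simply scores every candidate pair on held-out curves; how the paper adapts the coefficients and the vote to partially observed covariates is described in the article itself.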


Updated: 2020-12-09
