Composite quantile‐based classifiers, Statistical Analysis and Data Mining


Composite quantile‐based classifiers
Statistical Analysis and Data Mining (IF 2.1), Pub Date: 2020-05-05, DOI: 10.1002/sam.11460
David A. Pritchard (1), Yufeng Liu (1, 2, 3)


Accurate classification of high‐dimensional data is important in many scientific applications. We propose a family of high‐dimensional classification methods based upon a comparison of the component‐wise distances of the feature vector of a sample to the within‐class population quantiles. These methods are motivated by the fact that quantile classifiers based on these component‐wise distances are the most powerful univariate classifiers for an optimal choice of the quantile level. A simple aggregation approach for constructing a multivariate classifier based upon these component‐wise distances to the within‐class quantiles is proposed. It is shown that this classifier is consistent with the asymptotically optimal classifier as the sample size increases. Our proposed classifiers result in simple piecewise‐linear decision rule boundaries that can be efficiently trained. Numerical results are shown to demonstrate competitive performance for the proposed classifiers on both simulated data and a benchmark email spam application.
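The component-wise comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not give the distance function or the aggregation rule, so two assumptions are made here: the distance to a quantile is the standard check (pinball) loss used in quantile regression, and aggregation is a plain unweighted sum over features at a single quantile level. The names `check_loss`, `fit_quantiles`, `predict`, and the parameter `theta` are hypothetical and do not come from the paper; the paper's own composite construction may well differ.

```python
import numpy as np


def check_loss(u, theta):
    """Check (pinball) loss rho_theta(u) = u * (theta - 1[u < 0]).

    ASSUMPTION: used here as the distance to a quantile; the abstract
    does not specify which distance the paper uses.
    """
    u = np.asarray(u, dtype=float)
    return u * (theta - (u < 0).astype(float))


def fit_quantiles(X, y, theta):
    """Empirical theta-quantile of every feature within every class.

    Returns the sorted class labels and an array q of shape
    (n_classes, n_features).
    """
    classes = np.unique(y)
    q = np.stack([np.quantile(X[y == c], theta, axis=0) for c in classes])
    return classes, q


def predict(X, classes, q, theta):
    """Assign each row of X to the class with the smallest summed distance.

    ASSUMPTION: unweighted sum over features, a stand-in for the
    aggregation step mentioned in the abstract.
    """
    # diffs[i, k, j] = X[i, j] - q[k, j]
    diffs = X[:, None, :] - q[None, :, :]
    dist = check_loss(diffs, theta).sum(axis=2)  # shape (n_samples, n_classes)
    return classes[dist.argmin(axis=1)]


# Toy usage: two well-separated classes in two dimensions.
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0],
                    [10.0, 10.0], [11.0, 11.0], [12.0, 12.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])
theta = 0.5
classes, q = fit_quantiles(X_train, y_train, theta)
labels = predict(np.array([[0.5, 0.5], [11.5, 10.5]]), classes, q, theta)
```

Because the check loss is piecewise linear in its argument, a rule of this form compares sums of piecewise-linear functions of the features, which fits the abstract's remark about piecewise-linear decision boundaries. In this sketch `theta` is fixed by hand; the abstract indicates that the choice of quantile level matters, so in practice it would need to be selected from data.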


Updated: 2020-05-05
