Weighted linear programming discriminant analysis for high‐dimensional binary classification, Statistical Analysis and Data Mining


Weighted linear programming discriminant analysis for high‐dimensional binary classification
Statistical Analysis and Data Mining (IF 1.3) Pub Date: 2020-07-04, DOI: 10.1002/sam.11473
Yufei Wu, Guan Yu

Linear discriminant analysis (LDA) is widely used for binary classification. Unlike LDA, which estimates the precision matrix Ω and the mean difference vector δ in the classification rule separately, the linear programming discriminant (LPD) rule estimates the product Ωδ directly through a constrained ℓ₁ minimization. The LPD rule performs very well on many high‐dimensional binary classification problems. However, to estimate β^* = Ωδ, the LPD rule assigns equal weights to all elements of β^* in the constrained ℓ₁ minimization. This may not yield the optimal estimate of β^*, so the estimated discriminant direction can be suboptimal. To obtain better estimates of β^* and the discriminant direction, we can penalize β_j heavily in the constrained ℓ₁ minimization if we suspect that the jth feature is useless for classification, and penalize it only moderately if we suspect that the jth feature is useful. In this paper, building on the LPD rule and some popular feature screening methods, we propose a new weighted linear programming discriminant (WLPD) rule for high‐dimensional binary classification. The screening statistics from marginal two‐sample t‐test screening, the Kolmogorov–Smirnov filter, and maximum marginal likelihood screening are used to construct appropriate weights for the different elements of β^* flexibly. In addition to the linear programming algorithm, we develop a new alternating direction method of multipliers algorithm to solve the high‐dimensional constrained ℓ₁ minimization problem efficiently. Our numerical studies show that the proposed WLPD rule can outperform LPD and serve as an effective binary classification tool.
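The abstract does not spell out the optimization problem, so the following is only a rough sketch of how a weighted constrained ℓ₁ minimization of this kind could be posed as a linear program. It assumes the standard LPD formulation (minimize ‖β‖₁ subject to ‖Σ̂β − δ̂‖∞ ≤ λ, with Σ̂ the pooled sample covariance and δ̂ the sample mean difference) and simply places weights w_j on |β_j| in the objective. The function names, the weight formula 1/(1 + |t_j|) based on two‐sample t statistics, and the value of λ are illustrative assumptions, not the authors' construction.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.stats import ttest_ind


def t_screening_weights(X0, X1):
    """Hypothetical weights: a large |t| (likely useful feature) gives a small penalty."""
    t = np.abs(ttest_ind(X0, X1, axis=0).statistic)
    return 1.0 / (1.0 + t)


def weighted_lpd(X0, X1, lam, weights=None):
    """Solve min sum_j w_j |beta_j|  s.t.  ||S beta - delta||_inf <= lam  as an LP."""
    n0, p = X0.shape
    n1 = X1.shape[0]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    delta = mu0 - mu1
    # Pooled sample covariance (divisor n0 + n1).
    R0, R1 = X0 - mu0, X1 - mu1
    S = (R0.T @ R0 + R1.T @ R1) / (n0 + n1)
    w = np.ones(p) if weights is None else np.asarray(weights, dtype=float)
    # Split beta = u - v with u, v >= 0; with positive weights,
    # u_j + v_j equals |beta_j| at the optimum.
    c = np.concatenate([w, w])
    A_ub = np.vstack([np.hstack([S, -S]), np.hstack([-S, S])])
    b_ub = np.concatenate([lam + delta, lam - delta])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    beta = res.x[:p] - res.x[p:]
    return beta, (mu0 + mu1) / 2.0


def classify(x, beta, midpoint):
    """Return 0 if (x - midpoint)^T beta >= 0, else 1 (equal class priors assumed)."""
    return 0 if (np.asarray(x) - midpoint) @ beta >= 0 else 1


# Small synthetic demo: only the first two of ten features carry signal.
rng = np.random.default_rng(0)
shift = np.zeros(10)
shift[:2] = 2.0
X0 = rng.normal(size=(80, 10)) + shift
X1 = rng.normal(size=(80, 10))
beta, midpoint = weighted_lpd(X0, X1, lam=0.3, weights=t_screening_weights(X0, X1))
print(np.round(beta, 3))
```

Passing `weights=None` recovers the equally weighted problem, which makes it easy to compare the two estimates on the same data. In practice λ would need tuning, for example by cross-validation, and for p much larger than n the constraint set can be empty when λ is small, in which case the sketch raises an error instead of returning an estimate.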


Updated: 2020-07-04
