Trace Ratio Optimization for High-Dimensional Multi-Class Discrimination
Journal of Computational and Graphical Statistics (IF 2.4), Pub Date: 2020-08-14
Jeongyoun Ahn, Hee Cheol Chung, Yongho Jeon

In multi-class discrimination with high-dimensional data, identifying a lower-dimensional subspace with maximum class separation is crucial. We propose a new optimization criterion for finding such a discriminant subspace, which is the ratio of two traces: the trace of the between-class scatter matrix and the trace of the within-class scatter matrix. Since this problem is not well-defined for high-dimensional data, we propose to regularize the within-class trace and maximize the between-class trace. A careful investigation reveals that this optimization has an innate connection to the eigenvalue decomposition of an indefinite matrix. For better interpretability of the classifier, we also consider sparse estimation via group-wise soft-thresholding. Interesting relationships between the proposed method and some classical methods, such as Fisher's linear discriminant analysis and its variants, are discussed. Empirical examples with simulated and real data sets suggest that the proposed method works well and is often better than some existing approaches in a wide range of problems, with respect to both variable selectivity and classification accuracy.
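The abstract does not give the exact formulation, so the following is only a minimal sketch of one plausible reading of it, not the authors' algorithm. It assumes the regularized problem can be written as maximizing tr(V'(S_b - lam * S_w)V) over orthonormal V, which is solved by the leading eigenvectors of the symmetric, generally indefinite matrix S_b - lam * S_w. It also assumes that "group-wise soft-thresholding" shrinks each row of V (one row per variable) by its Euclidean norm. The function names, the penalty weight `lam`, the threshold `t`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def scatter_matrices(X, y):
    """Return between-class and within-class scatter matrices for X (n x p)."""
    overall_mean = X.mean(axis=0)
    p = X.shape[1]
    S_b = np.zeros((p, p))
    S_w = np.zeros((p, p))
    for label in np.unique(y):
        X_k = X[y == label]
        mean_k = X_k.mean(axis=0)
        diff = (mean_k - overall_mean)[:, None]
        S_b += X_k.shape[0] * (diff @ diff.T)  # weighted by class size
        centered = X_k - mean_k
        S_w += centered.T @ centered
    return S_b, S_w


def penalized_trace_basis(S_b, S_w, lam, q):
    """Top-q eigenvectors of S_b - lam * S_w (assumed form of the criterion)."""
    eigvals, eigvecs = np.linalg.eigh(S_b - lam * S_w)  # ascending order
    order = np.argsort(eigvals)[::-1][:q]
    return eigvecs[:, order], eigvals[order]


def group_soft_threshold(V, t):
    """Shrink each row of V by t in Euclidean norm; rows with norm <= t become zero."""
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return scale * V


# Simulated high-dimensional example: 3 classes, 10 samples each, 100 variables,
# with class differences confined to the first two variables.
rng = np.random.default_rng(0)
n_per_class, p, q = 10, 100, 2
y = np.repeat(np.arange(3), n_per_class)
X = rng.normal(size=(y.size, p))
X[y == 1, 0] += 3.0
X[y == 2, 1] += 3.0

S_b, S_w = scatter_matrices(X, y)
V, top_eigvals = penalized_trace_basis(S_b, S_w, lam=1.0, q=q)

# Illustrative threshold: the median row norm, which zeroes about half the rows.
t = np.median(np.linalg.norm(V, axis=1))
V_sparse = group_soft_threshold(V, t)
n_selected = int(np.count_nonzero(np.linalg.norm(V_sparse, axis=1)))
print("variables kept after thresholding:", n_selected)
```

Thresholding whole rows rather than individual entries means a variable is either kept in every discriminant direction or dropped from all of them, which is what makes the result readable as variable selection. After thresholding, the columns of `V_sparse` are no longer orthonormal; whether and how the paper re-orthogonalizes them is not stated in the abstract.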




Updated: 2020-08-14