PCA-KL: a parametric dimensionality reduction approach for unsupervised metric learning
Advances in Data Analysis and Classification ( IF 1.6 ) Pub Date : 2021-01-07 , DOI: 10.1007/s11634-020-00434-3
Alexandre L. M. Levada

Dimensionality reduction algorithms are powerful mathematical tools for data analysis and visualization. In many pattern recognition applications, a feature extraction step is required to mitigate the curse of dimensionality, a collection of negative effects caused by an arbitrary increase in the number of features in classification tasks. Principal Component Analysis (PCA) is a classical statistical method that creates new features as linear combinations of the original ones through the eigenvectors of the covariance matrix. In this paper, we propose PCA-KL, a parametric dimensionality reduction algorithm for unsupervised metric learning based on the computation of the entropic covariance matrix: a surrogate for the covariance matrix of the data, obtained in terms of the relative entropy between local Gaussian distributions instead of the usual Euclidean distance between the data points. Numerical experiments with several real datasets show that the proposed method produces better-defined clusters and higher classification accuracy than regular PCA and several manifold learning algorithms, making PCA-KL a promising alternative for unsupervised metric learning.
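The core idea described above (replace pointwise Euclidean distances with relative entropy between locally fitted Gaussians, then eigendecompose a covariance-like surrogate) can be illustrated with a minimal NumPy sketch. This is not the authors' exact construction of the entropic covariance matrix; it assumes diagonal Gaussians fitted to k-nearest-neighbour patches, a symmetrized KL divergence in closed form, and classical double centering to turn the divergence matrix into a Gram-like matrix whose top eigenvectors give the embedding. All function names (`local_gaussians`, `sym_kl`, `pca_kl`) are illustrative, not from the paper.

```python
import numpy as np

def local_gaussians(X, k=10):
    """Fit a diagonal Gaussian to the k-nearest neighbours of each point."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    idx = np.argsort(d2, axis=1)[:, :k]                  # k-NN (includes the point itself)
    mus = np.array([X[i].mean(axis=0) for i in idx])
    vs = np.array([X[i].var(axis=0) + 1e-6 for i in idx])  # regularized variances
    return mus, vs

def sym_kl(mu1, v1, mu2, v2):
    """Symmetrized KL divergence between two diagonal Gaussians (closed form)."""
    kl12 = 0.5 * np.sum(np.log(v2 / v1) + (v1 + (mu1 - mu2) ** 2) / v2 - 1)
    kl21 = 0.5 * np.sum(np.log(v1 / v2) + (v2 + (mu2 - mu1) ** 2) / v1 - 1)
    return kl12 + kl21

def pca_kl(X, k=10, n_components=2):
    """Embed X via the top eigenvectors of a KL-based surrogate matrix."""
    n = X.shape[0]
    mus, vs = local_gaussians(X, k)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = sym_kl(mus[i], vs[i], mus[j], vs[j])
    # Double centering (as in classical MDS) converts divergences
    # into a Gram-like matrix playing the role of the covariance surrogate.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D @ J
    w, V = np.linalg.eigh(B)                 # eigenvalues in ascending order
    order = np.argsort(w)[::-1][:n_components]
    return V[:, order] * np.sqrt(np.maximum(w[order], 0))

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(4, 1, (30, 5))])
Y = pca_kl(X, k=8, n_components=2)
print(Y.shape)  # (60, 2)
```

Because the local Gaussians encode neighbourhood shape, points whose surroundings differ in spread are pushed apart even when the raw Euclidean distance between them is small, which is the intuition behind using relative entropy as the dissimilarity.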


Updated: 2021-01-08