A study of feature representation via neural network feature extraction and weighted distance for clustering
Journal of Combinatorial Optimization (IF 1), Pub Date: 2022-02-21, DOI: 10.1007/s10878-022-00849-y
Lily Schleider 1 , Zhecheng Qiang 1 , Qipeng P. Zheng 1 , Eduardo L. Pasiliao 2

Neural networks are well known for their ability to classify and cluster data sets by passing the information contained in raw data through multiple network layers that transform it. The feature layer projects the raw data into a space spanned by hidden features. To understand data representations in both the original (i.e., image) space and the feature space, this research analyzes clustering performance under different feature representations. Distance measures naturally have a great impact on clustering performance, so different distances and their combinations are tested on both the original and feature spaces. The combined distances were obtained using optimal weights that minimize classification errors under different measures, computed via a series of optimization models; each weight was multiplied by its respective distance to create the combined distance. Clustering was evaluated using silhouette scores. In general, the feature space yields better clustering performance than the image space, with cosine similarity being the best distance for both the image space and the feature space.
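To make the idea of a weighted combined distance and its silhouette-based evaluation concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the choice of component distances (Euclidean, Manhattan, cosine), the aggregation as a weighted sum, the weight values, and the toy points are all assumptions made for illustration. The paper obtains its weights from optimization models, which are not reproduced here.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    # 1 - cosine similarity; assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def combined_distance(a, b, weights):
    # Assumption: each weight multiplies its distance and the products are summed.
    parts = (euclidean(a, b), manhattan(a, b), cosine_distance(a, b))
    return sum(w * d for w, d in zip(weights, parts))

def silhouette_score(points, labels, dist):
    # Mean silhouette coefficient over all points for an arbitrary distance function.
    scores = []
    for i, p in enumerate(points):
        # Collect distances from point i to every other point, grouped by cluster.
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            # Singleton cluster, or only one cluster overall: score 0 by convention.
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                 # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in by_cluster.values())   # nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: two well-separated groups in a 2-D "feature space".
points = [(1.0, 0.1), (1.1, 0.0), (0.9, 0.2), (0.1, 1.0), (0.0, 1.2), (0.2, 0.9)]
labels = [0, 0, 0, 1, 1, 1]
weights = (0.2, 0.3, 0.5)  # hypothetical weights, not optimized values from the paper

def dist(a, b):
    return combined_distance(a, b, weights)

score = silhouette_score(points, labels, dist)
print(f"silhouette score with combined distance: {score:.3f}")
```

A silhouette score close to 1 indicates compact, well-separated clusters, while values near 0 or below indicate overlapping or misassigned points. Swapping in a different `weights` tuple, or a single distance function, gives a like-for-like comparison of distance choices on the same labeling.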




Updated: 2022-02-21