A novel clustering ensemble model based on granular computing, Applied Intelligence



A novel clustering ensemble model based on granular computing
Applied Intelligence (IF 3.4), Pub Date: 2021-01-09, DOI: 10.1007/s10489-020-01979-8
Li Xu, Shifei Ding

Clustering ensemble is a popular data mining method for discovering hidden patterns in unlabeled datasets. Research has shown that selecting base clustering results that are both diverse and of high quality for the fusion process can improve the quality of the final result. However, the inherent uncertainty, ambiguity, and overlap of the base clustering results make it difficult to select base clustering members, and the accuracy of the final result is easily disturbed by low-quality members. This paper proposes a novel clustering ensemble model from the perspective of granular computing. The similarity among ensemble members is measured by granularity distance, so that the quality of the base clustering results is ensured while the difference among them is enlarged, which helps improve the accuracy of the final result. According to the dividing ability of knowledge granularity, the method for generating the elements of the co-association matrix is optimized. The results obtained from the improved sample similarity measurement are more consistent with the structure of the real data. Experiments comparing the proposed model with traditional single clustering algorithms and several popular clustering ensemble methods show that it improves the quality of the final clustering result and has good expandability.
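
For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of the classical co-association matrix that ensemble methods of this kind start from, plus one common rough-set definition of knowledge granularity. It is not the authors' optimized element-generation method or their granularity distance, neither of which is specified in the abstract. The function names, the toy labelings, and the choice of the granularity formula (sum of squared block sizes divided by n squared) are assumptions made for illustration only.

```python
import numpy as np


def co_association(labelings):
    """Classical co-association matrix (baseline, not the paper's variant).

    Entry (i, j) is the fraction of base clusterings that place
    samples i and j in the same cluster.
    """
    labelings = np.asarray(labelings)  # shape: (n_clusterings, n_samples)
    m, n = labelings.shape
    ca = np.zeros((n, n))
    for labels in labelings:
        # Boolean n x n matrix: True where two samples share a label.
        ca += labels[:, None] == labels[None, :]
    return ca / m


def knowledge_granularity(labels):
    """Assumed definition: sum(|block|^2) / n^2 for a partition.

    Coarser partitions score closer to 1, finer ones closer to 1/n.
    The paper may define granularity differently.
    """
    labels = np.asarray(labels)
    _, counts = np.unique(labels, return_counts=True)
    return float(np.sum(counts ** 2)) / labels.size ** 2


# Hypothetical toy ensemble: three base clusterings of four samples.
base = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
ca = co_association(base)
granularities = [knowledge_granularity(b) for b in base]
```

In this sketch the label values themselves carry no meaning, so the first and third labelings contribute identical evidence even though their cluster IDs are swapped. A final partition is typically obtained by running a conventional clustering algorithm on `ca` treated as a similarity matrix.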


Updated: 2021-01-10
