Optimizing the prototypes with a novel data weighting algorithm for enhancing the classification performance of fuzzy clustering
Fuzzy Sets and Systems (IF 3.2), Pub Date: 2020-06-01, DOI: 10.1016/j.fss.2020.05.009
Kaijie Xu, Witold Pedrycz, Zhiwu Li, Weike Nie

Abstract: Fuzzy clustering is regarded as an unsupervised learning process that constitutes a prerequisite for many other data mining techniques. How to classify data efficiently and accurately has long been pursued by many researchers. We anticipate that the classification performance of clustering depends strongly on the boundary data (viz. data located at the boundaries of the clusters). The boundary data carry some level of uncertainty and as such contain more information than other data; usually, the greater the uncertainty, the more information such data contain. To improve the quality of clustering, this study develops an augmented scheme of fuzzy clustering in which a novel weighted data-based fuzzy clustering is proposed. In this scheme, a dataset is regarded as composed of boundary data and non-boundary data. The partition matrix is used to determine which data are boundary data and which are non-boundary data in the subsequent clustering process. We then assign a weight to each datum to construct the weighted data. The weights for the boundary data and the non-boundary data differ, so that the contribution of the boundary data to the prototypes is reduced and that of the non-boundary data is enhanced. A weighting function is built to determine the weights of the data. The weighted data are used to optimize the prototypes. With the optimized prototypes, the partition matrix can be refined, which ultimately optimizes the boundaries of the clusters and thereby enhances the classification performance of fuzzy clustering. We offer a thorough analysis of the developed scheme. Comprehensive experimental studies involving synthetic and publicly available datasets are reported to demonstrate the performance of the proposed approach.
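
The loop described in the abstract (partition matrix, then per-datum weights, then weighted prototypes, then a refined partition matrix) can be illustrated with a minimal sketch. The abstract does not give the paper's weighting function, so the one used here is purely an assumption: each datum is weighted by the margin between its two largest memberships, with a small floor, so that ambiguous boundary data contribute less to the prototypes. The function names, the `floor` parameter, the fuzzifier `m = 2`, and the toy data are likewise hypothetical and are not taken from the paper.

```python
import math

def memberships(X, V, m=2.0):
    """Standard fuzzy c-means membership update; U[i][k] is point k's degree in cluster i."""
    c, n = len(V), len(X)
    U = [[0.0] * n for _ in range(c)]
    for k, x in enumerate(X):
        d = [math.dist(x, v) for v in V]
        if min(d) == 0.0:
            # The point coincides with one or more prototypes: share full membership there.
            hits = [i for i in range(c) if d[i] == 0.0]
            for i in hits:
                U[i][k] = 1.0 / len(hits)
            continue
        for i in range(c):
            U[i][k] = 1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
    return U

def boundary_weights(U, floor=0.05):
    """ASSUMED weighting function (not the paper's): margin between the two
    largest memberships of each datum, clipped below at `floor`.
    Requires at least two clusters."""
    c, n = len(U), len(U[0])
    w = []
    for k in range(n):
        col = sorted((U[i][k] for i in range(c)), reverse=True)
        w.append(max(floor, col[0] - col[1]))
    return w

def weighted_prototypes(X, U, w, m=2.0):
    """Prototype update in which each datum's usual u^m coefficient is scaled by its weight."""
    dim = len(X[0])
    V = []
    for row in U:
        coef = [wk * (u ** m) for wk, u in zip(w, row)]
        total = sum(coef)
        V.append([sum(ck * x[j] for ck, x in zip(coef, X)) / total for j in range(dim)])
    return V

# Toy data (hypothetical): two tight groups plus one ambiguous point in between.
X = [[0.0, 0.0], [0.2, 0.1], [0.1, -0.1],
     [4.0, 4.0], [4.1, 3.9], [3.9, 4.2],
     [2.0, 2.0]]
V = [[0.5, 0.5], [3.5, 3.5]]  # initial prototypes

for _ in range(20):
    U = memberships(X, V)              # partition matrix from current prototypes
    w = boundary_weights(U)            # low weight for ambiguous (boundary) data
    V = weighted_prototypes(X, U, w)   # prototypes optimized with the weighted data

print("weights:", [round(v, 3) for v in w])
print("prototypes:", [[round(a, 3) for a in v] for v in V])
```

In this sketch the point at (2.0, 2.0) has nearly equal memberships in both clusters, so it receives the floor weight and barely moves either prototype. Any other weighting function that decreases with membership ambiguity could be substituted in `boundary_weights` without changing the rest of the loop.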

Updated: 2020-06-01