Conditional semi-fuzzy c-means clustering for imbalanced dataset
IET Image Processing (IF 2.0), Pub Date: 2020-09-07, DOI: 10.1049/iet-ipr.2019.0253
Yunlong Gao 1, Chengyu Yang 2, Kuo‐Yi Lin 2, Jinyan Pan 3, Li Li 2

Fuzzy c-means algorithms have been widely utilised in areas such as image segmentation, pattern recognition and data mining. However, related studies have shown that they are limited when facing imbalanced datasets. The maximum fuzzy boundary tends to be located on the largest cluster, which is undesirable. In addition, the overall fuzzy partition results in false grouping of edge objects and weakens the compactness of the clusters. It is therefore important that the clusters are delineated by the maximum fuzzy boundary. In this study, a semi-fuzzy c-means algorithm is proposed that combines hard partition and soft partition. It aims to provide an effective partition for the edge objects, so that the compactness of the clusters can be improved. The proposed algorithm integrates the semi-fuzzy c-means method with the size-insensitive integrity-based fuzzy c-means algorithm, the latter of which is able to deal with imbalanced data. Experiments on synthetic and widely known benchmark datasets show that the proposed algorithm is robust and outperforms the two component algorithms.
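
As background for the baseline the abstract refers to, the following is a minimal sketch of standard fuzzy c-means on 1-D data. It is not the algorithm proposed in the paper. The function names, the toy imbalanced dataset, and the `harden_confident` thresholding rule are illustrative assumptions; the abstract does not specify how the paper combines hard and soft partitions or how it handles cluster size.

```python
def fcm_memberships(points, centers, m=2.0):
    """Standard FCM update: u[i][k] = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1))."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with a centre: assign it fully to that centre.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        memberships.append(
            [1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists]
        )
    return memberships


def fcm_centers(points, memberships, n_clusters, m=2.0):
    """Each centre is the mean of the points weighted by u ** m."""
    centers = []
    for k in range(n_clusters):
        weights = [u[k] ** m for u in memberships]
        centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centers


def fuzzy_c_means(points, init_centers, m=2.0, n_iter=100, tol=1e-9):
    """Alternate the two updates until the centres stop moving."""
    centers = list(init_centers)
    for _ in range(n_iter):
        u = fcm_memberships(points, centers, m)
        new_centers = fcm_centers(points, u, len(centers), m)
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, fcm_memberships(points, centers, m)


def harden_confident(memberships, threshold=0.9):
    """Hypothetical semi-fuzzy step (an assumption, not the paper's rule):
    rows whose largest membership reaches the threshold become one-hot,
    the remaining rows stay fuzzy."""
    hardened = []
    for row in memberships:
        top = max(row)
        if top >= threshold:
            k = row.index(top)
            hardened.append([1.0 if j == k else 0.0 for j in range(len(row))])
        else:
            hardened.append(list(row))
    return hardened


# Toy imbalanced data (made up for illustration): 200 points vs 10 points.
large = [-1.0 + 0.01 * i for i in range(200)]
small = [4.0 + 0.1 * i for i in range(10)]
centers, u = fuzzy_c_means(large + small, init_centers=[0.0, 3.0])
semi = harden_confident(u, threshold=0.9)
```

In this sketch, objects near a centre receive a hard assignment while objects between the two clusters keep graded memberships, which is one simple way to picture a mix of hard and soft partition.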

Updated: 2020-09-08