Fuzzy kernel K-medoids clustering algorithm for uncertain data objects
Pattern Analysis and Applications ( IF 3.7 ) Pub Date : 2021-05-26 , DOI: 10.1007/s10044-021-00983-z
Behnam Tavakkol , Youngdoo Son

Most data mining algorithms are designed for the traditional type of data objects, which are referred to as certain data objects. Certain data objects contain no uncertainty information and are represented by a single point. Capturing uncertainty can improve the performance of algorithms, as they might then generate more accurate results. There are different ways of modeling uncertainty for data objects; two of the most popular are (1) considering a group of points for each object and (2) considering a probability density function (pdf) for each object. Objects modeled in these ways are referred to as uncertain data objects. Fuzzy clustering is a well-established field of research for certain data. Fuzzy clustering algorithms generate degrees of membership for the assignment of objects to clusters, which gives the flexibility to express that an object can belong to more than one cluster. To the best of our knowledge, there is only one existing fuzzy clustering algorithm for uncertain data in the literature. That algorithm, however, cannot properly create non-convex shaped clusters, and therefore it does not perform well on uncertain data sets with arbitrary-shaped clusters, that is, clusters that are non-convex, unconventional, and possibly nonlinearly separable. In this paper, we propose a novel fuzzy kernel K-medoids clustering algorithm for uncertain objects which works well on data sets with arbitrary-shaped clusters. We show through several experiments on synthetic and real data that the proposed algorithm outperforms the competitor algorithms: the certain fuzzy K-medoids and the uncertain fuzzy K-medoids.
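The abstract does not give the algorithm's formulas, so the following is only an illustrative sketch of how a fuzzy kernel K-medoids procedure for uncertain objects might be organized, not the authors' method. It assumes each uncertain object is modeled as a group of points (option 1 above), compares two objects with an RBF kernel averaged over all point pairs, and uses the standard fuzzy membership formula with fuzzifier m. The function names, the kernel choice, and the parameters gamma, m, and n_iter are all assumptions made for illustration.

```python
import numpy as np

def object_kernel(a, b, gamma=1.0):
    # Assumed similarity between two uncertain objects: the RBF kernel
    # averaged over every pair of points (one from a, one from b).
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq).mean()

def kernel_distances(objects, gamma=1.0):
    # Squared feature-space distance: K(i,i) + K(j,j) - 2 K(i,j).
    n = len(objects)
    K = np.array([[object_kernel(objects[i], objects[j], gamma)
                   for j in range(n)] for i in range(n)])
    diag = np.diag(K)
    D = diag[:, None] + diag[None, :] - 2.0 * K
    return np.maximum(D, 0.0)  # guard against tiny negative round-off

def memberships(D, medoids, m=2.0, eps=1e-12):
    # Standard fuzzy memberships: u_ij proportional to d_ij^(-1/(m-1)),
    # normalized so that each object's memberships sum to 1.
    d = np.maximum(D[:, medoids], eps)  # eps avoids division by zero at a medoid
    inv = d ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fuzzy_kernel_k_medoids(objects, k, m=2.0, gamma=1.0, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    D = kernel_distances(objects, gamma)
    medoids = rng.choice(len(objects), size=k, replace=False)
    for _ in range(n_iter):
        U = memberships(D, medoids, m)
        # cost[q, j] = sum_i U[i, j]^m * D[i, q]; the new medoid of
        # cluster j is the object q with the smallest weighted cost.
        cost = D.T @ (U ** m)
        new_medoids = cost.argmin(axis=0)
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids, memberships(D, medoids, m)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    centers = ([0.0, 0.0], [0.0, 0.2], [3.0, 3.0], [3.0, 3.2])
    objs = [rng.normal(loc=c, scale=0.1, size=(5, 2)) for c in centers]
    medoids, U = fuzzy_kernel_k_medoids(objs, k=2)
    print("medoid indices:", medoids)
    print("membership degrees:\n", U)
```

Because the distances are computed through a kernel rather than directly in the input space, a sketch of this kind can in principle separate clusters that are not convex in the original coordinates. This simplified version does not prevent two clusters from selecting the same medoid, which a practical implementation would need to handle.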




Updated: 2021-05-26