Performance evaluation results of evolutionary clustering algorithm star for clustering heterogeneous datasets
arXiv - CS - Neural and Evolutionary Computing Pub Date : 2021-04-30 , DOI: arxiv-2105.02810
Bryar A. Hassan, Tarik A. Rashid, Seyedali Mirjalili

This article presents the data used to evaluate the performance of evolutionary clustering algorithm star (ECA*) compared to five traditional and modern clustering algorithms. Two experimental methods are employed to examine the performance of ECA* against genetic algorithm for clustering++ (GENCLUST++), learning vector quantisation (LVQ), expectation maximisation (EM), K-means++ (KM++) and K-means (KM). These algorithms are applied to 32 heterogeneous and multi-featured datasets to determine which one performs well on the three tests. For one, the paper examines the efficiency of ECA* against its counterpart algorithms using clustering evaluation measures. These validation criteria are objective function and cluster quality measures. For another, it suggests a performance rating framework to measure the performance sensitivity of these algorithms to various dataset features (cluster dimensionality, number of clusters, cluster overlap, cluster shape and cluster structure). The contributions of these experiments are twofold: (i) ECA* exceeds its counterpart algorithms in its ability to find the right number of clusters; (ii) ECA* is less sensitive to dataset features than its competing techniques. Nonetheless, the results of the experiments demonstrate some limitations of ECA*: (i) ECA* has not been fully applied under the premise that no prior knowledge exists; (ii) ECA* has not yet been adapted and applied to several real applications.
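As a rough illustration of what comparing algorithms on an objective function can look like, the sketch below runs two of the baselines named above, K-means (KM) and K-means++ (KM++), on a small synthetic dataset and reports the sum of squared errors (SSE) for each. It is not the authors' evaluation code or data: the choice of SSE as the objective, the synthetic blobs, and all function names (`make_blobs`, `init_random`, `init_plus_plus`, `kmeans`, `sse`) are assumptions made for this example, and ECA* itself is not implemented.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))


def init_random(points, k, rng):
    """KM seeding: pick k distinct data points uniformly at random."""
    return [list(p) for p in rng.sample(points, k)]


def init_plus_plus(points, k, rng):
    """KM++ seeding: pick each new centre with probability proportional
    to its squared distance from the nearest centre chosen so far."""
    centroids = [list(rng.choice(points))]
    while len(centroids) < k:
        weights = [min(sq_dist(p, c) for c in centroids) for p in points]
        if sum(weights) == 0:  # all points coincide with a chosen centre
            centroids.append(list(rng.choice(points)))
        else:
            centroids.append(list(rng.choices(points, weights=weights, k=1)[0]))
    return centroids


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    k = len(centroids)
    return [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]


def kmeans(points, k, init, rng, iterations=50):
    """Lloyd's algorithm with a pluggable seeding function."""
    centroids = init(points, k, rng)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(points, centroids)


def make_blobs(rng, centers, n_per_cluster=50, spread=0.5):
    """Synthetic 2-D Gaussian blobs (illustrative data only)."""
    return [
        (rng.gauss(cx, spread), rng.gauss(cy, spread))
        for cx, cy in centers
        for _ in range(n_per_cluster)
    ]


points = make_blobs(random.Random(0), [(0, 0), (5, 5), (0, 5)])
results = {}
for name, init in [("KM", init_random), ("KM++", init_plus_plus)]:
    centroids, labels = kmeans(points, 3, init, random.Random(1))
    results[name] = sse(points, centroids, labels)
    print(name, round(results[name], 2))
```

A lower SSE indicates tighter clusters under this particular objective. A single run on one dataset says little about an algorithm in general, which is presumably why the study spans 32 datasets and several measures.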

Updated: 2021-05-07