A Unified Formulation of k-Means, Fuzzy c-Means and Gaussian Mixture Model by the Kolmogorov–Nagumo Average

Entropy (IF 2.7), Pub Date: 2021-04-24, DOI: 10.3390/e23050518
Osamu Komori, Shinto Eguchi

Clustering is a major family of unsupervised learning methods and is widely applied in data mining and statistical data analysis. Typical examples include k-means, fuzzy c-means, and Gaussian mixture models, which are categorized as hard, soft, and model-based clustering, respectively. We propose a new clustering method, called Pareto clustering, based on the Kolmogorov–Nagumo average, which is defined by a survival function of the Pareto distribution. The proposed algorithm incorporates all of the aforementioned clustering methods as well as maximum-entropy clustering. We introduce a probabilistic framework for the proposed method, in which we discuss the underlying distribution that yields consistency. We build a minorize-maximization algorithm to estimate the parameters in Pareto clustering. We compare its performance with that of existing methods in simulation studies and in benchmark dataset analyses to demonstrate its practical utility.
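For readers unfamiliar with the Kolmogorov–Nagumo (quasi-arithmetic) average, the following is a minimal Python sketch of the general idea, not the authors' implementation. The average is phi_inv(sum_i w_i * phi(x_i)) for a monotone generator phi. The function names, the example distances, and the parameters tau and beta are hypothetical, and the Pareto-type generator phi(t) = (1 + beta * tau * t) ** (-1 / beta) is an assumed form of a Pareto survival function chosen for illustration; the paper's exact parameterization may differ.

```python
import math


def kn_average(values, phi, phi_inv, weights=None):
    """Kolmogorov-Nagumo average: phi_inv(sum_i w_i * phi(x_i))."""
    n = len(values)
    if weights is None:
        weights = [1.0 / n] * n  # uniform weights by default
    return phi_inv(sum(w * phi(x) for w, x in zip(weights, values)))


def soft_min(values, tau):
    """Exponential generator phi(t) = exp(-tau * t): a smooth minimum."""
    return kn_average(
        values,
        lambda t: math.exp(-tau * t),
        lambda s: -math.log(s) / tau,
    )


def pareto_average(values, tau, beta):
    """Assumed Pareto-type generator phi(t) = (1 + beta*tau*t) ** (-1/beta)."""
    return kn_average(
        values,
        lambda t: (1.0 + beta * tau * t) ** (-1.0 / beta),
        lambda s: (s ** (-beta) - 1.0) / (beta * tau),
    )


# Hypothetical squared distances from one data point to three cluster centres.
d2 = [0.5, 2.0, 8.0]

# The identity generator recovers the ordinary arithmetic mean.
arithmetic = kn_average(d2, lambda t: t, lambda s: s)

# A large tau pushes the exponential average towards min(d2), which is the
# per-point loss that k-means (hard clustering) minimizes.
nearly_hard = soft_min(d2, tau=50.0)

# As beta -> 0 the assumed Pareto-type generator approaches exp(-tau * t).
nearly_exponential = pareto_average(d2, tau=1.0, beta=1e-6)
```

In this sketch, a single parameterized generator moves between an arithmetic-mean-like average and a hard minimum over clusters, which gives some intuition for how one family of averages can interpolate between soft and hard cluster assignment.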

Updated: 2021-04-24