Dynamic Clustering Algorithms via Small-Variance Analysis of Markov Chain Mixture Models, IEEE Transactions on Pattern Analysis and Machine Intelligence


IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 5-7-2018, DOI: 10.1109/tpami.2018.2833467
Trevor Campbell, Brian Kulis, Jonathan How

Bayesian nonparametrics are a class of probabilistic models in which the model size is inferred from data. A recently developed methodology in this field is small-variance asymptotic analysis, a mathematical technique for deriving learning algorithms that capture much of the flexibility of Bayesian nonparametric inference algorithms, but are simpler to implement and less computationally expensive. Past work on small-variance analysis of Bayesian nonparametric inference algorithms has exclusively considered batch models trained on a single, static dataset, which are incapable of capturing time evolution in the latent structure of the data. This work presents a small-variance analysis of the maximum a posteriori filtering problem for a temporally varying mixture model with a Markov dependence structure, which captures temporally evolving clusters within a dataset. Two clustering algorithms result from the analysis: D-Means, an iterative clustering algorithm for linearly separable, spherical clusters; and SD-Means, a spectral clustering algorithm derived from a kernelized, relaxed version of the clustering problem. Empirical results from experiments demonstrate the advantages of using D-Means and SD-Means over contemporary clustering algorithms, in terms of both computational cost and clustering accuracy.
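To make the small-variance idea concrete, the sketch below shows the kind of procedure such an analysis typically yields in the batch setting: a k-means-like loop in which a point farther than a penalty threshold from every existing center starts a new cluster, so the number of clusters is inferred from the data. This is only an illustration in the style of the earlier batch DP-means algorithm, not the D-Means or SD-Means algorithms of the paper, whose temporal terms the abstract does not specify. The function name `dp_means_sketch` and the parameter `lam` are assumptions introduced here.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def dp_means_sketch(points, lam, max_iter=100):
    """Batch clustering with a new-cluster penalty `lam` (hypothetical name).

    A point whose squared distance to every center exceeds `lam`
    opens a new cluster centered on itself.
    """
    centers = [mean(points)]          # start with one global cluster
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        # Assignment step: nearest center, or a new cluster if too far.
        for i, p in enumerate(points):
            dists = [sq_dist(p, c) for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            if dists[k] > lam:
                centers.append(tuple(p))
                k = len(centers) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True
        # Update step: recompute means and drop clusters left empty.
        members = {}
        for lbl, p in zip(labels, points):
            members.setdefault(lbl, []).append(p)
        old_ids = sorted(members)
        remap = {old: new for new, old in enumerate(old_ids)}
        centers = [mean(members[old]) for old in old_ids]
        labels = [remap[lbl] for lbl in labels]
        if not changed:
            break
    return labels, centers
```

A small `lam` favors many tight clusters and a large `lam` favors few broad ones. A dynamic variant of the kind the abstract describes would additionally carry cluster information from one time step to the next, which this batch sketch deliberately omits.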


Updated: 2024-08-22
