On Bayesian Analysis of Parsimonious Gaussian Mixture Models
Journal of Classification (IF 2), Pub Date: 2021-06-04, DOI: 10.1007/s00357-021-09391-8
Xiang Lu, Yaoxiang Li, Tanzy Love

Cluster analysis is the task of grouping a set of objects so that objects in the same cluster are similar to each other. It is widely used in many fields, including machine learning, bioinformatics, and computer graphics. In all of these applications, the partition is an inference goal, along with the number of clusters and their distinguishing characteristics. The mixture of factor analyzers (MFA) model is a special case of model-based clustering that assumes the covariance structure of each cluster comes from a factor analysis model. It simplifies the Gaussian mixture model by reducing the number of parameters, and it conceptually represents the variables as coming from a lower-dimensional subspace in which the clusters are separated. In this paper, we introduce a new reversible-jump Markov chain Monte Carlo (RJMCMC) inferential procedure for the family of constrained MFA models.

The three goals of inference here are the partition of the objects, estimation of the number of clusters, and identification and estimation of the covariance structure of the clusters; each of these therefore has a posterior distribution. RJMCMC is the main sampling tool because it allows the dimension of the parameter space itself to be estimated. We present simulations that compare this inferential technique with previous methods in terms of the estimated clustering parameters and the partition. Finally, we illustrate the new methods with a dataset of DNA methylation measures for subjects with different brain tumor types. Our method uses four latent factors to correctly discover the five brain tumor types without assuming a constant variance structure, and it classifies subjects with excellent performance.
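
The parameter reduction described in the abstract can be made concrete with a small count. The sketch below is not taken from the paper; it assumes the standard unconstrained factor-analysis covariance, Sigma = Lambda Lambda^T + Psi, with a p x q loading matrix Lambda and a diagonal Psi, and compares its number of free parameters with that of a full p x p covariance matrix. The function names and the dimension p = 50 are hypothetical; q = 4 merely echoes the four latent factors mentioned above.

```python
def full_cov_params(p: int) -> int:
    """Free parameters in an unrestricted symmetric p x p covariance matrix."""
    return p * (p + 1) // 2


def factor_cov_params(p: int, q: int) -> int:
    """Free parameters in Sigma = Lambda Lambda^T + Psi.

    Lambda is p x q (p*q entries, minus q*(q-1)/2 lost to rotational
    invariance) and Psi is diagonal (p entries). This is the unconstrained
    case; constrained members of the family would use fewer parameters.
    """
    if not 0 < q < p:
        raise ValueError("expected 0 < q < p")
    return p * q - q * (q - 1) // 2 + p


if __name__ == "__main__":
    p, q = 50, 4  # hypothetical dimensions, for illustration only
    print("full covariance:  ", full_cov_params(p))
    print("factor covariance:", factor_cov_params(p, q))
```

These counts are per cluster, so in a mixture with several clusters the saving is multiplied by the number of clusters unless parameters are shared across them.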




Updated: 2021-06-04