Multivariate-bounded Gaussian mixture model with minimum message length criterion for model selection
Expert Systems (IF 3.3), Pub Date: 2021-03-16, DOI: 10.1111/exsy.12688
Muhammad Azam, Nizar Bouguila

The bounded support Gaussian mixture model (BGMM) has been proposed for data modelling as an alternative to unbounded support mixture models in cases where the data lie in a bounded support. In this paper, we propose applications of the multivariate BGMM to data clustering for a more insightful analysis of the model. We also propose the minimum message length (MML) criterion for model selection in data clustering with the multivariate BGMM. The presented model is applied to data clustering on several speech (TSP and Spoken Digits) and image (MNIST and Fashion MNIST) databases. We also propose applying the BGMM to code-book generation in the feature extraction phase. Inspired by the success of the bag-of-visual-words approach in computer vision, that approach is also introduced for speech data representation and validated through the experiments presented in this paper. To validate the model selection criterion, MML is applied to different medical, speech and image datasets. The results obtained with MML during model selection are further compared with seven different model selection criteria. The results presented in the paper demonstrate the effectiveness of the BGMM for clustering speech and image databases, for code-book generation through clustering for feature representation, and for model selection.
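
As a rough illustration of what "bounded support" means here, the sketch below evaluates a univariate mixture of Gaussians that are truncated to an interval and renormalized so each component integrates to one over that interval. This is a simplified one-dimensional reading, not the multivariate model or the estimation procedure of the paper; the function names, the interval bounds `low` and `high`, and the parameter values in the usage note are all assumptions made for the example.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of an ordinary (unbounded) Gaussian."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Cumulative distribution of an ordinary Gaussian, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def bounded_normal_pdf(x, mu, sigma, low, high):
    """Gaussian restricted to [low, high] and renormalized (assumed 1-D form)."""
    if x < low or x > high:
        return 0.0  # no probability mass outside the support
    # Mass of the unbounded Gaussian that falls inside the support.
    mass = normal_cdf(high, mu, sigma) - normal_cdf(low, mu, sigma)
    return normal_pdf(x, mu, sigma) / mass


def bounded_mixture_pdf(x, weights, mus, sigmas, low, high):
    """Weighted sum of bounded Gaussian components sharing one support."""
    return sum(
        w * bounded_normal_pdf(x, m, s, low, high)
        for w, m, s in zip(weights, mus, sigmas)
    )
```

For data scaled to [0, 1], such as normalized pixel intensities, a call like `bounded_mixture_pdf(0.5, [0.4, 0.6], [0.2, 0.9], [0.1, 0.3], 0.0, 1.0)` evaluates a two-component mixture; any point outside the interval receives zero density, which an unbounded Gaussian mixture cannot guarantee.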

Updated: 2021-03-16