Grouped Heterogeneous Mixture Modeling for Clustered Data
Journal of the American Statistical Association (IF 3.0) Pub Date: 2020-07-20, DOI: 10.1080/01621459.2020.1777136
Shonosuke Sugasawa

Clustered data are ubiquitous across scientific fields. In this paper, we propose a flexible and interpretable modeling approach for clustered data, called grouped heterogeneous mixture modeling, which models cluster-wise conditional distributions as mixtures of latent conditional distributions common to all clusters. In the model, we assume that the clusters are divided into a finite number of groups and that the mixing proportions are the same within each group. We provide a simple generalized EM algorithm for computing the maximum likelihood estimator, and an information criterion for selecting the number of groups and the number of latent distributions. We also propose structured grouping strategies that introduce penalties on the grouping parameters in the likelihood function. In the setting where both the number of clusters and the cluster sizes tend to infinity, we establish asymptotic properties of the maximum likelihood estimator and the information criterion. We demonstrate the proposed method through simulation studies and an application to crime risk modeling in Tokyo.
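
The model structure described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a simplified version that assumes Gaussian latent components without covariates, and it alternates ordinary EM updates with a reassignment of each cluster to its best-fitting group. All names (`fit`, `log_norm_pdf`, `true_pi`, and so on) and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np

def log_norm_pdf(y, mu, sigma):
    # Log density of N(mu, sigma^2); y of shape (n, 1) broadcasts against (K,).
    return -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

def fit(y, cluster, G, K, n_iter=50, seed=0):
    """Illustrative grouped-mixture fit (assumed simplification, no covariates).

    y: (n,) observations; cluster: (n,) integer cluster labels 0..m-1.
    G: number of groups; K: number of latent components shared by all clusters.
    """
    rng = np.random.default_rng(seed)
    m = cluster.max() + 1
    g = rng.integers(G, size=m)                    # group label of each cluster
    pi = np.full((G, K), 1.0 / K)                  # group-level mixing proportions
    mu = np.quantile(y, (np.arange(K) + 0.5) / K)  # component means
    sigma = np.full(K, y.std())                    # component standard deviations
    history = []
    for _ in range(n_iter):
        # E-step: responsibilities given the current grouping and parameters.
        dens = np.exp(log_norm_pdf(y[:, None], mu, sigma))      # (n, K)
        w = pi[g[cluster]] * dens
        r = w / w.sum(axis=1, keepdims=True)
        # M-step: components are shared by all clusters.
        nk = r.sum(axis=0)
        mu = (r * y[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (y[:, None] - mu) ** 2).sum(axis=0) / nk)
        sigma = np.maximum(sigma, 1e-3)            # guard against collapse
        # M-step: mixing proportions are shared within each group.
        obs_group = g[cluster]
        for h in range(G):
            mask = obs_group == h
            if mask.any():                         # empty groups keep old values
                pi[h] = r[mask].mean(axis=0)
        # Grouping step: move each cluster to the group maximizing its log-likelihood.
        dens = np.exp(log_norm_pdf(y[:, None], mu, sigma))
        per_obs = np.log(dens @ pi.T)                            # (n, G)
        per_cluster = np.zeros((m, G))
        np.add.at(per_cluster, cluster, per_obs)
        g = per_cluster.argmax(axis=1)
        history.append(per_cluster[np.arange(m), g].sum())      # log-likelihood
    return g, pi, mu, sigma, history

# Simulated example: 30 clusters in 2 groups, 2 latent Gaussian components.
rng = np.random.default_rng(1)
m, n_per = 30, 40
true_g = np.repeat([0, 1], m // 2)
true_pi = np.array([[0.9, 0.1], [0.2, 0.8]])
cluster = np.repeat(np.arange(m), n_per)
z = np.array([rng.choice(2, p=true_pi[true_g[c]]) for c in cluster])
y = rng.normal(np.array([-2.0, 2.0])[z], 1.0)

g_hat, pi_hat, mu_hat, sigma_hat, history = fit(y, cluster, G=2, K=2)
```

In this sketch, each step either leaves the observed-data log-likelihood unchanged or increases it, so `history` should be non-decreasing up to rounding. The estimated group labels are identified only up to a permutation of the group indices, and the result depends on the random initial grouping.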

Updated: 2020-07-20