Bayesian subgroup analysis in regression using mixture models
Computational Statistics & Data Analysis (IF 1.5), Pub Date: 2021-04-21, DOI: 10.1016/j.csda.2021.107252
Yunju Im, Aixin Tan

Heterogeneity occurs in many regression problems, where members of different latent subgroups respond differently to the covariates of interest (e.g., treatments) even after adjusting for other covariates. A Bayesian model called the mixture of finite mixtures (MFM) can be used to identify these subgroups; a key feature of the model is that the number of subgroups is treated as a random variable whose distribution is learned from the data. The Bayesian MFM model was not commonly used in earlier applications, largely because of computational difficulties. In comparison, an alternative infinite mixture model, the Dirichlet process mixture (DPM) model, has been a main Bayesian tool for clustering even though it is a mis-specified model for many applications. The popularity of the DPM is partly due to its convenient mathematical properties, which enable efficient computing algorithms.

A class of Bayesian models tailored to regression problems, the conditional MFMs (cMFM), is described and studied. Computing for the cMFM is developed by extending the efficient MCMC algorithms for general MFMs. Using simulation and real data examples, the cMFM is compared to existing frequentist methods, the conditional DPM, and the original MFM and DPM models that model response and covariates jointly. The cMFM is shown to perform favorably in clustering accuracy and to be robust to different covariates and noise distributions.
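
To make the idea of a random number of subgroups concrete, the following is a minimal simulation sketch and not the authors' model or algorithm. It assumes a shifted Poisson prior on the number of subgroups, symmetric Dirichlet weights, Gaussian noise, and a single binary treatment covariate; the function name `simulate_mfm_regression` and all parameter values are hypothetical. The covariate is drawn independently of the subgroup label, so only the response given the covariate carries subgroup information, which is the conditional view described above.

```python
import numpy as np


def simulate_mfm_regression(n, rng, lam=1.0, gamma=1.0, sigma=0.5):
    """Draw one data set from an illustrative MFM-style regression mixture."""
    # The number of subgroups is itself random: K = 1 + Poisson(lam).
    # This prior is an assumption made for illustration only.
    k = 1 + rng.poisson(lam)
    # Symmetric Dirichlet(gamma, ..., gamma) weights, built by normalising
    # independent Gamma(gamma, 1) draws.
    g = rng.gamma(gamma, size=k)
    weights = g / g.sum()
    # Each subgroup gets its own intercept and treatment effect.
    betas = rng.normal(0.0, 2.0, size=(k, 2))
    # Latent subgroup label for every observation.
    z = rng.choice(k, size=n, p=weights)
    # Covariate of interest: a binary treatment indicator.
    x = rng.integers(0, 2, size=n).astype(float)
    # The response depends on x through subgroup-specific coefficients.
    y = betas[z, 0] + betas[z, 1] * x + rng.normal(0.0, sigma, size=n)
    return {"k": k, "weights": weights, "betas": betas, "z": z, "x": x, "y": y}


rng = np.random.default_rng(0)
data = simulate_mfm_regression(200, rng)
print("subgroups drawn:", data["k"])
print("subgroup sizes:", np.bincount(data["z"], minlength=data["k"]))
```

Repeated calls yield data sets with different numbers of subgroups. A fitting procedure would work in the opposite direction, inferring the labels and the number of subgroups from `x` and `y` alone.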




Updated: 2021-05-03