Log-Linear Bayesian Additive Regression Trees for Multinomial Logistic and Count Regression Models
Journal of the American Statistical Association (IF 3.0). Pub Date: 2020-08-26. DOI: 10.1080/01621459.2020.1813587
Jared S. Murray

We introduce Bayesian additive regression trees (BART) for log-linear models, including multinomial logistic regression and count regression with zero-inflation and overdispersion. BART has been applied to nonparametric mean regression and binary classification problems in a range of settings. However, existing applications of BART have been limited to models for Gaussian "data", either observed or latent, primarily because efficient MCMC algorithms are available for Gaussian likelihoods. While many useful models are naturally cast in terms of latent Gaussian variables, many others are not, including the models considered in this paper. We develop new data augmentation strategies and carefully specified prior distributions for these new models. Like the original BART prior, the new priors are constructed and calibrated to be flexible while guarding against overfitting. Together, the new priors and data augmentation schemes allow us to implement an efficient MCMC sampler outside the context of Gaussian models. The utility of these new methods is illustrated with examples and an application to a previously published dataset.
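To make the log-linear multinomial form concrete: the paper models Pr(y = k | x) ∝ exp(f_k(x)), where each f_k is a sum-of-trees function with a BART prior. The sketch below is a minimal illustration of that link function only, with the tree ensembles stubbed as hypothetical piecewise-constant functions (`stub_tree_ensemble` is an assumption for illustration, not the paper's sampler; the actual f_k would be drawn via the MCMC scheme the paper develops).

```python
import numpy as np

def stub_tree_ensemble(x, shift):
    # Stand-in for a BART sum-of-trees f_k(x): piecewise-constant in x,
    # as a single decision stump. A real f_k is a posterior sample of trees.
    return np.where(x < 0.5, shift, -shift)

def multinomial_probs(x, shifts):
    # Log-linear link: category probabilities are a softmax over the
    # per-category ensemble outputs f_k(x).
    logits = np.array([stub_tree_ensemble(x, s) for s in shifts])
    logits -= logits.max(axis=0)  # subtract max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=0)

x = np.array([0.2, 0.8])                       # two covariate values
p = multinomial_probs(x, shifts=[1.0, 0.0, -1.0])  # three categories
print(p.sum(axis=0))  # each column sums to 1
```

The normalization makes the model invariant to a common shift of all f_k, which is why the paper's prior calibration (rather than a reference-category constraint alone) matters for identifiability and regularization.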
