A Common Atoms Model for the Bayesian Nonparametric Analysis of Nested Data
Journal of the American Statistical Association (IF 3.7), Pub Date: 2021-07-14, DOI: 10.1080/01621459.2021.1933499
Francesco Denti 1, Federico Camerlenghi 2, Michele Guindani 1, Antonietta Mira 3, 4

Abstract

The use of large datasets for targeted therapeutic interventions requires new ways to characterize the heterogeneity observed across subgroups of a specific population. In particular, models for partially exchangeable data are needed for inference on nested datasets, where the observations are assumed to be organized in different units and some sharing of information is required to learn distinctive features of the units. In this manuscript, we propose a nested common atoms model (CAM) that is particularly suited for the analysis of nested datasets where the distributions of the units are expected to differ only over a small fraction of the observations sampled from each unit. The proposed CAM allows a two-layered clustering at the distributional and observational level and is amenable to scalable posterior inference through the use of a computationally efficient nested slice sampler algorithm. We further discuss how to extend the proposed modeling framework to handle discrete measurements, and we conduct posterior inference on a real microbiome dataset from a diet swap study to investigate how the alterations in intestinal microbiota composition are associated with different eating habits. We further investigate the performance of our model in capturing true distributional structures in the population by means of a simulation study.
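The two-layered clustering described in the abstract can be illustrated with a small simulation. The abstract gives no model specification, so everything below is an assumption made for illustration only: truncated stick-breaking weights, a Gaussian kernel, the truncation levels `K` and `L`, the concentration parameters `alpha` and `beta`, and all function names. It is a rough sketch of the general idea (units choose among candidate distributions that all share one set of atoms but weight them differently), not the authors' model or sampler.

```python
import numpy as np


def stick_breaking(concentration, size, rng):
    """Truncated stick-breaking weights; the last stick takes the remainder."""
    v = rng.beta(1.0, concentration, size=size)
    v[-1] = 1.0  # forces the weights to sum to one under truncation
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def simulate_nested_common_atoms(n_units, n_obs, K=10, L=15,
                                 alpha=1.0, beta=1.0, seed=0):
    """Simulate nested data in which all unit distributions share atoms.

    All parameter names and defaults are hypothetical choices for this sketch.
    """
    rng = np.random.default_rng(seed)
    # Atoms (here: Gaussian means) common to every candidate distribution.
    atoms = rng.normal(0.0, 5.0, size=L)
    # Weights over the K candidate distributions (distributional layer).
    pi = stick_breaking(alpha, K, rng)
    # Each candidate distribution puts its own weights on the common atoms.
    omega = np.stack([stick_breaking(beta, L, rng) for _ in range(K)])
    # Distributional clustering: units with the same label share a distribution.
    dist_cluster = rng.choice(K, size=n_units, p=pi)
    # Observational clustering: observations with the same label share an atom,
    # and the labels are comparable across units because the atoms are common.
    obs_cluster = np.stack(
        [rng.choice(L, size=n_obs, p=omega[k]) for k in dist_cluster]
    )
    y = rng.normal(atoms[obs_cluster], 1.0)
    return y, dist_cluster, obs_cluster


y, dist_cluster, obs_cluster = simulate_nested_common_atoms(n_units=4, n_obs=20)
```

Here `dist_cluster` groups units and `obs_cluster` groups individual observations, which mirrors the distributional and observational levels mentioned in the abstract. Posterior inference (for example, the nested slice sampler the abstract refers to) is not attempted in this sketch.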




Updated: 2021-07-14