Nonparametric Deconvolution Models
arXiv - CS - Machine Learning, Pub Date: 2020-03-17, DOI: arxiv-2003.07718
Allison J.B. Chaney, Archit Verma, Young-suk Lee, Barbara E. Engelhardt

We describe nonparametric deconvolution models (NDMs), a family of Bayesian nonparametric models for collections of data in which each observation is the average over the features from heterogeneous particles. For example, these types of data are found in elections, where we observe precinct-level vote tallies (observations) of individual citizens' votes (particles) across each of the candidates or ballot measures (features), where each voter is part of a specific voter cohort or demographic (factor). Like the hierarchical Dirichlet process, NDMs rely on two tiers of Dirichlet processes to explain the data with an unknown number of latent factors; each observation is modeled as a weighted average of these latent factors. Unlike existing models, NDMs recover how factor distributions vary locally for each observation. This uniquely allows NDMs both to deconvolve each observation into its constituent factors, and also to describe how the factor distributions specific to each observation vary across observations and deviate from the corresponding global factors. We present variational inference techniques for this family of models and study its performance on simulated data and voting data from California. We show that including local factors improves estimates of global factors and provides a novel scaffold for exploring data.
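To make the data structure described in the abstract concrete, the following is a minimal simulation sketch, not the authors' model or code. It assumes a fixed, finite number of factors with symmetric Dirichlet weights in place of the two tiers of Dirichlet processes, and Gaussian global factors with Gaussian local deviations. The function names (`sample_dirichlet`, `simulate`) and every parameter are hypothetical, chosen only for illustration. Each observation is generated as a weighted average of observation-specific (local) factors that deviate slightly from shared global factors.

```python
import random


def sample_dirichlet(alpha, k, rng):
    """Draw a length-k probability vector from a symmetric Dirichlet(alpha)."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def simulate(n_obs=5, n_factors=3, n_features=4, local_scale=0.1, seed=0):
    """Simulate observations as weighted averages of local factors.

    Assumption: the number of factors is fixed here, whereas the paper
    describes a nonparametric model with an unknown number of factors.
    """
    rng = random.Random(seed)

    # Global factors: one feature vector per latent factor (assumed Gaussian).
    global_factors = [
        [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for _ in range(n_factors)
    ]

    local_factors, weights, observations = [], [], []
    for _ in range(n_obs):
        # Observation-specific factor proportions (sum to 1).
        w = sample_dirichlet(1.0, n_factors, rng)

        # Local factors: global factors plus a small per-observation deviation.
        local = [
            [g + rng.gauss(0.0, local_scale) for g in factor]
            for factor in global_factors
        ]

        # The observation is the weighted average of its local factors.
        y = [
            sum(w[k] * local[k][m] for k in range(n_factors))
            for m in range(n_features)
        ]

        local_factors.append(local)
        weights.append(w)
        observations.append(y)

    return global_factors, local_factors, weights, observations


if __name__ == "__main__":
    g, l, w, y = simulate()
    print("weights for first observation:", w[0])
    print("first observation:", y[0])
```

In this sketch, deconvolution would mean recovering `weights`, `local_factors`, and `global_factors` from `observations` alone. Setting `local_scale=0.0` removes the local variation, so every observation is a weighted average of the same global factors.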

Updated: 2020-03-18