Masking schemes for universal marginalisers
arXiv - CS - Machine Learning | Pub Date: 2020-01-16 | DOI: arxiv-2001.05895
Divya Gautam, Maria Lomeli, Kostis Gourgoulias, Daniel H. Thompson, Saurabh Johri

We consider the effect of structure-agnostic and structure-dependent masking schemes when training a universal marginaliser (arXiv:1711.00695) to learn conditional distributions of the form $P(x_i \mid \mathbf{x}_{\mathbf{b}})$, where $x_i$ is a given random variable and $\mathbf{x}_{\mathbf{b}}$ is an arbitrary subset of all random variables of the generative model of interest. In other words, we mimic the self-supervised training of a denoising autoencoder: a dataset of unlabelled data serves as partially observed input, and the neural approximator is optimised to minimise a reconstruction loss. We focus on the process underlying the partially observed data: how well does the neural approximator learn all conditional distributions when the observation process at prediction time differs from the masking process used during training? We compare networks trained with different masking schemes in terms of their predictive performance and generalisation properties.
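A minimal sketch of this setup, assuming binary variables and PyTorch; the class and function names below are illustrative and not taken from the paper. It trains a neural approximator with a structure-agnostic masking scheme: each variable is hidden independently with a fixed probability, and the network is optimised to reconstruct the full sample from the partially observed input, in the denoising-autoencoder style described above.

import torch
import torch.nn as nn


class UniversalMarginaliser(nn.Module):
    """Maps a partially observed sample to estimates of P(x_i = 1 | x_b)."""

    def __init__(self, n_vars: int, hidden: int = 128):
        super().__init__()
        # Input is the masked values concatenated with a per-variable
        # observation indicator (1 = observed, 0 = hidden).
        self.net = nn.Sequential(
            nn.Linear(2 * n_vars, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_vars),
        )

    def forward(self, x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x * mask, mask], dim=-1))


def train_step(model, optimiser, x, p_mask=0.5):
    """One reconstruction step with structure-agnostic (uniform) masking."""
    # Hide each variable independently with probability p_mask.
    mask = (torch.rand_like(x) > p_mask).float()
    logits = model(x, mask)
    # Reconstruction loss over all variables, observed and hidden alike.
    loss = nn.functional.binary_cross_entropy_with_logits(logits, x)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()


if __name__ == "__main__":
    model = UniversalMarginaliser(n_vars=20)
    optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = (torch.rand(64, 20) > 0.5).float()  # synthetic binary data
    for _ in range(100):
        loss = train_step(model, optimiser, x)
    print(f"final training loss: {loss:.4f}")

A structure-dependent scheme would differ only in how mask is sampled, for example deriving it from the dependency structure of the generative model rather than from independent coin flips; evaluating the trained network under an observation process that differs from the training masks is then the comparison the abstract describes.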

Updated: 2020-01-17