Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time
arXiv - CS - Machine Learning. Pub Date: 2020-09-22, DOI: arxiv-2009.10623
Ferran Alet, Kenji Kawaguchi, Maria Bauza, Nurullah Giray Kuru, Tomas Lozano-Perez, Leslie Pack Kaelbling

From CNNs to attention mechanisms, encoding inductive biases into neural networks has been a fruitful source of improvement in machine learning. Auxiliary losses are a general way of encoding biases in order to help networks learn better representations by adding extra terms to the loss function. However, since they are minimized on the training data, they suffer from the same generalization gap as regular task losses. Moreover, by changing the loss function, the network is optimizing a different objective than the one we care about. In this work we solve both problems: first, we take inspiration from transductive learning and note that, after receiving an input but before making a prediction, we can fine-tune our models on any unsupervised objective. We call this process tailoring, because we customize the model to each input. Second, we formulate a nested optimization (similar to those in meta-learning) and train our models to perform well on the task loss after adapting to the tailoring loss. The advantages of tailoring and meta-tailoring are discussed theoretically and demonstrated empirically on several diverse examples: encoding inductive conservation laws from physics to improve predictions, improving local smoothness to increase robustness to adversarial examples, and using contrastive losses on the query image to improve generalization.
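To make the two ideas concrete, here is a minimal PyTorch sketch (not the authors' code). The `tailoring_loss` below is a hypothetical stand-in for any unsupervised objective computable from the prediction alone (in the paper this could be a physics conservation penalty, a local-smoothness term, or a contrastive loss on the query); the model, names, and hyperparameters are illustrative assumptions.

```python
# Sketch of tailoring (prediction-time fine-tuning on an unsupervised loss)
# and meta-tailoring (training through that adaptation, MAML-style).
import copy
import torch
import torch.nn as nn
from torch.func import functional_call


def tailoring_loss(pred):
    # Hypothetical unsupervised objective evaluated on the prediction alone,
    # e.g. a soft penalty for violating a conserved quantity.
    return (pred.sum(dim=-1) ** 2).mean()


def tailor_predict(model, x, steps=5, inner_lr=1e-2):
    """Tailoring: fine-tune a per-input copy of the model on the
    unsupervised loss for this query x, then predict with it."""
    tailored = copy.deepcopy(model)
    opt = torch.optim.SGD(tailored.parameters(), lr=inner_lr)
    for _ in range(steps):
        opt.zero_grad()
        tailoring_loss(tailored(x)).backward()
        opt.step()
    with torch.no_grad():
        return tailored(x)


def meta_tailoring_step(model, x, y, task_loss_fn, outer_opt, inner_lr=1e-2):
    """Meta-tailoring: nested optimization. One inner gradient step on the
    tailoring loss, then the *post-adaptation* task loss is backpropagated
    through that step to update the original parameters."""
    params = dict(model.named_parameters())
    inner = tailoring_loss(functional_call(model, params, (x,)))
    grads = torch.autograd.grad(inner, tuple(params.values()),
                                create_graph=True)
    adapted = {name: p - inner_lr * g
               for (name, p), g in zip(params.items(), grads)}
    outer = task_loss_fn(functional_call(model, adapted, (x,)), y)
    outer_opt.zero_grad()
    outer.backward()
    outer_opt.step()
    return outer.item()


if __name__ == "__main__":
    model = nn.Sequential(nn.Linear(4, 32), nn.Tanh(), nn.Linear(32, 2))
    outer_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x, y = torch.randn(16, 4), torch.randn(16, 2)
    for _ in range(10):                                  # meta-tailored training
        meta_tailoring_step(model, x, y, nn.functional.mse_loss, outer_opt)
    y_hat = tailor_predict(model, torch.randn(1, 4))     # tailored prediction
    print(y_hat.shape)
```

The `create_graph=True` call is what makes the optimization nested: the outer task loss is differentiated through the inner adaptation step, so training targets performance after tailoring rather than before it.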

Updated: 2020-10-15