An Online Learning Approach to Interpolation and Extrapolation in Domain Generalization
arXiv - CS - Computer Science and Game Theory Pub Date : 2021-02-25 , DOI: arxiv-2102.13128
Elan Rosenfeld, Pradeep Ravikumar, Andrej Risteski

A popular assumption for out-of-distribution generalization is that the training data comprises sub-datasets, each drawn from a distinct distribution; the goal is then to "interpolate" these distributions and "extrapolate" beyond them -- this objective is broadly known as domain generalization. A common belief is that ERM can interpolate but not extrapolate and that the latter is considerably more difficult, but these claims are vague and lack formal justification. In this work, we recast generalization over sub-groups as an online game between a player minimizing risk and an adversary presenting new test distributions. Under an existing notion of inter- and extrapolation based on reweighting of sub-group likelihoods, we rigorously demonstrate that extrapolation is computationally much harder than interpolation, though their statistical complexity is not significantly different. Furthermore, we show that ERM -- or a noisy variant -- is provably minimax-optimal for both tasks. Our framework presents a new avenue for the formal analysis of domain generalization algorithms which may be of independent interest.
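The minimax game described above can be sketched numerically. The following is a minimal illustrative example, not the paper's algorithm: the adversary reweights sub-group likelihoods via multiplicative weights (an online learning rule), while the player does gradient descent on the adversarially weighted risk. All data, step sizes, and the squared-loss setup are assumptions made for the sketch.

```python
import numpy as np

# Sketch of the online game: a player minimizing risk vs. an adversary
# presenting new test distributions as reweightings of sub-groups.
# (Hypothetical setup; the paper's formal game is more general.)

rng = np.random.default_rng(0)

# Three synthetic sub-groups: shared linear target, different input scales.
true_w = np.array([1.0, -2.0])
groups = []
for scale in (0.5, 1.0, 2.0):
    X = rng.normal(0.0, scale, size=(200, 2))
    y = X @ true_w + rng.normal(0.0, 0.1, size=200)
    groups.append((X, y))

def group_risks(w):
    """Mean squared error of parameters w on each sub-group."""
    return np.array([np.mean((X @ w - y) ** 2) for X, y in groups])

w = np.zeros(2)                            # player's parameters
lam = np.ones(len(groups)) / len(groups)   # adversary's mixture weights
eta_w, eta_lam = 0.05, 0.5                 # step sizes (chosen ad hoc)

for _ in range(500):
    # Player step: gradient descent on the lam-weighted risk.
    grad = sum(l * 2 * X.T @ (X @ w - y) / len(y)
               for l, (X, y) in zip(lam, groups))
    w -= eta_w * grad
    # Adversary step: multiplicative weights over per-group risks,
    # pushing mass toward the currently hardest sub-group.
    lam *= np.exp(eta_lam * group_risks(w))
    lam /= lam.sum()

worst_risk = group_risks(w).max()
```

The adversary's weight update is a standard no-regret rule over the probability simplex, which is what makes the "interpolation" setting (test distributions inside the convex hull of the sub-groups) naturally analyzable as an online game.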

Updated: 2021-03-01