Importance Weight Estimation and Generalization in Domain Adaptation under Label Shift
arXiv - CS - Machine Learning Pub Date : 2020-11-29 , DOI: arxiv-2011.14251
Kamyar Azizzadenesheli

We study generalization under label shift in domain adaptation where the learner has access to labeled samples from the source domain but unlabeled samples from the target domain. Prior works deploy label classifiers and introduce various methods to estimate the importance weights from source to target domains. They use these estimates in importance weighted empirical risk minimization to learn classifiers. In this work, we theoretically compare the prior approaches, relax their strong assumptions, and generalize them from requiring label classifiers to general functions. This latter generalization improves the conditioning on the inverse operator of the induced inverse problems by allowing for broader exploitation of the spectrum of the forward operator. The prior works in the study of label shifts are limited to categorical label spaces. In this work, we propose a series of methods to estimate the importance weight functions for arbitrary normed label spaces. We introduce a new operator learning approach between Hilbert spaces defined on labels (rather than covariates) and show that it induces a perturbed inverse problem of compact operators. We propose a novel approach to solve the inverse problem in the presence of perturbation. This analysis has its own independent interest since such problems commonly arise in partial differential equations and reinforcement learning. For both categorical and general normed spaces, we provide concentration bounds for the proposed estimators. Using the existing generalization analysis based on Rademacher complexity, Rényi divergence, and MDFR lemma in Azizzadenesheli et al. [2019], we show the generalization property of the importance weighted empirical risk minimization on the unseen target domain.
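
For readers new to the setting, the classifier-based pipeline that the abstract attributes to prior work can be sketched roughly as follows. This is a minimal illustration for categorical labels only, assuming a fixed black-box classifier and the standard label-shift identity q(ŷ = i) = Σ_j p(ŷ = i, y = j) w(j) with w(j) = q(y = j) / p(y = j). It is not the estimator proposed in the paper, and the function names are hypothetical.

```python
import numpy as np


def estimate_label_shift_weights(source_labels, source_preds, target_preds, num_classes):
    """Estimate w[j] = q(y=j) / p(y=j) by solving C w = mu_target.

    C[i, j] is the empirical source probability that the classifier predicts i
    while the true label is j; mu_target[i] is the empirical target probability
    that the classifier predicts i.
    """
    source_labels = np.asarray(source_labels, dtype=int)
    source_preds = np.asarray(source_preds, dtype=int)
    target_preds = np.asarray(target_preds, dtype=int)

    # Joint confusion matrix on labeled source samples.
    C = np.zeros((num_classes, num_classes))
    np.add.at(C, (source_preds, source_labels), 1.0)
    C /= len(source_labels)

    # Prediction frequencies on unlabeled target samples.
    mu_target = np.bincount(target_preds, minlength=num_classes) / len(target_preds)

    # Least squares tolerates a poorly conditioned C; weights cannot be negative.
    w, *_ = np.linalg.lstsq(C, mu_target, rcond=None)
    return np.clip(w, 0.0, None)


def weighted_empirical_risk(losses, labels, weights):
    """Importance weighted empirical risk: mean of w[y_i] * loss_i over source samples."""
    losses = np.asarray(losses, dtype=float)
    labels = np.asarray(labels, dtype=int)
    return float(np.mean(weights[labels] * losses))
```

How well this linear system can be inverted depends on the conditioning of C, which is the kind of inverse-problem conditioning the abstract says improves when classifiers are replaced by general functions.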

Updated: 2020-12-01