Moreau-Yosida $f$-divergences
arXiv - CS - Information Theory. Pub Date: 2021-02-26, DOI: arxiv-2102.13416
Dávid Terjék

Variational representations of $f$-divergences are central to many machine learning algorithms, with Lipschitz constrained variants recently gaining attention. Inspired by this, we generalize the so-called tight variational representation of $f$-divergences in the case of probability measures on compact metric spaces to be taken over the space of Lipschitz functions vanishing at an arbitrary base point, characterize functions achieving the supremum in the variational representation, propose a practical algorithm to calculate the tight convex conjugate of $f$-divergences compatible with automatic differentiation frameworks, define the Moreau-Yosida approximation of $f$-divergences with respect to the Wasserstein-$1$ metric, and derive the corresponding variational formulas, providing a generalization of a number of recent results, novel special cases of interest and a relaxation of the hard Lipschitz constraint. As an application of our theoretical results, we propose the Moreau-Yosida $f$-GAN, providing an implementation of the variational formulas for the Kullback-Leibler, reverse Kullback-Leibler, $\chi^2$, reverse $\chi^2$, squared Hellinger, Jensen-Shannon, Jeffreys, triangular discrimination and total variation divergences as GANs trained on CIFAR-10, leading to competitive results and a simple solution to the problem of uniqueness of the optimal critic.
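For context, the standard (non-tight) variational representation that the paper generalizes expresses an $f$-divergence through the convex conjugate $f^*$ of $f$,

$$D_f(\mu \| \nu) = \sup_{g} \left\{ \int g \, \mathrm{d}\mu - \int f^*(g) \, \mathrm{d}\nu \right\},$$

and a Moreau-Yosida (infimal-convolution) approximation with respect to the Wasserstein-$1$ metric takes the generic form

$$(D_f)_\lambda(\mu \| \nu) = \inf_{\mu'} \left\{ D_f(\mu' \| \nu) + \lambda \, W_1(\mu, \mu') \right\}.$$

These displays are only a background sketch: the paper's tight representation restricts the supremum to Lipschitz functions vanishing at a base point, and its exact conjugate and constants should be taken from the text.

Below is a minimal PyTorch sketch of how such a variational formula is typically turned into a critic objective inside an automatic differentiation framework, instantiated for the Kullback-Leibler case $f(t) = t \log t$, $f^*(s) = e^{s-1}$. The critic architecture, the toy Gaussian data, and the absence of any Lipschitz constraint or tight conjugate are illustrative assumptions, not the paper's Moreau-Yosida $f$-GAN implementation.

import torch
import torch.nn as nn

# Convex conjugate of f(t) = t*log(t) (KL divergence): f*(s) = exp(s - 1).
def f_star_kl(s):
    return torch.exp(s - 1.0)

class Critic(nn.Module):
    # Small MLP critic g; in the paper's setting g would additionally be
    # (softly) Lipschitz and vanish at a base point, which this sketch omits.
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

def kl_variational_lower_bound(critic, x_p, x_q):
    # E_p[g(x)] - E_q[f*(g(x))]: a lower bound on KL(p || q).
    return critic(x_p).mean() - f_star_kl(critic(x_q)).mean()

if __name__ == "__main__":
    # Toy usage: estimate KL between two Gaussians by maximizing the bound.
    dim = 2
    critic = Critic(dim)
    opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
    for _ in range(500):
        x_p = torch.randn(256, dim) + 1.0   # samples from p = N(1, I)
        x_q = torch.randn(256, dim)         # samples from q = N(0, I)
        loss = -kl_variational_lower_bound(critic, x_p, x_q)
        opt.zero_grad()
        loss.backward()
        opt.step()

In a GAN setting the same bound would serve as the critic's maximization target while the generator minimizes it; the paper replaces this standard objective with its tight, Moreau-Yosida-regularized counterpart.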

Updated: 2021-03-01