The Complexity of Constrained Min-Max Optimization
arXiv - CS - Computational Complexity Pub Date : 2020-09-21 , DOI: arxiv-2009.09623
Constantinos Daskalakis and Stratis Skoulakis and Manolis Zampetakis

Despite its important applications in Machine Learning, min-max optimization of nonconvex-nonconcave objectives remains elusive. Not only are there no known first-order methods converging even to approximate local min-max points, but the computational complexity of identifying them is also poorly understood. In this paper, we provide a characterization of the computational complexity of the problem, as well as of the limitations of first-order methods in constrained min-max optimization problems with nonconvex-nonconcave objectives and linear constraints. As a warm-up, we show that, even when the objective is a Lipschitz and smooth differentiable function, deciding whether a min-max point exists, in fact even deciding whether an approximate min-max point exists, is NP-hard. More importantly, we show that an approximate local min-max point of large enough approximation is guaranteed to exist, but finding one such point is PPAD-complete. The same is true of computing an approximate fixed point of Gradient Descent/Ascent. An important byproduct of our proof is to establish an unconditional hardness result in the Nemirovsky-Yudin model. We show that, given oracle access to some function $f : P \to [-1, 1]$ and its gradient $\nabla f$, where $P \subseteq [0, 1]^d$ is a known convex polytope, every algorithm that finds an $\varepsilon$-approximate local min-max point needs to make a number of queries that is exponential in at least one of $1/\varepsilon$, $L$, $G$, or $d$, where $L$ and $G$ are respectively the smoothness and Lipschitzness of $f$ and $d$ is the dimension. This comes in sharp contrast to minimization problems, where finding approximate local minima in the same setting can be done with Projected Gradient Descent using $O(L/\varepsilon)$ many queries. Our result is the first to show an exponential separation between these two fundamental optimization problems.
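To make the objects in the abstract concrete, here is a minimal sketch, not taken from the paper, of the projected Gradient Descent/Ascent update whose approximate fixed points the authors study. The toy objective f(x, y) = (x - 0.5)(y - 0.5) on the box [0, 1]^2 is bilinear, hence convex-concave and much simpler than the paper's nonconvex-nonconcave setting, but it already shows why GDA need not settle down: the iterates circle the stationary point (0.5, 0.5) rather than converging to it, and a point where the projected update barely moves is exactly an approximate fixed point of GDA.

```python
import numpy as np

def grad_f(x, y):
    # Toy bilinear objective f(x, y) = (x - 0.5) * (y - 0.5) on [0, 1]^2.
    # Convex-concave (simpler than the paper's nonconvex-nonconcave setting),
    # but GDA already cycles around (0.5, 0.5) instead of converging.
    return np.array([y - 0.5, x - 0.5])  # (df/dx, df/dy)

def projected_gda(z0, eta=0.1, steps=201):
    """One possible projected Gradient Descent/Ascent loop (illustrative only):
    descend in x, ascend in y, then project back onto the box [0, 1]^2.
    A point where this update barely moves is an approximate GDA fixed point,
    the kind of solution whose computation the paper studies."""
    z = np.array(z0, dtype=float)
    for t in range(steps):
        gx, gy = grad_f(*z)
        z_new = np.clip([z[0] - eta * gx, z[1] + eta * gy], 0.0, 1.0)
        move = np.linalg.norm(z_new - z)  # how far the projected update moved
        z = z_new
        if t % 50 == 0:
            print(f"step {t:3d}: z = ({z[0]:.3f}, {z[1]:.3f}), |update| = {move:.4f}")
    return z

if __name__ == "__main__":
    projected_gda([0.9, 0.1])  # starts away from the stationary point (0.5, 0.5)
```

Running the sketch shows the update norm staying bounded away from zero for this step size, in contrast to projected gradient descent on a pure minimization problem, where the same kind of loop drives the update norm to zero with roughly O(L/ε) iterations.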

Updated: 2020-09-22