Curriculum learning for multilevel budgeted combinatorial problems



arXiv - CS - Computer Science and Game Theory. Pub Date: 2020-07-07, DOI: arxiv-2007.03151
Adel Nabli, Margarida Carvalho

Learning heuristics for combinatorial optimization problems through graph neural networks has recently shown promising results on some classic NP-hard problems. These are single-level optimization problems with only one player. Multilevel combinatorial optimization problems generalize them, encompassing situations in which multiple players make decisions sequentially. By framing them in a multi-agent reinforcement learning setting, we devise a value-based method to learn to solve multilevel budgeted combinatorial problems involving two players in a zero-sum game over a graph. Our framework is based on a simple curriculum: if an agent knows how to estimate the value of instances with budgets up to $B$, then instances with budget $B+1$ can be solved in polynomial time, regardless of the direction of the optimization, by checking the value of every possible afterstate. Thus, in a bottom-up approach, we generate datasets of heuristically solved instances with increasingly larger budgets to train our agent. We report results close to optimality on graphs of up to $100$ nodes and a $185\times$ speedup on average compared to the quickest exact solver known for the Multilevel Critical Node problem, a max-min-max trilevel problem that has been shown to be at least $\Sigma_2^p$-hard.
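The curriculum step described in the abstract (use a value estimate for budgets up to $B$ to pick a move at budget $B+1$ by scoring every afterstate) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not the authors' code: the toy max-min "protect then delete" game, the names `afterstates`, `exact_value`, and `solve_one_more_unit`, and the use of an exact recursive solver as a stand-in for the trained value network.

```python
from functools import lru_cache

# Hypothetical toy instance: a small undirected graph.
NODES = (0, 1, 2, 3)
EDGES = ((0, 1), (1, 2), (2, 3), (1, 3))

def score(deleted):
    """Terminal payoff: number of edges with both endpoints still present."""
    return sum(1 for u, v in EDGES if u not in deleted and v not in deleted)

def afterstates(state):
    """All states reachable by spending one unit of the current player's budget.

    state = (protected, deleted, b_protect, b_delete). The protector
    (maximizer) moves first; the deleter (minimizer) moves once the
    protector's budget is exhausted.
    """
    protected, deleted, b_protect, b_delete = state
    free = [n for n in NODES if n not in protected and n not in deleted]
    if b_protect > 0:
        return [(protected | {n}, deleted, b_protect - 1, b_delete) for n in free]
    if b_delete > 0:
        return [(protected, deleted | {n}, 0, b_delete - 1) for n in free]
    return []

def maximizing(state):
    """The protector (maximizer) is to move while it still has budget."""
    return state[2] > 0

@lru_cache(maxsize=None)
def exact_value(state):
    """Exhaustive game value; stands in for a learned value estimator."""
    nxt = afterstates(state)
    if not nxt:
        return score(state[1])
    pick = max if maximizing(state) else min
    return pick(exact_value(s) for s in nxt)

def solve_one_more_unit(state, value_fn):
    """One-step lookahead: score each afterstate with value_fn, then take
    the max or the min depending on which player is to move."""
    pick = max if maximizing(state) else min
    return pick(afterstates(state), key=value_fn)

# Total budget 2: each afterstate has total budget 1, which value_fn covers.
start = (frozenset(), frozenset(), 1, 1)
best = solve_one_more_unit(start, exact_value)
print(sorted(best[0]), exact_value(best))  # protected nodes and resulting value
```

In this toy game each state has at most one afterstate per free node, so a lookahead costs a number of value evaluations linear in the graph size. Repeating the step for budgets $1, 2, \dots$ and recording the chosen values would give the kind of bottom-up training targets the abstract mentions, with a learned estimator replacing `exact_value`.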


Updated: 2020-10-27
