A multi-objective perspective on jointly tuning hardware and hyperparameters
arXiv - CS - Machine Learning. Pub Date: 2021-06-10, DOI: arXiv-2106.05680
David Salinas, Valerio Perrone, Olivier Cruchant, Cedric Archambeau

In addition to the best model architecture and hyperparameters, a full AutoML solution requires selecting appropriate hardware automatically. This can be framed as a multi-objective optimization problem: there is not a single best hardware configuration but a set of optimal ones achieving different trade-offs between cost and runtime. In practice, some choices may be overly costly or take days to train. To lift this burden, we adopt a multi-objective approach that selects and adapts the hardware configuration automatically alongside neural architectures and their hyperparameters. Our method builds on Hyperband and extends it in two ways. First, we replace the stopping rule used in Hyperband by a non-dominated sorting rule to preemptively stop unpromising configurations. Second, we leverage hyperparameter evaluations from related tasks via transfer learning by building a probabilistic estimate of the Pareto front that finds promising configurations more efficiently than random search. We show in extensive NAS and HPO experiments that both ingredients bring significant speed-ups and cost savings, with little to no impact on accuracy. In three benchmarks where hardware is selected in addition to hyperparameters, we obtain runtime and cost reductions of at least 5.8x and 8.8x, respectively. Furthermore, when applying our multi-objective method to the tuning of hyperparameters only, we obtain a 10% improvement in runtime while maintaining the same accuracy on two popular NAS benchmarks.
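The non-dominated sorting rule mentioned in the abstract rests on the standard notion of Pareto dominance: a configuration is kept only if no other configuration is at least as good on every objective and strictly better on at least one. The sketch below illustrates that core idea in Python; the `(cost, runtime)` values and function names are illustrative, not taken from the paper.

```python
def dominates(a, b):
    """True if objective vector a is no worse than b everywhere
    and strictly better on at least one objective (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical (cost, runtime) evaluations of four configurations.
configs = [(1.0, 9.0), (2.0, 4.0), (3.0, 5.0), (5.0, 1.0)]
print(pareto_front(configs))  # (3.0, 5.0) is dominated by (2.0, 4.0)
```

In a Hyperband-style scheduler, a configuration that falls outside the current front at a checkpoint would be a candidate for early stopping, which is the role the paper's non-dominated sorting rule plays.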

Updated: 2021-06-11