Balanced One-shot Neural Architecture Optimization
arXiv - CS - Neural and Evolutionary Computing | Pub Date: 2019-09-24 | arXiv: 1909.10815
Renqian Luo, Tao Qin, Enhong Chen

The ability to rank candidate architectures is key to the performance of neural architecture search (NAS). One-shot NAS was proposed to reduce the expense of NAS, but it performs worse than conventional NAS and is not adequately stable. We investigate this and find that the ranking correlation between architectures under one-shot training and the same architectures under stand-alone full training is poor, which misleads the algorithm away from better architectures. Further, we show that under the current one-shot method the training of architectures of different sizes is imbalanced, which makes the evaluated performance of an architecture less predictive of its ground-truth performance and heavily degrades the ranking correlation. Consequently, we propose Balanced NAO, which introduces balanced training of the supernet during the search procedure: architectures are sampled in proportion to their model sizes, encouraging more updates for large architectures than for small ones. Comprehensive experiments verify that the proposed method is effective and robust, leading to a more stable search. The final discovered architecture shows significant improvements over the baselines, with a test error rate of 2.60% on CIFAR-10 and top-1 accuracy of 74.4% on ImageNet under the mobile setting. Code and model checkpoints are publicly available at github.com/renqianluo/NAO_pytorch.
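The core idea is simple to state: instead of sampling candidate sub-networks uniformly when training the supernet, sample them with probability proportional to their parameter counts, so larger architectures receive more weight updates. Below is a minimal sketch of that sampling loop, assuming a candidate pool with known parameter counts; the names (`candidates`, `param_counts`, `train_step`) are illustrative and not the paper's actual API, for which see github.com/renqianluo/NAO_pytorch.

```python
import random

def sample_balanced(candidates, param_counts):
    """Sample one architecture with probability proportional to its size,
    so larger architectures get more shared-weight updates than small ones.
    (Illustrative sketch of the balancing idea, not the paper's code.)"""
    total = sum(param_counts)
    weights = [p / total for p in param_counts]
    return random.choices(candidates, weights=weights, k=1)[0]

def train_supernet(supernet, candidates, param_counts, steps, train_step):
    """Supernet training loop with size-proportional architecture sampling."""
    for _ in range(steps):
        arch = sample_balanced(candidates, param_counts)
        # One shared-weight update on the sub-network selected by `arch`.
        # Uniform sampling here is what the paper identifies as imbalanced:
        # large architectures would then be under-trained relative to their
        # stand-alone training, distorting the one-shot ranking.
        train_step(supernet, arch)
```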

Updated: 2020-04-01