EAT-NAS: elastic architecture transfer for accelerating large-scale neural architecture search
Science China Information Sciences (IF 8.8), Pub Date: 2021-08-06, DOI: 10.1007/s11432-020-3112-8
Jiemin Fang, Wenyu Liu, Xinggang Wang, Qian Zhang, Chang Huang, Yukang Chen, Xinbang Zhang, Gaofeng Meng

Neural architecture search (NAS) methods have been proposed to relieve human experts from tedious architecture engineering. However, most current methods are constrained to small-scale search owing to their huge computational resource consumption. Meanwhile, directly applying architectures searched on small datasets to large datasets often carries no performance guarantee because of the discrepancy between the datasets. This limitation impedes the wide use of NAS on large-scale tasks. To overcome this obstacle, we propose an elastic architecture transfer mechanism for accelerating large-scale NAS (EAT-NAS). In our implementation, architectures are first searched on a small dataset, e.g., CIFAR-10, and the best one is chosen as the basic architecture. The search process on a large dataset, e.g., ImageNet, is then initialized with the basic architecture as the seed, which accelerates the large-scale search. We propose not only a NAS method but also a mechanism for architecture-level transfer learning. In our experiments, we obtain two final models, EATNet-A and EATNet-B, which achieve competitive accuracies of 75.5% and 75.6%, respectively, on ImageNet. Both models also surpass models searched from scratch on ImageNet under the same settings. In terms of computational cost, EAT-NAS takes fewer than 5 days on 8 TITAN X GPUs, significantly less than the consumption of state-of-the-art large-scale NAS methods.
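As a rough illustration of the two-stage idea described in the abstract, the sketch below assumes a simple population-based (evolutionary) search, a toy encoding of one operation id per layer, and placeholder evaluation functions eval_small and eval_large. All of these names and details are assumptions made for illustration only; they are not the authors' actual encoding, mutation operator, or search procedure.

# Minimal, hypothetical sketch: search on a small proxy dataset first, then reuse
# the best architecture as the seed of the search on the large dataset.
import random
from typing import Callable, List

Architecture = List[int]  # toy encoding: one operation id per layer

def random_architecture(num_layers: int = 8, num_ops: int = 5) -> Architecture:
    return [random.randrange(num_ops) for _ in range(num_layers)]

def mutate(arch: Architecture, num_ops: int = 5, prob: float = 0.2) -> Architecture:
    # Perturb each gene with a small probability (an assumed "elastic" perturbation).
    return [random.randrange(num_ops) if random.random() < prob else op for op in arch]

def evolutionary_search(evaluate: Callable[[Architecture], float],
                        init_population: List[Architecture],
                        generations: int = 10,
                        population_size: int = 20) -> Architecture:
    population = list(init_population)
    for _ in range(generations):
        scored = sorted(population, key=evaluate, reverse=True)
        parents = scored[: population_size // 2]
        children = [mutate(p) for p in parents]
        population = parents + children
    return max(population, key=evaluate)

def eat_nas(eval_small: Callable[[Architecture], float],
            eval_large: Callable[[Architecture], float]) -> Architecture:
    # Stage 1: search on the small dataset (e.g., CIFAR-10) from a random population.
    init = [random_architecture() for _ in range(20)]
    basic_arch = evolutionary_search(eval_small, init)
    # Stage 2: initialize the large-scale search (e.g., ImageNet) with the basic
    # architecture as the seed, so the search starts near a known-good region.
    seeded = [basic_arch] + [mutate(basic_arch) for _ in range(19)]
    return evolutionary_search(eval_large, seeded)

if __name__ == "__main__":
    # Dummy proxy objectives stand in for training/validating real networks.
    target = random_architecture()
    score = lambda a: -sum(abs(x - y) for x, y in zip(a, target))
    print(eat_nas(score, score))

In a real setting the two evaluation functions would train and validate candidate networks on the small and large datasets respectively; the point of the sketch is only the seeding step, which starts the expensive large-scale search from the architecture found cheaply on the proxy dataset.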




Updated: 2021-08-12