DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6). Pub Date: 2020-08-31. DOI: 10.1109/tpami.2020.3020315
Xinbang Zhang, Jianlong Chang, Yiwen Guo, Gaofeng Meng, Zhouchen Lin, Shiming Xiang, Chunhong Pan

Neural architecture search (NAS) is inherently subject to a gap between the architectures used during searching and those used during validating. To bridge this gap effectively, we develop Differentiable ArchiTecture Approximation (DATA) with an Ensemble Gumbel-Softmax (EGS) estimator and an Architecture Distribution Constraint (ADC) to automatically approximate architectures during searching and validating in a differentiable manner. Technically, the EGS estimator consists of a group of Gumbel-Softmax estimators, which converts probability vectors into binary codes while still passing gradients backward, reducing the estimation bias in a differentiable way. To further narrow the distribution gap between sampled architectures and the supernet, the ADC is introduced to reduce the variance of sampling during searching. Benefiting from such modeling, architecture probabilities and network weights in the NAS model can be jointly optimized with standard back-propagation, yielding an end-to-end learning mechanism for searching deep neural architectures in an extended search space. Consequently, in the validating process, a high-performance architecture that closely approximates the one learned during searching is readily built. Extensive experiments on various tasks, including image classification, few-shot learning, unsupervised clustering, semantic segmentation, and language modeling, demonstrate that DATA discovers high-performance architectures while guaranteeing the required efficiency. Code is available at https://github.com/XinbangZhang/DATA-NAS.
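To make the core mechanism concrete, the sketch below illustrates the general Gumbel-Softmax sampling idea the abstract describes: drawing relaxed categorical samples, averaging an ensemble of them, and hardening the result into a binary (one-hot) code. This is a minimal NumPy illustration of the standard trick, not the authors' implementation; all function names and parameters (`k`, `tau`) are hypothetical, and in a real framework the soft sample would carry gradients via the straight-through estimator.

```python
import numpy as np

def gumbel_softmax_sample(logits, tau, rng):
    """One relaxed sample from a categorical distribution via the
    Gumbel-Softmax trick: perturb logits with Gumbel noise, then
    apply a temperature-scaled softmax."""
    u = rng.uniform(1e-10, 1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))          # Gumbel(0, 1) noise
    y = (logits + gumbel) / tau
    y = np.exp(y - y.max())               # numerically stable softmax
    return y / y.sum()

def ensemble_gumbel_softmax(logits, k=5, tau=1.0, seed=0):
    """Average k independent Gumbel-Softmax samples (the 'ensemble'),
    then harden the average into a one-hot binary code. In a NAS
    framework the soft average is what gradients flow through."""
    rng = np.random.default_rng(seed)
    soft = np.mean(
        [gumbel_softmax_sample(logits, tau, rng) for _ in range(k)],
        axis=0,
    )
    hard = np.zeros_like(soft)
    hard[np.argmax(soft)] = 1.0           # binary architecture code
    return hard, soft

# Hypothetical usage: three candidate operations on one edge.
logits = np.log(np.array([0.2, 0.5, 0.3]))
hard, soft = ensemble_gumbel_softmax(logits, k=8, tau=0.5)
```

Lower temperatures `tau` make each sample closer to a discrete one-hot vector, while the ensemble average reduces the variance of any single draw, which is in the spirit of the variance reduction the paper attributes to its sampling scheme.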

Updated: 2020-08-31