SWANN: Small-World Architecture for Fast Convergence of Neural Networks
IEEE Journal on Emerging and Selected Topics in Circuits and Systems (IF 3.7) Pub Date: 2021-11-04, DOI: 10.1109/jetcas.2021.3125309
Mojan Javaheripi, Bita Darvish Rouhani, Farinaz Koushanfar

On-device intelligence has become increasingly widespread in the modern smart application landscape. A standing challenge for the applicability of on-device intelligence is the excessively high computation cost of training highly accurate Deep Learning (DL) models. These models require a large number of training iterations to reach a high convergence accuracy, hindering their applicability to resource-constrained embedded devices. This paper proposes a novel transformation that changes the topology of the DL architecture to reach an optimal cross-layer connectivity. This, in turn, significantly reduces the number of training iterations required for reaching a target accuracy. Our transformation leverages the important observation that for a set level of accuracy, convergence is fastest when network topology reaches the boundary of a Small-World Network. Small-world graphs are known to possess a specific connectivity structure that enables enhanced signal propagation among nodes. Our small-world models, called SWANNs, provide several intriguing benefits: they facilitate data (gradient) flow within the network, enable feature-map reuse by adding long-range connections, and accommodate various network architectures/datasets. Compared to densely connected networks (e.g., DenseNets), SWANNs require substantially fewer training parameters while maintaining a similar level of classification accuracy. We evaluate our networks on various DL model architectures and image classification datasets, namely, MNIST, CIFAR10, CIFAR100, and ImageNet. Our experiments demonstrate an average of ≈2.1× improvement in convergence speed to the desired accuracy.
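The abstract frames convergence speed in terms of where a layer-connectivity topology sits relative to the small-world regime, characterized by high clustering together with short average path lengths. The Python sketch below is a rough, self-contained illustration of that regime (not the authors' actual SWANN transformation): it models a sequential architecture's connectivity as a ring lattice of layers, applies Watts-Strogatz-style rewiring to introduce long-range "skip" connections, and reports the two standard small-world indicators. All function and parameter names here are hypothetical.

```python
# Illustrative sketch of the small-world regime (not the paper's algorithm).
# Requires networkx: pip install networkx
import networkx as nx


def smallworld_stats(n_layers=20, k=4, rewire_p=0.1, seed=0):
    """Build a connected Watts-Strogatz graph over `n_layers` nodes and
    return (average clustering coefficient, average shortest path length).

    `k` is the number of nearest neighbors each node starts with;
    `rewire_p` is the probability of rewiring each edge to a random,
    possibly long-range, target.
    """
    g = nx.connected_watts_strogatz_graph(n_layers, k, rewire_p, tries=100, seed=seed)
    return nx.average_clustering(g), nx.average_shortest_path_length(g)


if __name__ == "__main__":
    # Sweep the rewiring probability. The small-world regime is where the
    # average path length has already collapsed (good signal/gradient
    # propagation) while clustering remains high (local feature reuse).
    for p in [0.0, 0.01, 0.05, 0.1, 0.5, 1.0]:
        c, l = smallworld_stats(rewire_p=p)
        print(f"p={p:<5} clustering={c:.3f}  avg_path_len={l:.3f}")
```

Running the sweep shows the classic Watts-Strogatz pattern: at p=0 the graph is a regular lattice (high clustering, long paths), at p=1 it is essentially random (low clustering, short paths), and at small intermediate p both properties coexist, which is the regime the abstract identifies as fastest for convergence.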

Updated: 2021-11-04