Effects of hidden layer sizing on CNN fine-tuning
Future Generation Computer Systems (IF 6.2) Pub Date: 2020-12-28, DOI: 10.1016/j.future.2020.12.020
Stefano Marrone , Cristina Papa , Carlo Sansone

Some applications are resilient, meaning that they are robust to noise in their data (e.g., noise due to errors). This characteristic is very useful when an approximate computation allows the task to be performed in less time, or allows the algorithm to be deployed on embedded hardware. Deep learning, thanks to its impressive generalization ability, is one of the fields that can benefit from approximate computing to reduce its high number of parameters. A common approach is to prune some neurons and perform an iterative re-training, with the aim of both reducing the required memory and speeding up the inference stage. In this work we approach CNN size reduction from a different perspective: instead of pruning the network weights or searching for an approximated network very close to the Pareto frontier, we investigate whether it is possible to remove some neurons from the fully connected layers only, before the network is trained, without substantially affecting the network's performance. As a case study, we focus on “fine-tuning”, a branch of transfer learning that has proved effective especially in domains lacking effective expert-designed features. To further compact the network, we apply weight quantization to the convolutional kernels. Results show that some layers can be tailored to reduce the network size, both in the number of parameters to learn and in the required memory, without a statistically significant effect on performance and without the need for any additional training. Finally, we investigate to what extent this sizing operation affects the network's robustness against adversarial perturbations, a family of techniques aimed at misleading deep neural networks.



Updated: 2021-01-06