A revisit to MacKay algorithm and its application to deep network compression
Frontiers of Computer Science ( IF 4.2 ) Pub Date : 2020-01-03 , DOI: 10.1007/s11704-019-8390-z
Chune Li , Yongyi Mao , Richong Zhang , Jinpeng Huai

An iterative procedure introduced in MacKay’s evidence framework is often used for estimating the hyperparameter in empirical Bayes. Together with a particular form of prior, the estimation of the hyperparameter reduces to an automatic relevance determination (ARD) model, which provides a soft way of pruning model parameters. Despite the effectiveness of this estimation procedure, it has remained primarily a heuristic to date, and its application to deep neural networks has not yet been explored. This paper formally investigates the mathematical nature of this procedure and justifies it as a well-principled algorithmic framework, which we call the MacKay algorithm. As an application, we demonstrate its use in deep neural networks, which typically have complicated structures with millions of parameters and can be pruned to reduce memory requirements and boost computational efficiency. In experiments, we adopt the MacKay algorithm to prune the parameters of simple networks such as LeNet, deep convolutional VGG-like networks, and residual networks on large-scale image classification tasks. Experimental results show that the algorithm can compress neural networks to a high level of sparsity with little loss of prediction accuracy, comparable with the state of the art.
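The iterative re-estimation the abstract refers to is easiest to illustrate in the setting where MacKay introduced it: Bayesian linear regression with an ARD prior, i.e., one precision hyperparameter α_i per weight. The sketch below is a didactic illustration of those classic evidence-framework updates, not the paper's deep-network procedure; all function and variable names are ours.

```python
import numpy as np

def mackay_ard(Phi, y, n_iter=100, alpha_cap=1e6):
    """MacKay evidence-framework updates with an ARD prior for
    Bayesian linear regression: y ~ N(Phi @ w, 1/beta) with
    w_i ~ N(0, 1/alpha_i).  A weight whose precision alpha_i
    diverges is effectively pruned ("soft pruning")."""
    N, D = Phi.shape
    alpha = np.ones(D)  # per-weight prior precisions
    beta = 1.0          # noise precision
    for _ in range(n_iter):
        # Posterior over weights given current hyperparameters
        Sigma = np.linalg.inv(beta * Phi.T @ Phi + np.diag(alpha))
        m = beta * Sigma @ Phi.T @ y
        # gamma_i = 1 - alpha_i * Sigma_ii: how "well determined"
        # weight i is by the data
        gamma = 1.0 - alpha * np.diag(Sigma)
        # MacKay's re-estimation formulas (alpha capped for stability)
        alpha = np.minimum(gamma / (m**2 + 1e-12), alpha_cap)
        resid = y - Phi @ m
        beta = (N - gamma.sum()) / (resid @ resid + 1e-12)
    return m, alpha

# Toy data: only features 0 and 3 carry signal
rng = np.random.default_rng(0)
Phi = rng.normal(size=(200, 8))
w_true = np.zeros(8)
w_true[0], w_true[3] = 2.0, -1.5
y = Phi @ w_true + 0.1 * rng.normal(size=200)

m, alpha = mackay_ard(Phi, y)
relevant = alpha < 10.0  # small precision -> weight retained
```

Running this, the two informative weights keep small precisions and accurate posterior means, while the precisions of the uninformative weights are driven up by orders of magnitude, shrinking those weights toward zero. Applied to a network, each weight receives such a precision, and weights whose precisions diverge are removed.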
