A Decomposable Winograd Method for N–D Convolution Acceleration in Video Analysis
International Journal of Computer Vision (IF 19.5), Pub Date: 2021-08-04, DOI: 10.1007/s11263-021-01500-9
Di Huang 1,2,3, Rui Zhang 1,3, Xishan Zhang 1,3, Xianzhuo Wang 1,2,3, Pengwei Jin 1,2,3, Yunji Chen 1,2, Fan Wu 2,3,4, Ling Li 4, Shaoli Liu 3

Winograd’s minimal filtering algorithm has been widely used in 2-D Convolutional Neural Networks (CNNs) to reduce the number of multiplications for faster processing. However, it is only effective on convolutions with kernel size 3 and stride 1, because it suffers from significantly increased FLOPs and numerical accuracy problems when the kernel size is larger than 3, and it fails on convolutions with stride larger than 1. Worse, extending it to N–D convolution intensifies the numerical accuracy problem. These problems severely obstruct the application of Winograd’s minimal filtering algorithm to video analysis. In this paper, we propose a novel Decomposable Winograd Method (DWM) for N–D convolution acceleration, which breaks through the limitations of the original Winograd’s minimal filtering algorithm and extends it to more general convolutions. DWM decomposes kernels with large size or stride larger than 1 into several small kernels with stride 1, to which the Winograd algorithm can then be applied, so that DWM reduces the number of multiplications while keeping the numerical accuracy. This enables the fast exploration of larger kernel sizes, larger stride values, and higher dimensions in CNNs for high performance and accuracy, and even opens the potential for new CNNs. Compared with the original Winograd algorithm, the proposed DWM is able to support all kinds of N–D convolutions with a speedup of \(1.44\times \) to \(3.38\times \), without affecting the numerical accuracy.
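The decomposition idea behind DWM can be illustrated in 1-D. The NumPy sketch below (not the authors' implementation; the function names are made up for illustration) rewrites a stride-2 convolution as two stride-1 convolutions over the even/odd phases of the input and kernel, and a kernel of size 5 as a size-3 piece plus a size-2 piece. Each resulting stride-1 piece is exactly the form accepted by the standard Winograd F(m, r) transform; here the pieces are computed directly for clarity, whereas DWM would apply the Winograd transform to each of them.

```python
import numpy as np

def direct_conv1d(x, w, stride=1):
    """Direct 1-D correlation ('valid' convolution as used in CNNs)."""
    K = len(w)
    out_len = (len(x) - K) // stride + 1
    return np.array([np.dot(x[i * stride:i * stride + K], w)
                     for i in range(out_len)])

def decompose_stride2(x, w):
    """Stride-2 convolution rewritten as two stride-1 convolutions.

    Splitting input and kernel into even/odd phases gives
    y = conv(x_even, w_even, 1) + conv(x_odd, w_odd, 1),
    so each piece fits the standard Winograd F(m, r) transform.
    """
    y_even = direct_conv1d(x[0::2], w[0::2])
    y_odd = direct_conv1d(x[1::2], w[1::2])
    n = min(len(y_even), len(y_odd))
    return y_even[:n] + y_odd[:n]

def decompose_large_kernel(x, w, split=3):
    """Kernel of size > 3 rewritten as a size-`split` piece plus the rest.

    y = conv(x, w[:split], 1) + conv(x[split:], w[split:], 1) on the common
    valid range; each small stride-1 piece again fits Winograd F(m, 3)/F(m, 2).
    """
    y1 = direct_conv1d(x, w[:split])
    y2 = direct_conv1d(x[split:], w[split:])
    n = min(len(y1), len(y2))
    return y1[:n] + y2[:n]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(16)

    w6 = rng.standard_normal(6)   # kernel size 6, convolved with stride 2
    print(np.allclose(direct_conv1d(x, w6, stride=2),
                      decompose_stride2(x, w6)))        # True: decomposition is exact

    w5 = rng.standard_normal(5)   # kernel size 5, stride 1
    print(np.allclose(direct_conv1d(x, w5),
                      decompose_large_kernel(x, w5)))   # True: decomposition is exact
```

Because the splits are exact algebraic rewrites rather than approximations, the multiplication savings of Winograd are obtained on each small piece without the numerical accuracy loss that large-tile Winograd transforms incur.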




Updated: 2021-08-04