ACM Computing Surveys (IF 16.6) | Pub Date: 2023-07-13 | DOI: 10.1145/3587095
JunKyu Lee, Lev Mukhanov, Amir Sabbagh Molahosseini, Umar Minhas, Yang Hua, Jesus Martinez del Rincon, Kiril Dichev, Cheol-Ho Hong, Hans Vandierendonck
Convolutional neural networks (CNNs) are used in our daily lives, including in self-driving cars, virtual assistants, social network services, healthcare services, and face recognition, among other applications. However, deep CNNs demand substantial compute resources during training and inference. The machine learning community has mainly focused on model-level optimizations, such as architectural compression of CNNs, whereas the systems community has focused on implementation-level optimizations. In between, the arithmetic community has proposed various arithmetic-level optimization techniques. This article surveys resource-efficient CNN techniques at the model, arithmetic, and implementation levels, and identifies the research gaps in resource-efficient CNN techniques across these three levels. Our survey clarifies the influence of higher-level techniques on lower-level ones based on our definition of a resource-efficiency metric, and discusses future trends in resource-efficient CNN research.
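The arithmetic-level techniques the survey refers to include reduced-precision number formats. As a hypothetical illustration (not taken from the paper), the sketch below shows symmetric post-training int8 quantization of a weight tensor in NumPy: weights are mapped to 8-bit integers with a single per-tensor scale, cutting storage 4x relative to float32 at the cost of a bounded rounding error.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: float32 weights -> int8 plus a scale."""
    scale = np.max(np.abs(w)) / 127.0          # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from the int8 representation."""
    return q.astype(np.float32) * scale

# Toy "weight tensor" standing in for a CNN layer's parameters.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
err = np.max(np.abs(w - w_hat))                # worst-case error is 0.5 * scale
```

In practice, frameworks refine this basic scheme with per-channel scales and calibration data; the sketch only conveys the core precision/resource trade-off.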
Title: Resource-Efficient Convolutional Networks: A Survey on Model-, Arithmetic-, and Implementation-Level Techniques