Efficient convolutional neural network with multi-kernel enhancement features for real-time facial expression recognition
Journal of Real-Time Image Processing (IF 2.9) Pub Date: 2021-03-20, DOI: 10.1007/s11554-021-01088-w
Minze Li, Xiaoxia Li, Wei Sun, Xueyuan Wang, Shunli Wang

Facial expressions are the most direct external manifestation of personal emotions. Unlike other pattern recognition problems, facial expression recognition must distinguish classes whose features differ only slightly. General methods either struggle to characterize these feature differences effectively or have too many parameters for real-time processing. This paper proposes a lightweight mobile architecture and a multi-kernel feature facial expression recognition network that balances speed and accuracy in real-time facial expression recognition. First, a multi-kernel convolution block is designed by using three depthwise separable convolution kernels of different sizes in parallel. The small and the large kernels extract local details and edge contour information of facial expressions, respectively. The multi-channel information is then fused to obtain multi-kernel enhancement features that better describe the differences between facial expressions. Second, a "Channel Split" operation is performed on the input of the multi-kernel convolution block, which avoids repeated extraction of invalid information and reduces the number of parameters to one-third of the original. Finally, a lightweight multi-kernel feature expression recognition network is designed by alternating multi-kernel convolution blocks with depthwise separable convolutions to further improve feature representation ability. Experimental results show that the proposed network achieves accuracies of 73.3% and 99.5% on the FER-2013 and CK+ datasets, respectively. Furthermore, it achieves a speed of 78 frames per second on 640 × 480 video. It outperforms other state-of-the-art methods in both speed and accuracy.
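
The "one-third" parameter figure for the Channel Split operation can be checked with simple arithmetic. The sketch below is not the authors' code: it assumes kernel sizes of 3, 5 and 7 (the abstract says only that the three sizes differ), 96 input and output channels, no bias terms, and that each branch keeps the same number of output channels with or without the split. Under those assumptions, giving each branch one-third of the input channels divides both the depthwise and the pointwise weight counts by three.

```python
def dw_separable_params(in_channels, out_channels, kernel_size):
    """Weight count of one depthwise separable convolution (biases ignored)."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def multi_kernel_block_params(in_channels, out_channels,
                              kernel_sizes=(3, 5, 7), channel_split=True):
    """Weight count of a block with one parallel branch per kernel size.

    kernel_sizes is an assumed choice; the abstract does not list the sizes.
    With channel_split=True each branch sees only its own slice of the input.
    """
    branches = len(kernel_sizes)
    if channel_split:
        if in_channels % branches != 0:
            raise ValueError("in_channels must be divisible by the number of branches")
        branch_in = in_channels // branches
    else:
        branch_in = in_channels
    return sum(dw_separable_params(branch_in, out_channels, k) for k in kernel_sizes)


# Hypothetical channel width chosen only for illustration.
full = multi_kernel_block_params(96, 96, channel_split=False)
split = multi_kernel_block_params(96, 96, channel_split=True)
print(full, split, split / full)
```

With these assumed settings the split block has exactly one-third of the weights of the unsplit block. The ratio would differ if the branches also reduced their output channels, which the abstract does not specify.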




Updated: 2021-03-21