Abstract
Deep convolutional neural networks (DCNNs) have achieved outstanding results in facial expression recognition (FER). However, their runtime memory and computational requirements make them challenging to deploy on resource-constrained platforms such as mobile devices. In this paper, we propose a novel lightweight attention DCNN (LA-Net) for robust FER, which uses squeeze-and-excitation (SE) modules and the network slimming strategy. First, we combine the SE modules with the CNN; the SE modules assign a weight to each feature channel. This enables LA-Net to focus on learning the prominent facial features, reduce redundant information, and extract discriminative features from facial images. Then, we use the network slimming method to further reduce the model’s size, which results in a thin and compact network that uses less runtime memory and fewer computational operations with minimal accuracy loss. The proposed LA-Net model achieves 95.52%, 87.00%, and 100% test accuracy on the KDEF, RAF-DB, and FERG-DB datasets, respectively. The experimental results show that the proposed method achieves results better than or comparable to those of state-of-the-art FER methods while significantly reducing the computational cost and the number of parameters, with better generalization capability and robustness.
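The two ingredients named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the LA-Net implementation: the function names, the omission of biases, and the toy shapes are assumptions made for illustration. The channel-selection rule follows the general idea of the original network slimming method (ranking channels by the magnitude of their batch-normalization scaling factors), which may differ in detail from the configuration used in the paper.

```python
import math


def se_channel_weights(feature_maps, w1, w2):
    """Squeeze-and-excitation gate for one sample (biases omitted: an assumption).

    feature_maps: list of C channels, each an H x W nested list of floats.
    w1: R x C weights of the reduction layer.
    w2: C x R weights of the expansion layer.
    Returns one weight in (0, 1) per channel.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_maps]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def se_rescale(feature_maps, weights):
    """Scale every value of channel c by its gate weights[c]."""
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feature_maps, weights)]


def slimming_masks(bn_gammas, prune_ratio):
    """Pick channels to keep using one global threshold on |gamma|.

    bn_gammas: per-layer lists of batch-norm scaling factors (hypothetical input).
    prune_ratio: fraction of all channels to drop, in [0, 1).
    Ties at the threshold are pruned too, and a layer may lose all of its
    channels; a practical implementation would guard against that.
    """
    magnitudes = sorted(abs(g) for layer in bn_gammas for g in layer)
    k = int(len(magnitudes) * prune_ratio)
    if k == 0:
        return [[True] * len(layer) for layer in bn_gammas]
    threshold = magnitudes[k - 1]  # largest magnitude that gets pruned
    return [[abs(g) > threshold for g in layer] for layer in bn_gammas]


# Toy usage: two 2x2 channels, reduction to a single hidden unit.
maps = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
gates = se_channel_weights(maps, w1=[[1.0, 1.0]], w2=[[0.0], [1.0]])
reweighted = se_rescale(maps, gates)
masks = slimming_masks([[0.9, 0.01, 0.5], [0.02, 0.7]], prune_ratio=0.4)
```

In this sketch the SE gate only rescales channels, whereas the slimming mask removes them outright; after pruning, the remaining network would normally be fine-tuned to recover accuracy.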
Additional information
This work was supported by Sichuan Provincial Science and Technology Projects (2019JDJQ0023).
About this article
Cite this article
Ma, H., Celik, T. & Li, HC. Lightweight attention convolutional neural network through network slimming for robust facial expression recognition. SIViP 15, 1507–1515 (2021). https://doi.org/10.1007/s11760-021-01883-9
DOI: https://doi.org/10.1007/s11760-021-01883-9