Lightweight attention convolutional neural network through network slimming for robust facial expression recognition

  • Original Paper
Signal, Image and Video Processing

Abstract

Deep convolutional neural networks (DCNNs) have achieved outstanding results in facial expression recognition (FER). However, their runtime memory and computational requirements make them difficult to deploy on resource-constrained platforms such as mobile devices. In this paper, we propose a novel lightweight attention DCNN (LA-Net) for robust FER, which uses squeeze-and-excitation (SE) modules and the network slimming strategy. First, we combine SE modules with the CNN; the SE modules assign a weight to each feature channel. This enables LA-Net to focus on learning the prominent facial features, reduce redundant information, and extract discriminative features from facial images. Then, we use the network slimming method to further reduce the model’s size, which results in a thin and compact network that uses less runtime memory and fewer computational operations with minimal accuracy loss. The proposed LA-Net model achieves 95.52%, 87.00% and 100% test accuracy on the KDEF, RAF-DB and FERG-DB FER datasets, respectively. The experimental results show that the proposed method achieves results better than or comparable to those of state-of-the-art FER methods while significantly reducing the computational cost and the number of parameters, with better generalization capability and robustness.
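
The two ingredients named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: it follows the general description of SE modules (squeeze by global average pooling, excitation by a small bottleneck with a sigmoid, then channel-wise scaling) and of network slimming (ranking channels by the magnitude of their batch-normalization scaling factors and pruning the smallest). The function names `se_reweight` and `slimming_mask`, the weight shapes, and the omission of biases, sparsity training, and fine-tuning are all assumptions made for illustration.

```python
import math

def se_reweight(feature_maps, w1, w2):
    """Apply a squeeze-and-excitation step to C channels (each an HxW nested list).

    Hypothetical shapes: w1 is (C/r, C) and w2 is (C, C/r); biases are omitted.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_maps]
    # Excitation: bottleneck layer with ReLU, then a layer with sigmoid,
    # producing one weight in (0, 1) per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    weights = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
               for row in w2]
    # Scale: multiply every value in a channel by that channel's weight.
    scaled = [[[v * a for v in row] for row in ch]
              for ch, a in zip(feature_maps, weights)]
    return scaled, weights

def slimming_mask(bn_gammas, prune_ratio):
    """Return a keep-mask over channels from batch-norm scaling factors.

    Channels whose |gamma| falls at or below the global threshold are pruned;
    ties at the threshold may prune slightly more than prune_ratio.
    """
    magnitudes = sorted(abs(g) for g in bn_gammas)
    n_prune = int(len(magnitudes) * prune_ratio)
    if n_prune == 0:
        return [True] * len(bn_gammas)
    threshold = magnitudes[n_prune - 1]
    return [abs(g) > threshold for g in bn_gammas]

if __name__ == "__main__":
    # Two channels of size 1x2, reduced to a single hidden unit.
    maps = [[[1.0, 3.0]], [[2.0, 2.0]]]
    scaled, weights = se_reweight(maps, [[1.0, 0.5]], [[1.0], [-1.0]])
    print(weights)
    print(slimming_mask([0.9, 0.01, 0.5, 0.02], 0.5))
```

In a real pipeline the slimming step would typically follow training with a sparsity penalty on the scaling factors and be followed by fine-tuning of the pruned network; the sketch covers only the channel-selection step.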

Author information

Corresponding author

Correspondence to Hui Ma.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by Sichuan Provincial Science and Technology Projects (2019JDJQ0023).

About this article

Cite this article

Ma, H., Celik, T. & Li, HC. Lightweight attention convolutional neural network through network slimming for robust facial expression recognition. SIViP 15, 1507–1515 (2021). https://doi.org/10.1007/s11760-021-01883-9

  • DOI: https://doi.org/10.1007/s11760-021-01883-9
