A discriminative deep association learning for facial expression recognition

Jin, Xing; Sun, Wenyun; Jin, Zhong

doi:10.1007/s13042-019-01024-2

Original Article
Published: 23 October 2019

Volume 11, pages 779–793 (2020)
International Journal of Machine Learning and Cybernetics

Abstract

Deep-learning-based facial expression recognition has become increasingly successful in many applications. However, the lack of labeled data remains a bottleneck for recognition performance. It is therefore of practical significance to exploit the abundant unlabeled data when training deep neural networks (DNNs). In this paper, we propose a novel discriminative deep association learning (DDAL) framework. Unlabeled data are used together with labeled data to train the DNNs simultaneously, in a multi-loss deep network based on association learning. Moreover, a discrimination loss is used to encourage intra-class clustering and the separation of inter-class centers. Furthermore, a large synthetic facial expression dataset is generated and used as unlabeled data. By exploiting the association learning mechanism, competitive results are obtained on two facial expression datasets. Using the synthetic data clearly improves performance.
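
The abstract does not give the loss definitions, so the following is only an illustrative sketch of how the two ingredients it names might look, not the formulation used in the paper. The association terms follow the general "walker" and "visit" idea from the learning-by-association literature (a round trip from labeled to unlabeled embeddings and back should end in the starting class, and all unlabeled samples should be visited). The discrimination term is an assumed center-based variant: batch-mean class centers, a squared-distance pull toward each center, and a cosine penalty between centers. All function names, the weight `lam`, and the use of batch means instead of learned centers are assumptions made for this example.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def association_loss(emb_l, labels, emb_u, eps=1e-8):
    """Walker and visit losses for labeled (N, d) and unlabeled (M, d) embeddings."""
    sim = emb_l @ emb_u.T                      # (N, M) dot-product similarities
    p_ab = softmax(sim, axis=1)                # labeled -> unlabeled transitions
    p_ba = softmax(sim.T, axis=1)              # unlabeled -> labeled transitions
    p_aba = p_ab @ p_ba                        # (N, N) round-trip probabilities

    # Target: a round trip should end uniformly on samples of the starting class.
    same = (labels[:, None] == labels[None, :]).astype(float)
    target = same / same.sum(axis=1, keepdims=True)
    walker = -np.mean(np.sum(target * np.log(p_aba + eps), axis=1))

    # Visit: cross-entropy between a uniform distribution and the visit probabilities.
    p_visit = p_ab.mean(axis=0)                # (M,)
    visit = -np.mean(np.log(p_visit + eps))
    return walker, visit


def discrimination_loss(emb, labels, lam=1.0, eps=1e-8):
    """Assumed center-based term: compact classes, dissimilar class centers."""
    classes = np.unique(labels)
    # Batch-mean centers (an assumption; a trained network would typically learn them).
    centers = np.stack([emb[labels == c].mean(axis=0) for c in classes])
    idx = np.searchsorted(classes, labels)     # row of `centers` for each sample
    intra = np.mean(np.sum((emb - centers[idx]) ** 2, axis=1))

    unit = centers / (np.linalg.norm(centers, axis=1, keepdims=True) + eps)
    cos = unit @ unit.T
    off_diag = ~np.eye(len(classes), dtype=bool)
    inter = np.sum(cos[off_diag] + 1.0)        # zero only for opposite centers
    return intra + lam * inter


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb_l = rng.normal(size=(8, 16))           # toy labeled embeddings
    y = np.array([0, 0, 1, 1, 2, 2, 3, 3])     # toy expression labels
    emb_u = rng.normal(size=(20, 16))          # toy unlabeled embeddings
    walker, visit = association_loss(emb_l, y, emb_u)
    print(walker, visit, discrimination_loss(emb_l, y))
```

In a multi-loss network of the kind the abstract describes, terms like these would presumably be added to a standard classification loss on the labeled data with tunable weights; the weighting is not stated in the abstract and is left out here.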

Acknowledgements

This work is partially supported by the National Natural Science Foundation of China under Grant Nos. 61872188, U1713208, 61602244, 61672287, 61702262, and 61773215, and by the China Postdoctoral Science Foundation under Grant No. 2018M643183.

Author information

Authors and Affiliations

School of Computer Science and Engineering and Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Nanjing University of Science and Technology, Nanjing, 210094, Jiangsu, China
Xing Jin & Zhong Jin
College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, Guangdong, China
Wenyun Sun

Corresponding author

Correspondence to Zhong Jin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jin, X., Sun, W. & Jin, Z. A discriminative deep association learning for facial expression recognition. Int. J. Mach. Learn. & Cyber. 11, 779–793 (2020). https://doi.org/10.1007/s13042-019-01024-2

Received: 18 March 2019
Accepted: 11 October 2019
Published: 23 October 2019
Issue Date: April 2020
DOI: https://doi.org/10.1007/s13042-019-01024-2
