
Being the Center of Attention: A Person-Context CNN Framework for Personality Recognition

Published: 09 November 2020

Abstract

This article proposes a novel study on personality recognition using video data from different scenarios. Our goal is to jointly model nonverbal behavioral cues with contextual information for a robust, multi-scenario personality recognition system. To this end, we build a novel multi-stream Convolutional Neural Network (CNN) framework that considers multiple sources of information. From a given scenario, we extract spatio-temporal motion descriptors from every individual in the scene, spatio-temporal motion descriptors encoding social group dynamics, and proxemics descriptors encoding the interaction with the surrounding context. All the proposed descriptors are mapped to the same feature space, facilitating the overall learning effort. Experiments on two public datasets demonstrate the effectiveness of jointly modeling the mutual Person-Context information, outperforming state-of-the-art results for personality recognition in two different scenarios. Last, we present CNN class activation maps for each personality trait, shedding light on behavioral patterns linked with personality attributes.
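The class activation maps mentioned at the end of the abstract can be illustrated with a small sketch. It assumes the common formulation in which a network ends with global average pooling followed by a linear layer, so that the map for one class (here, one personality trait) is a weighted sum of the final convolutional feature maps. This is an illustration of that general technique only, not the authors' implementation; the names `class_activation_map`, `class_score`, and the toy values are hypothetical.

```python
from typing import List

FeatureMap = List[List[float]]  # H x W grid of activations


def class_activation_map(feature_maps: List[FeatureMap],
                         class_weights: List[float]) -> FeatureMap:
    """Weighted sum of the K final-layer feature maps for one class."""
    if len(feature_maps) != len(class_weights):
        raise ValueError("need exactly one weight per feature map")
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def class_score(cam: FeatureMap) -> float:
    """Under global average pooling, the class score is the mean of the map."""
    cells = [value for row in cam for value in row]
    return sum(cells) / len(cells)


# Toy input (hypothetical values): two 2x2 feature maps, weights for one trait.
toy_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
toy_weights = [0.5, -1.0]
toy_cam = class_activation_map(toy_maps, toy_weights)
toy_score = class_score(toy_cam)
```

Locations with large positive values in the resulting map are the ones that push the score for that class up, which is what makes such maps useful for inspecting which regions of an input drive a prediction.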


    • Published in

      ACM Transactions on Interactive Intelligent Systems, Volume 10, Issue 3
      Special Issue on Data-Driven Personality Modeling for Intelligent Human-Computer Interaction
      September 2020
      189 pages
      ISSN: 2160-6455
      EISSN: 2160-6463
      DOI: 10.1145/3430388

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 9 November 2020
      • Online AM: 7 May 2020
      • Revised: 1 February 2020
      • Accepted: 1 February 2020
      • Received: 1 February 2019


      Qualifiers

      • research-article
      • Research
      • Refereed
