research-article

Being the Center of Attention: A Person-Context CNN Framework for Personality Recognition

Authors:
Dario Dotti, Maastricht University, The Netherlands
Mirela Popa, Maastricht University, The Netherlands
Stylianos Asteriadis, Maastricht University, The Netherlands

ACM Transactions on Interactive Intelligent Systems, Volume 10, Issue 3, Article No. 19, pp. 1–20
https://doi.org/10.1145/3338245

Published: 09 November 2020

Abstract

This article proposes a novel study on personality recognition using video data from different scenarios. Our goal is to jointly model nonverbal behavioral cues and contextual information to obtain a robust, multi-scenario personality recognition system. To this end, we build a novel multi-stream Convolutional Neural Network (CNN) framework that considers multiple sources of information. From a given scenario, we extract spatio-temporal motion descriptors for every individual in the scene, spatio-temporal motion descriptors encoding social group dynamics, and proxemics descriptors encoding the interaction with the surrounding context. All the proposed descriptors are mapped to the same feature space, which facilitates the overall learning effort. Experiments on two public datasets demonstrate the effectiveness of jointly modeling the mutual Person-Context information, outperforming state-of-the-art results for personality recognition in two different scenarios. Finally, we present CNN class activation maps for each personality trait, shedding light on behavioral patterns linked with personality attributes.
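The abstract mentions CNN class activation maps but does not spell out how they are computed. As a minimal sketch, assuming the common formulation in which a class activation map is the weighted sum of the last convolutional layer's feature maps, with weights taken from the classifier that follows global average pooling, the computation can be illustrated in plain Python. The function names, the toy feature maps, and the weights below are hypothetical and are not taken from the article.

```python
from typing import List

FeatureMap = List[List[float]]  # H x W grid of activations


def class_activation_map(feature_maps: List[FeatureMap],
                         class_weights: List[float]) -> FeatureMap:
    """Weighted sum of feature maps for one class (assumed CAM formulation)."""
    if len(feature_maps) != len(class_weights):
        raise ValueError("need exactly one weight per feature map")
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def normalize(cam: FeatureMap) -> FeatureMap:
    """Rescale a map to [0, 1] for display; a constant map becomes all zeros."""
    flat = [value for row in cam for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0] * len(row) for row in cam]
    return [[(value - lo) / (hi - lo) for value in row] for row in cam]


# Toy input: two 2x2 feature maps and hypothetical weights for one trait.
toy_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
toy_weights = [0.5, 1.0]

toy_cam = class_activation_map(toy_maps, toy_weights)
toy_heat = normalize(toy_cam)
```

In this toy case the bottom-right cell receives the largest value, so it would be the region highlighted for that trait. In a real pipeline the feature maps and weights would come from the trained network, and the map would be upsampled to the input resolution before being overlaid on it.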

Index Terms

1. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interaction techniques
      1. Gestural input

Published in
ACM Transactions on Interactive Intelligent Systems Volume 10, Issue 3
Special Issue on Data-Driven Personality Modeling for Intelligent Human-Computer Interaction
September 2020
189 pages
ISSN:2160-6455
EISSN:2160-6463
DOI:10.1145/3430388
Editor:
Michelle X. Zhou
Juji, Inc., USA
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 9 November 2020
- Online AM: 7 May 2020
- Revised: 1 February 2020
- Accepted: 1 February 2020
- Received: 1 February 2019
Author Tags
CNN networks
Personality recognition
nonsocial behavior analysis
social behaviors analysis
Qualifiers
- research-article
- Research
- Refereed