1 Introduction

There is no single definition of the term personality. Feist and Feist [26] define personality as “a pattern of relatively permanent traits and unique characteristics that give both consistency and individuality to a person’s behaviour”. We know from daily interactions that people’s perception and behaviour are mediated by their personalities. Personality is derived from both biological and social factors. Its impact and effect have been studied in depth within interactions between humans. In human–robot interaction (HRI), personality has been identified as an important facilitator that can potentially foster interactions between robots and humans [42]. Nonetheless, despite its relevance, research in this area is still fragmented and the topic has not been properly investigated [42]. Current research has focused on two main aspects of robot personality: (1) the study of the similarity and complementary attraction principle [15, 16], and (2) the development of computational models of personality traits based on verbal and non-verbal cues [1]. Given their relevance, both aspects are investigated in the present work.

Fig. 1 Illustration of the main experiments. In Fig. 1a, a participant is playing the memory game with the assistance of an introverted human helper (HHI study). In Fig. 1b, a participant is playing the memory game with the assistance of an extroverted robot helper (HRI study)

Most studies have modelled personalities in robots from prototypical definitions in psychology. However, the specific context may affect the personality of an individual and should be taken into account to model it properly in a robot [1, 50]. Therefore, in this work, we explore whether and to what extent distinctive personality features identified during a human–human interaction (HHI) study can be modelled in a robot in the context of an assistive memory game. Our study was divided into two stages where different participants played a “match pairs” memory game receiving different degrees of assistance from a human helper (first stage) or a robot helper (second stage). In the first stage, i.e., the HHI experiment, introverted and extroverted people were selected according to the Big-Five Inventory (BFI) [33] to act as helpers. We asked them to provide the participant with hints on the basis of a pre-established set of levels of assistance. We found that participants were able to distinguish between helpers’ personalities and thus we formulated the following research question:

RQ1: Can distinctive features observed from HHI be modelled in a robot in such a way that the user interacting with the latter can perceive its personality?

After an in-depth analysis of the recorded videos from the HHI experiment, we first modelled the most relevant verbal and non-verbal social cues in the robot. Subsequently, we developed a statistical decision-making algorithm that provides the most suitable level of assistance to the user according to the robot’s personality and the state of the game. The results obtained at this second stage, i.e., the HRI experiment, show that participants recognised the robot’s different personalities with statistical significance. Furthermore, in order to evaluate the effectiveness of the developed robotic system, we formulated a second research question:

RQ2: Can participants distinguish between a robot controlled by one of the helpers (Wizard-of-Oz, WoZ) and a fully autonomous robot?

The questionnaires administered to the participants reported that they were not able to distinguish between the WoZ robot and the autonomous one in either case (i.e., an introverted or extroverted robot).

Finally, current studies disagree on whether individuals prefer interacting with people or robots that share their personality, since this seems to depend strongly on the context in which they are interacting. Therefore, we formulated a third research question:

RQ3: Do participants perform better with a helper (human or robot) whose personality is similar or complementary to their own?

The results showed that in the HHI study the participants had better performance with a human helper that had a different personality to them (complementary), whereas in the HRI study the participants had better performance with a robot helper manifesting a personality that was similar to them (similarity).

Our findings provide the first evidence on how modelling a robot personality based on human–human observations can be effective in the context of an assistive memory game (RQ1). In addition, the experimental results show that the decision-making algorithm provided useful assistance at the correct level in real time, leading the participants to complete the game with good performance (RQ2). Lastly, although we obtained opposite results from the HHI and the HRI studies, we attempted to shed some light on the similarity/complementarity principle in the memory game scenario (RQ3).

The work presented herein was developed in the context of the European project SOCRATES, which focuses on Interaction Quality (IQ) in Social Robotics for Eldercare [10]. In the project, we are in charge of investigating how robot personalisation can be done manually and automatically to adapt to changes in IQ. To this end, we are developing a Cognitive Assistive Robotic Framework (CARF) to administer cognitive exercises to people affected by Mild Cognitive Impairment or Alzheimer’s Disease [2]. CARF can be personalised by the caregiver, who can provide it with the mental and physical impairment of the user [3]. Our framework can also be automatically personalised by the robot to provide appropriate levels of assistance to the user [4]. Robot personality is one aspect to include in our framework that can contribute to improving IQ and consequently increase the level of users’ acceptance and trust.

The main contributions of this paper are the following:

  • Modelling distinctive social cues in terms of verbal and non-verbal behaviours in a robotic system.

  • Developing a statistical decision-making algorithm for selecting assistive actions based on the robot’s personality.

  • Deploying a fully autonomous robot that employs personality traits in the context of an assistive memory game.

2 Related Work

Owing to its multifaceted nature, personality is a very complicated aspect of human behaviour to model. Personality is characterised by a set of behaviours, cognitions, and emotional patterns [14]. Aiming to assess whether and to what extent personality can be modulated into robots, in this work we conducted two main experiments: an HHI study and an HRI study. In this section, we cover how personality relates to humans and then how it can be implemented in robots. Section 2.1 summarises the most relevant work on the role of personality from a psychology and HHI perspective. Section 2.2 discusses how previous studies modelled verbal and non-verbal social cues in robots. Finally, Sect. 2.3 focuses on how personality has been deployed in robots, including an extensive analysis of previous studies that supported the similarity principle (see Sect. 2.3.1) and others that supported the complementarity principle (see Sect. 2.3.2).

2.1 The Role of Personality in Human–Human Interactions

In psychology, personality refers to those characteristics of the person that account for “consistent patterns of feelings, thinking, and behaving” [40] and is generally modelled in terms of traits. Three of the most accepted models for framing personality are the Eysenck PEN model [21], the Myers-Briggs Type Indicator (MBTI) [12], and the Big-Five Inventory (BFI) [33]. The first model structures personality in three traits, the second one in sixteen, and the third one arranges it in five traits. Nonetheless, all of the models provide information on individual behaviours. To the best of our knowledge, there is no consensus on which model describes personality better and, in most of the cases found in the literature, the results are equivalent.

Indeed, for decades, psychologists have tried to formalise a list of personality traits that defines each human as a unique individual in the sense of behaviour and experience. The incremental formalisation of personality traits resulted in a long list of attributes measured by ambiguous questions and imprecise scales. Aiming to define a general, common taxonomy of human personality, John and Srivastava [37] revised the attributes of personality and proposed the Big-Five Inventory (BFI), which defines personality along five dimensions: (i) extroversion and introversion, (ii) agreeableness and antagonism, (iii) conscientiousness and lack of direction, (iv) neuroticism and emotional stability, and (v) openness and closeness to experience. Due to its consistency among studies, the BFI is more widely accepted in the psychology community as a conceptual framework. The factors underlying each dimension do not change over time or situations and influence the behaviour of people [43]. For these reasons, we decided to apply it in our experiments both for assessing users’ personality and evaluating robots’ personality.

Extroversion is a trait that defines individuals as more engaged with the external world. Extroverted people enjoy interacting with others and tend to be enthusiastic, action-oriented individuals. Agreeableness refers to people who are generally optimistic, kind, generous, trusting and trustworthy. Conscientiousness is related to how individuals manage their impulses. Neuroticism is defined as the tendency to experience negative emotions and is related to what is called emotional instability. Openness refers to curiosity, sensitiveness and willingness to try new things.

Several studies examined the importance of the extroversion and agreeableness dimensions in representing human behaviour. Campbell et al. [25] found that agreeableness was the personality trait that contributed most to maintaining a positive interpersonal relationship in adolescents and adults. Selfhout et al. [45] evaluated the effect of the BFI personality traits on friendship selection processes. They observed that subjects with high extroversion were more prone to select more friends than those with a low value on this trait. They also observed that subjects with high agreeableness tended to be selected as friends more often than people with low agreeableness. Lippa et al. [29] showed that the extroversion-introversion dimension was the most observable and accurately judged trait when asking people to assess personal characteristics.

2.2 Verbal and Non-Verbal Cues in Modelling Robot’s Personality

Human personality is generally expressed through verbal and non-verbal communication channels. With respect to the non-verbal social cues, personality indicators have been described in [11, 30, 34, 35, 36, 39, 49].

Bevaqua et al. [11] used facial expressions and gestures to model a virtual agent with features based on psychological principles. Mcrorie et al. [34] extended the work of Bevaqua et al. They evaluated how the personality of a virtual agent, modelled in terms of facial expressions and gestures, could affect the participants’ perception. Participants were asked to rate personality profiles of the virtual agents by looking at still images or watching video clips of the agent interacting with a human. Neff et al. [35] evaluated how self-adaptors can consistently affect the perception of neuroticism. Additionally, they showed how non-verbal social cues can contribute to defining some specific aspects of personality. Participants were asked to rate the personality of the agent according to a Ten-Item Personality Inventory while watching video clips. Pelachaud et al. [39] developed a model of behaviour expressivity based on gestures along six dimensions, which allowed them to create gestures of different qualities. Results showed that the same gesture type can convey different meanings depending on its quality and thus on how it was interpreted. Liu et al. [30] aimed to assess whether an agent could convey through gestures a personality trait such as extroversion or introversion. In their experiments, participants were asked to evaluate an agent’s personality while watching video clips of it portraying the characteristics of extroversion or introversion. Tolins et al. [49] evaluated how an agent that changes its personality from extroversion to introversion affects the participants’ perception in terms of expressivity. They designed a storytelling experiment in which an agent presents story components, asks a person to tell the story, waits for the person to conclude the story and, finally, asks the participant to retell the entire story.

In general, the authors of these studies focused mainly on the correlation between body language and the introversion and extroversion personality traits. Characteristics of gestures and facial expressions during non-verbal communication can differ according to personality traits. Extroverted participants, for instance, generally lean forward when communicating and perform wider gestures. Concerning the verbal channel [31, 32, 35], research has focused mainly on seeking out which indicators or features of human speech have the highest correlation with a given set of personality traits. Specifically, Mairesse et al. [31, 32] presented PERSONAGE, a highly personalised language generator whose parameters were based on psychological results. The produced text aimed to reflect some specific personality traits. Neff et al. [35] evaluated how changes in language in terms of verbal utterances could be modulated into a virtual agent and perceived by users.

Overall, the outcomes of the presented studies point to common indicators for the extroversion and introversion personality traits. Extroverted individuals have been categorised as more talkative and louder people. They typically speak faster, deliver high-pitched speech, and avoid long silent periods during dialogues. Besides those characteristics, they tend to use positive emotion words and to agree and comply more frequently than introverted people. On the contrary, introverted individuals usually speak in a low voice using a smaller and more direct vocabulary.

2.3 Modelling Personality in Human–Robot Interactions

A crucial aspect in HRI experiments is the establishment of how interactions between humans and robots occur. Researchers have focused on identifying factors that promote the quality of interactions, which can be assessed on the basis of the user’s performance (goal-oriented vs. experience-oriented) and user’s preferences (similarity vs complementarity) [50].

Among the factors explored in the literature that help to identify the effectiveness of interactions, such as acceptance, likeability, empathy, anthropomorphism, and trust, personality has been identified as an essential factor for understanding how to improve HRI [17, 42]. We, as humans, tend to assign personality traits to a robot in a similar way as we do to other human beings [53]. Implementing personality in a robot is very complicated since personality is a result of the combination of multiple traits [41]. According to findings from HHI experiments, the extroversion-introversion dimension plays an important role in HRI among the five dimensions of the Big-Five Inventory [20].

Most of the current research on the personality in robots has focused on the extroversion dimension and how it affects the user’s behaviour and engagement over time. Ivaldi et al. [24] studied the relationship between individual factors including extroversion and attitude toward robots and the dynamics of gaze and speech produced by humans while interacting with a robot. According to their studies, the more extroverted people are, the more and longer they are willing to interact with a robot. Tapus et al. [48] evaluated the role of personality in robots in terms of extroversion and introversion in an assistive therapy process aiming to provide personalised assistance to a given patient.

Based on these studies, we limited the scope of this paper to indicators of extroversion and introversion as the main drivers to model the robots’ personality for three main reasons. Firstly, verbal and non-verbal cues that characterise extroversion and introversion are well defined in the literature and more directly transferable to robots (see Sect. 2.2). Secondly, the behavioural features associated with this personality trait can be perceived and measured in relatively short interactions such as in the context of a memory game. Thirdly, and more importantly, its correlation with engagement [33] makes it a desirable trait to have in social assistive robots for cognitive exercises that are intended to be employed with older adults with cognitive impairments.

With respect to the similarity and complementarity principle in personality, a considerable number of studies have investigated this effect in HRI. However, the studies did not fully agree on whether a robot should be provided with the same personality as its human counterpart or a different one. In the next sections we present the most representative work that supports the similarity principle (see Sect. 2.3.1) and the complementarity principle (see Sect. 2.3.2).

2.3.1 Similarity Principle in Human–Robot Interaction

Several studies have been conducted to investigate the effect of personality similarity on the engagement level of the user during interactions. Craen et al. [15] investigated the role of the similarity attraction effect and its relationship with the perceived quality of HRIs based on a comparative analysis between the BFI and Godspeed questionnaires. In the presented experiment, participants were asked to rate 45 robotic gestures from video clips. Park et al. [38] conducted a study to evaluate the effects of a robot’s personality, modelled in terms of facial expressions, and a human’s personality in a storytelling scenario. The results indicated that participants who interacted with a robot exhibiting a similar personality felt more comfortable in the interaction than those who were exposed to a robot having a complementary personality. Similarly, Aly et al. [1] proposed a framework for generating verbal and non-verbal robot behaviour based on the human’s extroversion-introversion personality traits. In the proposed experiment, participants were asked to interact with a NAO robot that can provide advice on restaurants in New York. The robot identified the participants’ personality from linguistic cues and behaved in an extroverted/introverted manner by combining four different non-verbal features, namely iconic and metaphoric gestures, gaze, and posture shift, each of them linked to specific groups of words/sentences. Their findings presented evidence that extroverted participants preferred high-speed robot movements, in contrast to introverted participants. Celiktutan et al. [13] examined how a robot’s behaviour and personality in the sense of extroversion and introversion affect HRIs. In their experiment, participants were asked to interact with a robot which can manifest an extroverted or introverted personality in a conversational scenario. The perceived enjoyment reported in their experiments presented a high correlation with interactions between extroverted humans and extroverted robots. However, their results were not sufficient to show any statistical correlation when participants interacted with the introverted robot. Andrist et al. [6] investigated how the robot should adapt to a specific user by modelling gaze behaviour in robots. In their experiment, participants were asked to solve a puzzle task with the assistance of an introverted or extroverted robot. They showed that personality matching had a positive effect on a user’s motivation to engage in the Tower of Hanoi puzzle. This last study, unlike the others that focused on storytelling or conversational scenarios, presented a robot in a gaming context. Although the authors of [6] were interested in evaluating only the robot’s gaze behaviour, whereas we model the robot’s gestures and speech, their results provided insights for interpreting our findings.

2.3.2 Complementarity Principle in Human–Robot Interaction

In contrast to the concept of affinity, complementarity attraction relies on the principle that individuals are more attracted to people with a different personality. Isbister et al. [23] evaluated whether people are able to interpret and respond to verbal and non-verbal cues of a virtual agent on 12 desert survival items. Their experiments showed that people tend to prefer characters whose personality is complementary to their own over characters with a similar personality. Lee et al. [28] used the Sony AIBO to evaluate whether or not participants were able to identify the robot’s personality, modelled in terms of introversion and extroversion by combining verbal and non-verbal social cues. In their experiment, participants were asked to interact with the robot for 25 min using a predefined set of verbal instructions. Results suggested that participants enjoyed interacting with the quadruped robot more when its personality was complementary to their own than when it was similar. The same outcome was reported by De Graaf et al. [19]. In their experimental study, they tested the influence of expectation setting on the robot’s first impression on people, and their predisposition to project their own personality onto the robot. As in De Graaf et al. [19], in our work we administered the BFI questionnaire to the participants in order to evaluate their perception of the helper’s personality.

Given the divergent findings from those studies, we conclude that there is no unique theory regarding the similarity and complementarity principle. Instead, the effectiveness of both theories might be related to the context of interaction [27], as well as to the robot’s role [54], individuals’ expectations [19], and their attitude [7]. A further reason that psychologists pointed out from HHI studies might be the stage of the relationship. Individuals with similar personalities tend to give more importance to initial attraction, while those with complementary personalities rely on relationship building over time [51]. This last point might be the reason why the majority of the HRI studies reported the validity of the similarity principle. Indeed, most of these studies are based on very short interactions and very few on long-term interactions. For all these reasons, we believe that this principle deserves to be investigated in our specific scenario.

Fig. 2 Example of the cards selected for the memory game: (a) on the left, cards used for the warm-up session; (b) on the right, cards used for the experimental runs

3 Memory Game Assistive Scenario

In our experiments, we adopted the memory card game as the cognitive exercise for two main reasons. The memory game has the benefits of improving concentration and training visual and short-term memories. Furthermore, this game is a valid alternative to the ones we have employed with people with cognitive decline [2]. The memory game consists of a deck of n cards laid face down. At each turn, the user chooses two cards and turns them face up. If they are the same then that player wins the pair. If they are not, they are turned face down again and the player has to give it another shot. The game ends when the last pair has been picked up. A score based on the number of mistakes is assigned to the player.
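
The rules above can be sketched in a few lines of Python. This is only an illustrative sketch and not the software used in the experiments: the class name `MemoryGame`, the method names, and the choice to report the raw number of mismatched turns as the mistake count are assumptions, since the exact scoring formula is not given here. The per-card flip counter mirrors the metadata that is shown to the helper.

```python
import random


class MemoryGame:
    """Hypothetical 'match pairs' game: a deck of n cards laid face down."""

    def __init__(self, n_cards=24, seed=None):
        if n_cards % 2 != 0:
            raise ValueError("the deck must contain an even number of cards")
        rng = random.Random(seed)
        # Each picture id appears exactly twice in the deck.
        self.cards = [i // 2 for i in range(n_cards)]
        rng.shuffle(self.cards)
        self.matched = [False] * n_cards
        self.flips = [0] * n_cards  # number of flips per card
        self.mistakes = 0           # assumed score basis: mismatched turns

    def play_turn(self, first, second):
        """Flip two different face-down cards; return True if they match."""
        if first == second or self.matched[first] or self.matched[second]:
            raise ValueError("choose two different face-down cards")
        self.flips[first] += 1
        self.flips[second] += 1
        if self.cards[first] == self.cards[second]:
            # The player wins the pair and the cards stay face up.
            self.matched[first] = self.matched[second] = True
            return True
        # Mismatch: the cards are turned face down again.
        self.mistakes += 1
        return False

    def finished(self):
        """The game ends when the last pair has been picked up."""
        return all(self.matched)
```

A turn is played with `game.play_turn(i, j)`, where `i` and `j` are card positions in the deck, and the loop continues until `game.finished()` returns `True`.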

Intending to define a suitable complexity of the game, we conducted a pre-assessment test. In the pre-assessment test, five people played the memory game at different levels of complexity in order to assess the complexity of the game based on the time to conclude the game and number of mistakes. In each level, we manipulated the pictures’ content as well as the number of the cards. We ended up defining a deck of 24 cards with 4 rows of 6 cards each as shown in Fig. 2.

Aiming to assess how different personalities can be modulated into robotic actions, our experiments were divided into two stages. In each stage, different participants played the memory game receiving different levels of assistance from a person or a robot, referred to as the helper. In the first stage, a participant played the game with the assistance of a human helper (see Sect. 4), while in the second stage, a participant played the game with the assistance of a robot helper (see Sect. 6). In the first stage, the human helpers were given additional metadata, as shown in Fig. 3, in order to provide assistance to the player. The metadata included the solution of the current game as well as the number of flips for each card.

Fig. 3 A screenshot of the game with the player’s view at the bottom and the helper’s view at the top. The latter is provided with metadata about the state of the game. Note that the gap is larger than the one shown in this figure and that we used a physical object to hide this information from the player

4 Human–Human Interaction

The objectives of this first experiment were to (i) evaluate whether participants are able to distinguish between different helpers’ personalities, (ii) analyse the helpers’ verbal and non-verbal social cues during the game, and (iii) evaluate whether participants achieve better performance when playing with a helper who has a similar personality or with a helper who has a different personality. In the first stage, a participant (or player) played the memory game for three sessions, each of them assisted by a human helper with a different personality: extroverted, introverted or non-social (neutral). In order to avoid the order effect, we used the Latin square design to select the order in which the three sessions were carried out.
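
As an illustration of the counterbalancing step, the sketch below builds a cyclic 3 × 3 Latin square over the three helper conditions and assigns participants to its rows in rotation. The text does not specify which square was used or how participants were mapped to rows, so both choices, as well as the function names, are assumptions. A cyclic square balances the position in which each condition appears, but not necessarily which condition immediately precedes which.

```python
def latin_square(conditions):
    """Cyclic Latin square: row r is the condition list rotated by r places."""
    k = len(conditions)
    return [[conditions[(r + c) % k] for c in range(k)] for r in range(k)]


def session_order(participant_index, conditions):
    """Assumed assignment rule: participants cycle through the rows."""
    square = latin_square(conditions)
    return square[participant_index % len(conditions)]


helpers = ["extroverted", "introverted", "neutral"]
for participant in range(3):
    print(participant, session_order(participant, helpers))
```

Each condition appears exactly once in every row (one participant's three sessions) and once in every column (first, second, or third session).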

Before starting the game, the human helpers were trained to provide assistance according to four different levels as reported in Table 1. The assistance ranged from encouragements or greetings after a successful flip, such as “Congrats!” and “You are doing great!”, to full assistance, which indicates the solution of one trial (“The card to flip is that one.” or “Flip the second card in the first row”). Constraining assistance to four levels was necessary for conducting a statistical analysis of the results after the experiment as well as for modelling those behaviours in the robot. However, during the game, the human helpers were allowed to give assistance to the participants at any time in an open-scope dialogue scenario without any limitation on verbal and non-verbal communication. In other words, we asked them to act and behave naturally, so as not to influence the participants’ perception of their personality. Each session lasted on average 3 min when the users were assisted and around 5 min without any assistance. The average total time for the three sessions including the questionnaires was around 25 min.

4.1 Hypotheses

We evaluated the following hypotheses:

H1: Participants are able to identify the helper’s personality (introverted vs. extroverted) after playing with him/her only once.

H2: Participants achieve better performance when playing with a helper who has a similar personality.

Table 1 Levels of assistance provided by the helper

In order to evaluate the first hypothesis, we asked participants to fill in the BFI questionnaire at the end of each session to investigate their perception of the helper’s personality. Hypothesis H1 helps us to address RQ1. Specifically, if we can confirm H1, we can carry on the investigation by analysing the recorded sessions to label representative and discriminative features related to the extroverted and introverted helpers. Regarding H2, we aim to evaluate whether the similarity principle is valid in the context of a memory game. This hypothesis serves to address RQ3.

Fig. 4 The HHI experimental set-up. Figure 4a and d show the experimental set-up from different perspectives; in particular, in Fig. 4d we highlight the locations of the cameras (red squares). Figure 4b and c show the view from the cameras located in front of the participants, whereas Fig. 4e and f show the view of the cameras located on the side of the participants. (Color figure online)

4.2 Experimental Set-up

In order to foster natural interaction between the player and the helper and, more importantly, to enhance the player’s concentration during the game given its memory and attention demands, a square play zone was built to isolate the individuals from outside distractions. The images in Fig. 4a and d illustrate a player (left) playing the memory game with the helper (right) on the Samsung SUR40 touch monitor running Windows 7. The touch monitor has a high-resolution display of 1920 × 1080 (Full HD 1080p) with a screen size of 40”, which is important for the player to distinguish fine differences among the cards. The width and height dimensions of approximately 0.71 and 1.1 meters provide a comfortable area for the player to rest their arms while playing the game, as well as enough space to show metadata to the helper for assistance purposes, such as the solution grid and the number of flips for each card. This information is hidden from the player by a physical object.

Four cameras were used to record audiovisual data for further analysis of verbal and non-verbal communication and behaviour during the game. Together with the metadata collected by logging actions while playing the game, these recordings allow us to investigate factors that may have led the helper to assist the player. The cameras utilised in our experiments are Logitech C920 webcams, which can capture high-quality video (1080p) at 30 fps with audio from the embedded microphone. The red squares in Fig. 4d show where the four cameras were located. Two cameras were located frontally to capture facial expressions and gaze from players and helpers (Fig. 4b–c), whereas two other cameras were located on the side for the analysis of body movements and gestures (Fig. 4e–f).

4.3 Questionnaires

The BFI questionnaire consists of 44 questions using natural language where people can either describe themselves or other people based on the Likert scale from 1 (disagree strongly) to 5 (agree strongly). The middle scale 3 represents a neutral answer, i.e., neither agree nor disagree.

Given that our objective is to evaluate the effect of personality in terms of extroversion and introversion traits in the context of a cognitive memory game, we only adopted questions corresponding to the extroversion-introversion dimension of the BFI questionnaire, plus additional questions in order to conceal the aim of our experiments from the participants (see Appendix A). The additional questions were extracted from the agreeableness/antagonism dimension. Before starting the experiment, participants filled out the questionnaire based on the following statement: “I see myself as someone who...”. After playing with a human helper, the participant filled out the same BFI questionnaire but with respect to the helper: “I see the helper as someone who...”. The BFI questionnaire was also adopted to select human helpers for our experiments. The helper selection procedure and further interpretation of the results of the BFI are explained in the following sections.

4.4 Criteria for Selecting the Helpers

The selection of the helpers is a crucial factor in our study, as our main objectives are to evaluate (i) the impact of the helper’s personality on the user’s performance and (ii) whether or not participants are able to perceive the helpers’ personality.

The introverted, non-social (neutral), and extroverted helpers were selected among 32 people from the University of Hamburg after analysing their BFI questionnaires [44]. The most extroverted user scored 32. On the contrary, the most introverted user had a score of 17. The non-social helper, who scored 24, was selected as the baseline for comparing the effects of the extroverted and introverted personalities. Another role of the non-social helper was to provide support if and only if the player was facing technical problems. In order to avoid any social discomfort, he was also allowed to say very short sentences when requested. Finally, it is worth mentioning that the selected helpers had knowledge of robotics and specifically about the NAO robot that will be adopted as robotic platform in the HRI study presented in Sect. 6.

Fig. 5
figure 5

Scores of sub-scales of the Big-Five Inventory with respect to the non-social (neutral), introverted, and extroverted helpers (* denotes .01 < p < .05, ** denotes .001 < p < .01, and *** denotes p < .001)

4.5 Results

Personality test

In this section, we report the results that address H1. Extroversion was measured by the sum score of items 1, 2R, 4R, 7, 9, 10, 13, 16R, where R denotes reverse-scored items. Agreeableness was measured by the sum score of items 3, 5, 6, 8, 11, 12, 14, 15, 17. Repeated measurements were performed on the extroversion and agreeableness sum scores. Note that an in-depth discussion of agreeableness scores is outside the scope of this paper. This dimension was included in the questionnaire to add noise and conceal the aim of the experiments from the participants, and its results are shown and briefly discussed for completeness. The results are reported in Fig. 5.
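As an illustration of the scoring rule above, the following sketch computes both sum scores from one completed questionnaire. It assumes a five-point response scale, so that a reverse-scored response r becomes 6 − r; the function names and the dictionary layout are illustrative and are not part of the study materials.

```python
# Item numbers follow the questionnaire described in the text (see Appendix A).
EXTROVERSION_ITEMS = [1, 2, 4, 7, 9, 10, 13, 16]
EXTROVERSION_REVERSED = {2, 4, 16}  # items marked "R" in the text
# The text lists no reverse-scored items for agreeableness, so none are flipped here.
AGREEABLENESS_ITEMS = [3, 5, 6, 8, 11, 12, 14, 15, 17]

SCALE_MIN, SCALE_MAX = 1, 5  # assumption: five-point Likert scale


def sum_score(responses, items, reversed_items=frozenset()):
    """Sum the responses for `items`, flipping the reverse-scored ones."""
    total = 0
    for item in items:
        value = responses[item]
        if not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"item {item}: response {value} is out of range")
        if item in reversed_items:
            value = SCALE_MIN + SCALE_MAX - value
        total += value
    return total


def score_questionnaire(responses):
    """responses: dict mapping item number (1..17) to the chosen rating."""
    return {
        "extroversion": sum_score(
            responses, EXTROVERSION_ITEMS, EXTROVERSION_REVERSED
        ),
        "agreeableness": sum_score(responses, AGREEABLENESS_ITEMS),
    }
```

Under the five-point assumption, the extroversion sum score ranges from 8 to 40 and the agreeableness sum score from 9 to 45.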

There was a significant effect of the personality of the human helpers on extroversion with F(2, 28) = 62.64, p < .001, and \(\eta _{p}^{2}\) = .68. The post-hoc test showed that the extroverted helper (M ± SD = 31.78 ± 2.93) received significantly higher scores than the introverted helper (M ± SD = 24.55 ± 4.57) and the non-social helper (M ± SD = 19.39 ± 5.16) (p < .001). The difference between the introverted helper and the non-social helper was also significant (p < .001). There was also a significant effect of the personality of the human helpers on agreeableness with F(2, 28) = 20.33, p < .001, and \(\eta _{p}^{2}\) = .40. The post-hoc test showed that the extroverted helper (M ± SD = 37.90 ± 4.21) received significantly higher scores than the introverted helper (M ± SD = 35.29 ± 5.18) and the non-social helper (M ± SD = 30.23 ± 6.49) (p < .01). In addition, the introverted helper received significantly higher scores than the non-social helper (p < .01). In general, the above results show that participants were able to recognise and distinguish the different personalities of the human helpers during the experiment.

Fig. 6
figure 6

Number of mistakes made by participants when playing with the non-social (neutral), introverted, and extroverted helpers, respectively (n.s. denotes p >.05, * denotes .01 < p < .05, and ** denotes .001 < p < .01)

Memory game performance

In this section, we analyse whether the helper’s personality had an impact on the participants’ performance. Data from 30 participants entered the final statistical analysis. One participant was removed because his or her number of mistakes was more than 3 SDs above the group mean. The results are reported in Fig. 6.
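The exclusion rule can be sketched as follows. This is a minimal illustration that assumes the cut-off is the group mean plus three sample standard deviations, both computed over all participants before exclusion; the text does not specify these details, and the function name is hypothetical.

```python
from statistics import mean, stdev


def exclude_high_outliers(mistakes, n_sd=3.0):
    """Split mistake counts into (kept, removed) using mean + n_sd * SD.

    Assumption: mean and sample SD are computed on the full group,
    including any candidate outlier.
    """
    cutoff = mean(mistakes) + n_sd * stdev(mistakes)
    kept = [m for m in mistakes if m <= cutoff]
    removed = [m for m in mistakes if m > cutoff]
    return kept, removed
```

For example, with synthetic counts of fifteen 30s, fifteen 34s and a single 120, only the value 120 lies above the cut-off and is removed.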

There was a significant effect of the personality of the human helpers on memory game performance, F(2, 27) = 10.93, p < .001, and \(\eta _{p}^{2}\) = .27. The post-hoc test showed that when participants played with the extroverted helper (M ± SD = 28.83 ± 4.79, p < .01) and the introverted helper (M ± SD = 31.17 ± 5.17, p < .05), the number of mistakes was significantly smaller than when playing with the non-social helper (M ± SD = 37.43 ± 11.02). The difference between the performance when playing with the extroverted and the introverted helper was not significant (p = .23), even though the extroverted helper provided higher levels of assistance (Lvl 2 and Lvl 4) than the introverted helper (see Table 2). The highest assistance level (Lvl 4), for instance, tells the player the location of the solution. It is also interesting to note that the extroverted helper praised the player almost three times more often than the introverted helper (Lvl 1). However, Lvl 1 did not provide any clues for the solution but only encouragement to players.

Table 2 Average number of assistance per level given by each helper
Fig. 7
figure 7

Number of mistakes made by participants when playing with a helper with a similar and an opposite personality, respectively. (* denotes .01 < p < .05)

4.5.1 Impact of the Personality Similarity Principle Between the Human Player and Human Helper on Game Performance

In this section, we explore how the personality similarity principle between participants and human helpers affects game performance (H2). First, we separated participants into an extroverted and an introverted group and verified that the extroverted group showed significantly higher extroversion than the introverted group. To this end, we defined participants (n = 8) whose scores ranked in the top 27\(\%\) on the extroversion sub-scale of the Big-Five Inventory as extroverted and participants (n = 8) who ranked in the bottom 27\(\%\) as introverted. The results are reported in Fig. 7.
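The extreme-groups split described above can be sketched as follows. The rounding rule for the group size and the handling of tied scores at the boundary are assumptions of this sketch, since the text reports only the resulting group sizes; the function name is illustrative.

```python
def extreme_groups(scores, fraction=0.27):
    """Return (extroverted_ids, introverted_ids) from a dict id -> sum score.

    Assumptions: the group size is round(N * fraction), and ties at the
    boundary are broken by the original insertion order of the participants.
    """
    k = round(len(scores) * fraction)
    ranked = sorted(scores, key=scores.get)  # ascending extroversion score
    introverted = ranked[:k]
    extroverted = ranked[-k:] if k else []
    return extroverted, introverted
```

With 30 synthetic participants, `round(30 * 0.27)` gives groups of 8, so the 8 highest-scoring participants are labelled extroverted and the 8 lowest-scoring ones introverted.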

An independent-samples t-test showed that the top 27\(\%\) participants (M ± SE = 28.75 ± .31) scored significantly higher than the bottom 27\(\%\) participants (M ± SE = 19.88 ± .48) with t(14) = 15.49 and p < .001.

With respect to game performance, a paired-samples t-test showed that the 16 participants made significantly fewer mistakes when playing with the helper with a different personality (M ± SE = 27.69 ± .94) than with the helper with the same personality (M ± SE = 31.63 ± 1.45), with t(15) = 2.73 and p < .05.
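For readers less familiar with the paired-samples test, the statistic can be sketched as below: each participant contributes one difference between the two conditions, and t is the mean difference divided by its standard error, with n − 1 degrees of freedom (15 for the 16 participants here). The function name and the example numbers are illustrative and are not the study data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for paired observations first[i] and second[i]."""
    if len(first) != len(second):
        raise ValueError("both conditions need one value per participant")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # SE of the mean difference
    return mean(diffs) / standard_error, n - 1


# Synthetic example: mistakes with a same-personality helper versus a
# different-personality helper for four hypothetical participants.
t_value, df = paired_t([31, 35, 29, 33], [28, 30, 27, 29])
```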

4.6 Discussion

In this section, we discuss the results of the HHI experiment and whether the initial hypotheses H1 (participants can identify the helper’s personality) and H2 (participants with a similar personality to the helper achieve better performance than when they play with a helper with a different personality) described in Sect. 4.1 stand or fall. The results validate H1. Participants were able to distinguish between helpers with different personalities. This result contributes to addressing RQ1, as is shown in the next section. It is worth mentioning that, if this hypothesis had not been validated, we would not have been able to carry the study on to the next stage, in which we replaced the human helper with a robot.

With respect to H2, our results showed the opposite conclusion. As discussed in the related work, the similarity and complementarity principle depends highly on the context of interaction and on other personality traits not taken into consideration in this paper. Nevertheless, the results suggest that, in a memory game scenario, participants preferred interacting with a helper of a different personality over a helper with similar personality traits.

Informal interviews with participants immediately after the experiment suggested that extroverted participants liked to play more with the introverted helper than the extroverted helper because of a lower number of interruptions or assistance. They mentioned that some assistance given by the extroverted helper would not necessarily lead them to conclude the game faster. In fact, those interruptions would break their concentration during the game which could make them forget locations of cards and achieve worse performance.

However, introverted participants stated that they enjoyed playing more with the extroverted helper than with the introverted helper. When asked the reason for their preference, they mentioned that the extroverted helper was more attentive and was able to better recognise when they needed assistance. Although we cannot draw any general conclusion from those interviews, our results suggest that the similarity and complementary principle depends on the kind of interactions that take place during the task [7]. In order to fully address RQ3, we further investigate the effect of this principle in the context of HRIs for an assistive memory game in the next section.

5 Modelling Helper’s Personality in a Robot

5.1 HHI Behaviour Annotation

Fig. 8
figure 8

The KT annotation tool customised for the assistance level labelling

For the annotation of helpers’ behaviours, we customised the Knowledge Technology (KT) annotation tool, initially developed by the KT group [8] to annotate video samples in terms of emotion. The KT annotation tool is a modular web-based application based on Django (footnote 4) and Python (footnote 5) for media content labelling.

After logging into the system, an annotator watches a video sample from four perspectives and annotates the level of assistance whenever it is given to the participant, by pausing the video to get the timestamp and selecting one of the four levels of assistance (see Table 1) from a drop-down list, as shown in Fig. 8. Annotators were also asked to indicate in the annotation tool when the game started, so that the annotations could later be synchronised with the logs collected by the memory game. A text box was included to allow them to write comments relevant to the player’s and helper’s behaviours. The annotations were saved in an integrated SQLite database (footnote 6) for safe storage and post-processing.
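To make the storage and synchronisation step concrete, the sketch below stores annotations in SQLite and converts video timestamps into game-relative times using the annotated game start. The table layout, column names and function names are hypothetical; the actual schema of the KT annotation tool is not described in the text.

```python
import sqlite3


def build_store():
    """Create an in-memory database with a hypothetical annotation table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE annotation (
               video_id     TEXT NOT NULL,
               video_time_s REAL NOT NULL,
               kind         TEXT NOT NULL
                            CHECK (kind IN ('game_start', 'assistance')),
               level        INTEGER CHECK (level BETWEEN 1 AND 4),
               comment      TEXT
           )"""
    )
    return conn


def assistance_in_game_time(conn, video_id):
    """Return [(seconds since game start, level), ...] for one video."""
    row = conn.execute(
        "SELECT video_time_s FROM annotation "
        "WHERE video_id = ? AND kind = 'game_start'",
        (video_id,),
    ).fetchone()
    if row is None:
        raise LookupError(f"no game start annotated for {video_id}")
    start = row[0]
    rows = conn.execute(
        "SELECT video_time_s, level FROM annotation "
        "WHERE video_id = ? AND kind = 'assistance' ORDER BY video_time_s",
        (video_id,),
    )
    return [(t - start, level) for t, level in rows]
```

Expressing each annotation relative to the game start is what allows it to be aligned with the timestamps in the game logs.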

Since levels of assistance have been objectively defined, only one annotation is necessary for each video sample. Together, the log files and annotations provided the necessary information to better understand factors that might have induced human helpers to assist a player, which was later used to model the proposed autonomous robot helper.

5.2 Helper’s Modelling in an Assistive Robot

Table 3 Robot verbal and non-verbal cues for the extroverted and introverted robot

The verbal and non-verbal social cues were analysed by making use of the annotation tool as well as of the behavioural analyses of the videos. We identified three verbal cues and three non-verbal cues as the most relevant features. With respect to the verbal cues (see Table 3), we decided to use the real helpers’ voices while tweaking loudness, speech rate and pitch to hide their identity. These audio features were shown to have an impact on the judgement of extroversion and introversion in robots [28]. Furthermore, [31, 32, 35] provided evidence that extroverted people speak louder, with a wider vocabulary and a higher pitch, compared to introverted people.

For the manipulation of the helpers’ voices, we used Audacity (footnote 7), a free and open-source digital audio editor. As an additional manipulation of the generated synthetic voice, we tuned the pitch in order to obtain a distinctive child-like voice. This feature is important for keeping consistency with the NAO robot (footnote 8), which is perceived as a child due to its dimensions.

With respect to the non-verbal cues, we selected gestures and postures from video annotations that are considered relevant social cues in terms of extroversion and introversion in the literature [30, 34, 35, 49]. For instance, extroverted people display more body motion than introverted people in terms of number and amplitude [28]. The list of non-verbal cues is shown in Table 3.

For the manipulation of non-verbal features, we pre-recorded the movements of the robots using the NAOqi framework (footnote 9) and extensively tested them for safety validation before running the experiments. Three features were manipulated: the speed of the gestures, the amplitude of the gestures, and the gestures themselves. The latter cannot be reported in Table 3 but can be visualised in the video samples available in our repository (footnote 10). The scripts of the recorded gestures for the introverted and extroverted robot are also available in our repository and could be used by the robotics community in the context of assistive memory games.

Finally, the generated audio files and the pre-recorded movements for each modelled personality were synchronised and deployed in the robot. Each sentence and movement was selected in conformity with a specific state of the system. How and when the decision-making algorithm chose them will be explained in the next section.

It is important to note that even though the extracted features were not so different from those identified in previous work, we argue that an HHI study is crucial to verify whether or not those features are valid and whether others can be adopted due to the specific context in which the robot is employed.

5.3 Robot’s Assistance Behaviour

We investigated different variables that could prompt human helpers to give assistance to a player, including the number of flips of each card, the time since the last assistance, the number of trials after the last assistance, the time since the last successful match, the number of trials after the last successful match, and the time since the last flip, among others. The examination of the video data collected in the HHI experiment indicated that waiting times that were long relative to the individual’s average time to perform the next flip are evidence that the player was struggling to remember specific card locations. These longer waiting events often happened when a participant was trying to remember the location of the card matching the previously flipped card. Behaviours such as a longer waiting time may indicate to the helper the need for assistance. After a careful empirical analysis, we defined a few variables that may correlate with the helpers’ decisions to provide assistance. These variables were used to build the decision-making algorithm.
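The waiting-time cue can be illustrated with a short sketch that compares the current wait with the player's own average time between flips. The ratio threshold and the minimum history length are hypothetical parameters chosen for illustration; the text does not report the values, if any, used in the study.

```python
from statistics import mean


def is_struggling(flip_intervals_s, current_wait_s, ratio=2.0, min_history=3):
    """Flag a wait that is long relative to the player's own average.

    flip_intervals_s: past times (in seconds) between consecutive flips.
    ratio, min_history: hypothetical tuning parameters, not from the study.
    """
    if len(flip_intervals_s) < min_history:
        return False  # too little data to estimate an individual average
    return current_wait_s > ratio * mean(flip_intervals_s)
```

Using the individual average rather than a fixed time limit keeps the cue comparable between fast and slow players.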

5.3.1 Decision-making Algorithm for Assistance Generation

Fig. 9
figure 9

Probability distributions of the different levels of assistance provided by the robot in a given state. An assistance a at level l is drawn according to the conditional probability distributions p(a | s, FSC, CSF). The state s denotes the progress of the game (beginning, middle or end), the variable FSC denotes whether a player flipped the second card in a trial, and finally, the variable CSF denotes whether the card has been spotted for the first time

A statistical decision-making algorithm was developed based on the conditional probability distributions of providing assistance at level l in a given state s of the game. The state s is a categorical variable that describes the progress of the game and takes one of the values {beginning, middle, end}. When \(s = beginning\), a player has found less than or equal to 25% of the pairs. When \(s = middle\), a player has found more than 25% and less than 75% of the pairs. Lastly, when \(s = end\), a player has found more than or equal to 75% of the pairs. After each flip, the decision-making algorithm samples assistance a from the conditional probability distributions p(a | s, FSC, CSF), where FSC is a binary variable that indicates whether the player Flipped the Second Card in a trial and CSF is a binary variable that indicates whether a Card is being Spotted for the First time. The probability distributions for each state of the game given the conditional variables are plotted in Fig. 9. The statistical decision-making algorithm is available at our git repository.
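The sampling step described above can be sketched as follows. This is a minimal illustration and not the implementation in the repository: the function names, the inclusion of a "no assistance" outcome (level 0), and the probability values in the example table are assumptions; in particular, the numbers are placeholders and are not the distributions plotted in Fig. 9.

```python
import random

# Assumption: level 0 stands for "no assistance"; levels 1-4 are as in Table 1.
LEVELS = (0, 1, 2, 3, 4)


def game_state(pairs_found, total_pairs):
    """Map game progress to the categorical state s defined in the text."""
    fraction = pairs_found / total_pairs
    if fraction <= 0.25:
        return "beginning"
    if fraction < 0.75:
        return "middle"
    return "end"


def sample_assistance(table, state, fsc, csf, rng=random):
    """Draw a level from p(a | s, FSC, CSF).

    table maps (state, fsc, csf) to one weight per entry of LEVELS.
    """
    weights = table[(state, fsc, csf)]
    return rng.choices(LEVELS, weights=weights, k=1)[0]


# Placeholder table with a single entry (illustrative values only).
EXAMPLE_TABLE = {("end", True, False): (0.5, 0.1, 0.2, 0.2, 0.0)}
level = sample_assistance(EXAMPLE_TABLE, game_state(7, 8), True, False)
```

One such table per modelled personality would be enough to produce different assistance behaviour from the same sampling code.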

Note that the plots reveal differences in assistance behaviour between the introverted and the extroverted helper. While the extroverted helper frequently encouraged the player after a successful match (Lvl 1 bars), the introverted helper tended to remain still. Also note that, at the beginning of the game (\(s = beginning\)), helpers tended to leave the players to explore the board by not providing much assistance at Lvl 2, 3 and 4. The assistance pattern changes in the middle of the game (\(s = middle\)) according to the players’ behaviour. In this state, players strove to avoid recurrent mistakes while following a mixed strategy of exploration and matching cards. Finally, at the end of the game, most of the cards had been flipped and helpers were more prone to provide assistance at Lvl 2 and 3. In this phase of the game, mistakes mean that the player did not remember the locations of cards they had already seen.

6 Human–Robot Interaction

The objectives of this second experiment are to (i) evaluate whether participants are able to distinguish the personality traits modelled in the robots, (ii) assess whether participants perceive when the robot is controlled by a human (WoZ) or it is running in the fully autonomous mode, and finally, (iii) evaluate whether participants achieve better performance when interacting with the robot helper with a similar or different personality.

In this experiment, we asked each participant to play the memory game four times: once with the assistance of an extroverted robot controlled by the same extroverted helper from the HHI experiment, once with the assistance of an introverted robot controlled by the same introverted helper from the HHI experiment, once with the assistance of a fully autonomous extroverted robot, and once with the assistance of a fully autonomous introverted robot. As in the HHI study, we used a Latin square design to select the order in which the four sessions were carried out and thereby avoid order effects. Finally, to motivate participants and keep them engaged during the study, they were told that the best player (the player who concluded the game with the fewest mistakes) would receive a prize.
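One simple way to obtain such session orders is a cyclic Latin square, sketched below. Which particular Latin square was used in the study is not stated, so the cyclic construction, the condition labels and the assignment of participants to rows are assumptions of this sketch.

```python
CONDITIONS = (
    "extroverted WoZ",
    "introverted WoZ",
    "extroverted autonomous",
    "introverted autonomous",
)


def latin_square(conditions):
    """Cyclic Latin square: row r is the condition list rotated by r."""
    n = len(conditions)
    return [
        [conditions[(row + col) % n] for col in range(n)]
        for row in range(n)
    ]


def order_for(participant_index, conditions=CONDITIONS):
    """Assign participants to the rows of the square in turn."""
    square = latin_square(conditions)
    return square[participant_index % len(square)]
```

Each condition appears exactly once in every session position across the four rows. A cyclic square does not balance which condition immediately precedes which, so a balanced (Williams) design would be needed if first-order carry-over effects were a concern.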

The behaviour of the robot was modulated according to its personality and findings from the HHI experiment (see Table 3). During the game, the robot was able to display verbal and non-verbal social cues at three different game events: (i) before the user flipped a card, (ii) after the user flipped the first card in a trial, and (iii) right after the second flip (see Sect. 5.3.1). In the last case, the robot congratulated the players if they succeeded or encouraged them if they made a mistake.

It is worth mentioning that the role played by the non-social (neutral) helper was not critical in this second experiment. Rather than defining a new baseline to evaluate personality perception, we used the data collected in the HHI experiment as the baseline. Another limiting factor that led to the decision not to model the neutral helper was the duration of an experimental session. According to our pilot study, participants showed signs of boredom and distraction after playing the memory game more than four times.

6.1 Hypotheses

We aim to assess the following hypotheses:

H3: Participants are able to identify the robot’s personality (i.e., introverted or extroverted) after playing with it only once.

H4: Participants are able to distinguish between a WoZ and a fully autonomous robot.

H5: Participants with a complementary personality to the helper achieve better performance than when playing with the robot helper with a similar personality.

The first hypothesis H3 together with H1 will contribute to addressing RQ1. On the other hand, H4 supports our RQ2. Finally, H5 was formulated based on the results obtained in the HHI experiment and together with H2 will help us to address RQ3.

Fig. 10
figure 10

The HRI experimental set-up. Figure 10a and d show the experimental set-up from different perspectives, in particular in Fig. 10d we highlight (red square) the locations of the cameras. Figure 10b and e show the helper’s workstation for controlling the robot, whereas Fig. 10c and f show the view of the cameras located in front and on the side of the participant, respectively. (Color figure online)

6.2 Experimental Set-up

In the HRI sessions, the helpers selected in the HHI experiment would occasionally move in the lab to control their respective WoZ robots. To avoid giving any clues about the research questions investigated in our experiments and to mitigate distractions, a three-square-metre room was built to completely isolate participants from outside events. Figure 10 depicts the experimental scenario. The touch-screen monitor was placed on the right with the assistive robot on top of it, close to the area where cards were displayed, such that it could provide visual assistance to players (e.g. pointing to rows and columns) while interacting with them. On the left side of the room, a laptop was placed for filling out questionnaires.

The experimenters entered the room on only three occasions in a successful run: (i) before the experiment, to explain the scenario, the phases of the experiment, and the warm-up trial designed to let the participant get used to the game and the touch monitor; (ii) during the experiment, to change the robots; and (iii) at the end of the experiment, to conclude the session. Outside the room (Fig. 10b), the helpers could remotely control the robot by sending keyboard commands that provided the participant with any of the levels of assistance listed in Table 1. The helpers had access to the same information as in the previous experiment from two monitors, as shown in Fig. 10b and e. On the left screen, the helpers could monitor players’ behaviours from a frontal camera. On the right, through a screen mirroring the touch-screen monitor, they could track the current state of the game and access the metadata (solution and number of flips for each card).

As can be observed in Fig. 10, a NAO robot was employed for this experiment. In order to convey the impression to the participants that they were playing with two different robots, we used two robots, each playing a different personality role. Having robots with distinct visual and vocal characteristics would leverage the human perception of their social aspects during interactions due to the embodiment factor. In fact, according to Wainer et al. [52], the presence of a physical robot in task-oriented interactions can influence a person’s perception of the robot’s capabilities and social attributes. In order to record audiovisual data to investigate the same set of verbal and non-verbal features as in the HHI experiment, two Logitech C920 webcams were used. As shown in Fig. 10d by red squares, one camera was placed in front of the robot to capture the player’s facial expressions and gaze during the game (see Fig. 10c). The second camera was placed on the left side of the touch-screen monitor to record the player’s upper body in order to analyse body movements and gestures (see Fig. 10f).

6.3 Questionnaires

In order to verify whether the personality traits with regard to extroversion and introversion were successfully modelled in the robots, we asked the participants to fill out the same BFI questionnaires adopted in the HHI experiment. Thus, in addition to filling out one questionnaire about themselves (i.e., “I see myself as someone who...”), they filled out one BFI questionnaire after playing the memory game with each of the four robots to describe their perception of that robot’s personality traits. The results are compared with the data collected from the HHI experiment and discussed in the following sections.

Although the BFI questionnaire provides the data to verify the modelled personality traits, it does not provide all the information about the performance of the robot. Besides the score of the game, we adopted the Godspeed test [9] to measure the users’ perception of the robot based on five concepts: anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety.

Anthropomorphism captures the human perception of similarities between robots and humans, not only in form and physical characteristics but also in behaviour. Animacy evaluates how lifelike the robot is. The higher the animacy level, the higher the probability that the robot is able to involve humans emotionally in the interactions [22]. Likeability is usually related to audiovisual behaviour and influences the positive first impressions of a person or, in our case, a robot. Perceived intelligence evaluates whether the embedded artificial intelligence agent is able to generate behaviours that are consistent with human behaviours in the same condition. In the memory game scenario, suppose that there is only one card left to be flipped. A human would not give any assistance, but an autonomous robot may decide to suggest that the human player flip the last card if he or she made several mistakes when trying to find that pair, which could be perceived as less intelligent. Finally, perceived safety captures the human perception of danger and comfort during the interaction.

Since the Godspeed test has become a standard measurement technique for HRI experiments, our results can be used by the robotics research community for comparison purposes and reproducibility. In our experiments, we use the complete Godspeed questionnaire. The list of items of the questionnaire to be answered in a semantic differential scale from one to five can be seen in Appendix B.
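For comparison purposes, sub-scale scores can be computed as in the sketch below, which takes the score of a sub-scale to be the mean of its item ratings on the one-to-five scale. The item counts are those of the standard Godspeed questionnaire and should be checked against Appendix B; averaging the items (rather than summing them) is an assumption of this sketch, as are the function and variable names.

```python
from statistics import mean

# Items per sub-scale in the standard Godspeed questionnaire
# (assumption: Appendix B follows the standard form).
GODSPEED_ITEM_COUNTS = {
    "anthropomorphism": 5,
    "animacy": 6,
    "likeability": 5,
    "perceived_intelligence": 5,
    "perceived_safety": 3,
}


def godspeed_scores(ratings):
    """ratings: dict sub-scale -> list of item ratings, each from 1 to 5."""
    scores = {}
    for subscale, count in GODSPEED_ITEM_COUNTS.items():
        items = ratings[subscale]
        if len(items) != count:
            raise ValueError(f"{subscale}: expected {count} items")
        if any(not 1 <= r <= 5 for r in items):
            raise ValueError(f"{subscale}: ratings must lie between 1 and 5")
        scores[subscale] = mean(items)
    return scores
```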

6.4 Results

6.4.1 Personality Test of the Robot Helpers

In this section, we evaluate whether participants were able to recognise the robot’s personality (H3). To do so, a 2 (personality: extroversion and introversion) \(\times \) 2 (autonomy: WoZ and autonomous) repeated-measures analysis was performed on each of the sum scores of extroversion and agreeableness. The results are reported in Fig. 11.

  1. Extroversion. The main effect of the personality of the robot helpers on extroversion was significant, F(2, 21) = 49.63, p < .001, and \(\eta _{p}^{2}\) = .68, indicating that the extroverted robot helper (M ± SE = 29.06 ± .64) received significantly higher scores than the introverted robot helper (M ± SE = 20.69 ± .97). The main effect of autonomy was not significant, F(2, 21) = 1.47, p = .24, and \(\eta _{p}^{2}\) = .06, showing that there was no significant difference between the WoZ (M ± SE = 24.38 ± .69) and autonomous robot (M ± SE = 25.38 ± .72). Besides that, the interaction effect between personality and autonomy was significant, F(2, 21) = 12.18, p < .01, and \(\eta _{p}^{2}\) = .35. The simple effect analysis showed for both levels of autonomy that extroverted robots (WoZ: M ± SE = 30.42 ± .91, autonomous: M ± SE = 27.71 ± 1.05) received significantly higher scores than the introverted robots (WoZ: M ± SE = 18.33 ± 1.14, autonomous: M ± SE = 23.04 ± 1.14) with p < .01. For the introverted robots, the autonomous robot received significantly higher scores than the WoZ robot, p < .01. For the extroverted robots, however, there was no significant difference between autonomous and WoZ robots, p = .08.

  2. Agreeableness. The main effect of the personality of the robot helpers on agreeableness was not significant, F(2, 21) = .03, p = .87, and \(\eta _{p}^{2}\) = .001. This is an expected result since specific features associated with the agreeableness trait were not investigated in this paper and hence were not modelled in the robot helpers. There was no significant difference between the extroverted robot helper (M ± SE = 31.85 ± .96) and the introverted robot helper (M ± SE = 32.06 ± .90). The main effect of autonomy was not significant, F(2, 21) = .04, p = .85, and \(\eta _{p}^{2}\) = .002, showing that there was no significant difference between WoZ (M ± SE = 32.06 ± .79) and autonomous robots (M ± SE = 31.85 ± .96). Besides that, the interaction effect between personality and autonomy was not significant either, F(2, 21) = 2.05, p = .17, and \(\eta _{p}^{2}\) = .08.
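As a reading aid for the extroversion results in item 1, the sketch below shows how the main-effect means relate to the four cell means: in a fully within-subject design where every participant contributes to every cell, each marginal mean is the unweighted average of the two corresponding cell means. The balanced-design assumption and the helper name are ours; the cell means are the ones reported above.

```python
# Extroversion cell means (M) reported for the four robot conditions.
CELL_MEANS = {
    ("extroverted", "WoZ"): 30.42,
    ("extroverted", "autonomous"): 27.71,
    ("introverted", "WoZ"): 18.33,
    ("introverted", "autonomous"): 23.04,
}


def marginal_mean(cells, factor_index, level):
    """Average the cell means that share `level` on the chosen factor.

    factor_index 0 selects personality, 1 selects autonomy.
    Assumption: balanced design, so cells are weighted equally.
    """
    values = [m for key, m in cells.items() if key[factor_index] == level]
    return sum(values) / len(values)


extroverted = marginal_mean(CELL_MEANS, 0, "extroverted")  # about 29.07
introverted = marginal_mean(CELL_MEANS, 0, "introverted")  # about 20.69
woz = marginal_mean(CELL_MEANS, 1, "WoZ")                  # about 24.38
autonomous = marginal_mean(CELL_MEANS, 1, "autonomous")    # about 25.38
```

These values agree with the reported main-effect means (29.06, 20.69, 24.38 and 25.38) up to rounding.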

Fig. 11
figure 11

Scores of extroversion and agreeableness of the BFI on robot helpers with different personalities and levels of autonomy. n.s. denotes p >.05, ** denotes .001 < p < .01, *** denotes p < .001

6.4.2 Memory Game Performance with Different Robots

Fig. 12
figure 12

Number of mistakes made by the participants when playing with the robot helpers with different levels of autonomy. * denotes .01 < p < .05

In this section, we assess the users’ performance when interacting with robots with different personalities as well as with different levels of autonomy. To do so, a 2 (personality: extroversion and introversion) \(\times \) 2 (autonomy: WoZ and autonomous) repeated-measures analysis was performed on the game performance of the participants. The results are reported in Fig. 12. One participant was removed because his or her number of mistakes was more than 3 SDs above the group mean.

The main effect of the personality of the robot helpers on performance was significant with F(2, 21) = 5.57, p < .05, and \(\eta _{p}^{2}\) = .20, indicating that participants made significantly fewer mistakes when playing with extroverted robots (M ± SE = 32.72 ± 1.12) than the introverted robots (M ± SE = 35.20 ± 1.38) even though the total average number of effective assistance (i.e., assistance that indicates the location of cards: Lvl 2, 3 and 4) provided by the introverted autonomous robot was higher than the extroverted robot, as shown in Table 4. The main effect of autonomy was not significant, F(2, 21) = .55, p = .47, and \(\eta _{p}^{2}\) = .03, showing that there was no significant difference between the WoZ robot (M ± SE = 33.52 ± 1.38) and the autonomous robot (M ± SE = 34.39 ± 1.18). Besides that, the interaction effect between personality and autonomy was not significant either, F(2, 21) = .16, p = .69, and \(\eta _{p}^{2}\) = .01.

Table 4 Average number of assistance per level given by each robot

6.4.3 Impact of Personality Similarity Principle Between Human Players and Robot Helpers on Game Performance

In this section, we aim to evaluate the impact of the personality similarity principle between participants and robot helpers (H5). Extroverted and introverted participants were categorised using the same threshold as in the HHI experiment. Participants (n = 7) whose scores ranked in the top 27\(\%\) on the extroversion sub-scale of the BFI were categorised as extroverted, and participants (n = 7) who ranked in the bottom 27\(\%\) were categorised as introverted. An independent-samples t-test showed that the top 27\(\%\) participants (M ± SE = 31.00 ± 1.21) scored significantly higher than the bottom 27\(\%\) participants (M ± SE = 18.29 ± 1.51) with t(12) = 6.57 and p < .01.

Fig. 13
figure 13

Number of mistakes made by the participants when playing with a robot helper with a similar (sim) personality (WoZ or Autonomous, Auto) and a different (diff) personality (WoZ or Autonomous, Auto). ** denotes .001 < p < .01

To examine the impact of the personality similarity principle between humans and robots on the game performance, a 2 (personality similarity: same and different) \(\times \) 2 (autonomy: WoZ and autonomous) repeated-measures analysis was performed on the game performance of the participants. The results are reported in Fig. 13.

The main effect of personality similarity was significant, F(2, 21) = 18.37, p < .01, and \(\eta _{p}^{2}\) = .59, indicating that participants made significantly fewer mistakes when playing with robots with a similar personality (M ± SE = 32.82 ± 1.36) than with robots with a different personality (M ± SE = 36.75 ± 1.70). The main effect of autonomy was not significant, F(2, 21) = .03, p = .86, and \(\eta _{p}^{2}\) = .002. There was no significant interaction effect between those factors, with F(2, 21) = .004, p = .95, and \(\eta _{p}^{2}\) = .001.

Fig. 14
figure 14

Godspeed questionnaire scores on anthropomorphism, animacy, likability, perceived intelligence, and perceived safety on robot helpers with different personalities (introversion and extroversion) and different levels of autonomy (WoZ and autonomous). n.s. denotes p >.05, * denotes .01 < p < .05, ** denotes .001 < p < .01, *** denotes p < .001

6.5 Godspeed Questionnaire

With the purpose of comparing the participants’ perception of robots with different personalities and with different levels of autonomy (H4), a 2 (personality: extroversion and introversion) \(\times \) 2 (autonomy: WoZ and autonomous) repeated-measures analysis was performed on each of the five sub-scales of the Godspeed questionnaire. The results are reported in Fig. 14.

  1. Anthropomorphism. The main effect of the personality of the robot helpers on anthropomorphism was significant, F(2, 21) = 6.62, p < .05, and \(\eta _{p}^{2}\) = .22, suggesting that participants viewed extroverted robots as more human-like (M ± SE = 2.75 ± .10) than the introverted robots (M ± SE = 2.54 ± .13). The main effect of autonomy was not significant, F(2, 21) = 2.09, p = .16, and \(\eta _{p}^{2}\) = .08. Finally, there was no significant interaction effect between personality and autonomy, F(2, 21) = .56, p = .46, and \(\eta _{p}^{2}\) = .02.

  2. Animacy. The main effect of the personality of the robot helpers on animacy was significant, F(2, 21) = 13.14, p < .01, and \(\eta _{p}^{2}\) = .36, suggesting that participants viewed the extroverted robots as more lifelike (M ± SE = 3.21 ± .10) than the introverted robots (M ± SE = 2.81 ± .14). The main effect of autonomy was significant, F(2, 21) = 18.12, p < .001, and \(\eta _{p}^{2}\) = .44, suggesting that participants viewed the autonomous robots (M ± SE = 3.17 ± .10) as more lifelike than the WoZ robots (M ± SE = 2.86 ± .14). Finally, there was no significant interaction effect between personality and autonomy, F(2, 21) = 3.33, p = .08, and \(\eta _{p}^{2}\) = .13.

  3. Likeability. The main effect of the personality of the robot helpers on likeability was not significant, F(2, 21) = .50, p = .49, and \(\eta _{p}^{2}\) = .02. The main effect of autonomy was not significant, F(2, 21) = 2.49, p = .13, and \(\eta _{p}^{2}\) = .10. The interaction between personality and autonomy was not significant either, F(2, 21) = 3.01, p = .10, and \(\eta _{p}^{2}\) = .12.

  4. Perceived Intelligence. The main effect of the personality of the robot helpers on perceived intelligence was not significant, F(2, 21) = .04, p = .84, and \(\eta _{p}^{2}\) = .002. The main effect of autonomy was not significant, F(2, 21) = 3.03, p = .10, and \(\eta _{p}^{2}\) = .12. The interaction between personality and autonomy was not significant either, F(2, 21) = .71, p = .41, and \(\eta _{p}^{2}\) = .03.

  5. Perceived Safety. The main effect of the personality of the robot helpers on perceived safety was significant, F(2, 21) = 5.34, p < .03, and \(\eta _{p}^{2}\) = .19, suggesting that participants viewed the introverted robots as safer (M ± SE = 3.17 ± .10) than the extroverted robots (M ± SE = 2.90 ± .13). The main effect of autonomy was significant, F(2, 21) = 6.32, p < .05, and \(\eta _{p}^{2}\) = .22, suggesting that participants viewed the WoZ robots (M ± SE = 3.15 ± .10) as safer than the autonomous robots (M ± SE = 2.92 ± .12). The interaction effect between personality and autonomy was also significant, F(2, 21) = 18.62, p < .001, and \(\eta _{p}^{2}\) = .45. The simple effect analysis showed that participants viewed the introverted robot (M ± SE = 3.50 ± .56) as safer than the extroverted robot (M ± SE = 2.79 ± .71), p < .001, only when the robots were controlled by a human helper (i.e., the WoZ robots). In addition, participants viewed the introverted WoZ robots (M ± SE = 3.50 ± .12) as safer than the introverted autonomous robots (M ± SE = 2.85 ± .14), p < .001.

6.6 Discussion

In this section, the results of the HRI experiment are discussed and the initial hypotheses defined in Sect. 6.1 are evaluated, namely: (H3) participants can identify the robot’s personality; (H4) participants cannot distinguish between a WoZ and a fully autonomous robot; and (H5) participants achieve better performance when playing with a robot whose personality is complementary to their own than when playing with a robot with a similar personality.

As in the HHI experiments, the participants were able to identify the different personalities of the robot helpers. Hence, our initial hypothesis H3 is supported. Our second hypothesis (H4) also stands, since the participants were not able to recognise whether they were playing with the WoZ or the autonomous robots, regardless of the personality traits displayed by the robots. Our results indicate that the extroversion and introversion features from the HHI experiment were successfully modelled in the assistive robot helpers and that the decision-making algorithm enabled the robots to operate in a fully autonomous manner.

With respect to H5, however, the results are in contrast to our initial hypothesis. In the HRI experiment, participants performed better with a robot that displayed personality traits similar to their own. This result, although opposite to the outcome of the HHI experiment, is in agreement with the result reported by Andrist [6]. As indicated in Sect. 2, researchers are still exploring this very complex aspect and results from previous studies are currently discordant. We conjecture that humans behave differently when interacting with other humans than when interacting with robots in the same context. As reported by De Graaf [18], such disparities can be the result of norms and stereotypes that individuals apply to humans but not to robots. This last point might be the reason why the same kind of interactions provided by the helpers (human and robot) led to different outcomes.

In addition to the BFI questionnaire, we administered the Godspeed questionnaire to the participants to assess their perception of the robot. Results from the latter questionnaire show that participants perceived the extroverted robot (WoZ and fully autonomous) as more lifelike. This is expected since, in general, the extroverted robot is more dynamic and active: it has a larger vocabulary and performs wider gestures than the introverted robot. Another interesting finding is that participants perceived the fully autonomous robot as more lifelike than the WoZ robot. This aspect may be related to the capability of fully autonomous robots to provide assistance at the most appropriate moments.

Perceived safety also differed significantly between the introverted and extroverted robot helpers. Participants perceived the introverted robots as safer than the extroverted robots. Moreover, participants perceived the introverted WoZ robot as safer than the introverted autonomous robot. The perception of the introverted robot as the safest robot can be attributed to its less expressive movements and slower speed.

7 General Discussion and Conclusions

In this work, we presented a human-like personality model based on HHI observations in the context of a memory game. We also developed a decision-making algorithm to empower assistive robots with the capability to provide different levels of assistance in a fully autonomous manner based on the state of the memory game. Within this framework, participants played the game obtaining support from an introverted or extroverted helper providing different levels of assistance.

Firstly, we conducted an HHI study to analyse the helpers’ behaviours in terms of relevant verbal and non-verbal social cues displayed during interactions, as well as to develop a decision-making algorithm for providing assistance according to the personality and the state of the memory game. Our results demonstrated that participants were able to recognise the two helpers’ personalities and that, in the context of an assistive memory game, they performed better when playing with the assistance of a human helper with a personality different from their own.

Secondly, in order to address our first research question (RQ1), we conducted an HRI study in which we evaluated whether and to what extent distinctive verbal and non-verbal social cues of extroverted and introverted personality traits can be modelled in a robot. Our findings show that participants were able to distinguish the robot modelled with extroverted social cues from the robot modelled with introverted social cues.

Additionally, participants could not perceive any difference between the WoZ robot and the fully autonomous robot, with the exception of the perceived safety of the introverted WoZ robot. We believe that this difference does not originate from the autonomous capability of the robot itself, but is likely related to the lower number of assistance actions triggered by the human helper sending commands to the robot. The same introverted human helper behaved differently when providing assistance himself and when controlling the robot, as shown in Tables 2 and 4. With less assistance sent by the human, the robot remains more still and, since it seldom moves, is perceived as safer. Therefore, we conclude that this result provides strong evidence in favour of RQ2.

Finally, we found that whether the similarity or the complementarity principle applied depended on whether the helper was a human or a robot, as the results of the two studies contradicted each other (RQ3).

In summary, the most relevant highlights of our research are:

  • We showed that certain social cues related to personality and observed from the HHI experiment can be successfully modelled in an assistive social robot in the context of a memory game.

  • We developed and evaluated a personality model on a robot that can autonomously provide assistance to humans in a memory game.

  • We demonstrated that different personalities were perceived by the participants in both the HHI and HRI studies.

  • We showed that whether the similarity or the complementarity principle applied depended on whether the helper was a human or a robot.

  • We demonstrated that an extroverted robot is perceived as more lifelike than an introverted one, while the latter is perceived as safer than the former.

The last two points deserve further discussion. With respect to the similarity/complementarity principle, we argue that the different results may be due to the norms and stereotypes that human beings have about their peers but not yet about robots. Concerning the reason why the extroverted robot was perceived as more lifelike, we believe this was related to the wider range of movements it was able to convey. For the same reason, we believe the introverted robot was perceived as safer, as its movements were slower and less expressive.

8 Limitations and Future Work

Personality is a very complex notion. As humans, we often struggle to identify and measure personality in people since it depends on several factors such as context, heredity, culture, and experience. Our robot with an embedded human-like personality was able to provide assistance by using only limited input, such as users’ performance during the first HHI study.

For a more effective HRI, we hypothesise that a more complex system that takes into consideration users’ facial expressions and postures should be designed. For instance, a confused facial expression after a flip could suggest to the robot helper that the player needs assistance. With the advent of deep learning, automatic emotion perception systems have shown significant progress in recognition performance and could be used as an integral part of the decision-making process of the robot [47]. Another limitation is that our framework does not have a dialogue system. In the HHI experiments, for example, participants verbally communicated with the human helper and, sometimes, explicitly requested assistance. Extending the robot’s behaviours and capabilities would certainly contribute to the perception of the robot as more socially intelligent and lifelike.

With respect to the defined and modelled social cues, which the questionnaires administered to the participants showed to be effective, we point out the following current limitations:

  • The manual process of extracting them is tedious and time-consuming, thus an automatic way to annotate specific user features would be worth exploring.

  • The features were extracted from only one extroverted and one introverted helper, thus they were specific to those individuals’ profiles. Further experiments could be conducted to investigate the behaviours of different extroverted and introverted helpers in the same context. However, we regard personality as a unique characteristic of each human being, so personalisation beyond stereotyped personality is important, especially for long-term interaction.

Along this line, as future work, we aim to evaluate separately the impact of verbal and non-verbal social cues and whether they contribute equally to the assessment of the robot’s personality. Another important aspect that is worth investigating is the effect of the robot platform. In our experiments, we adopted the NAO robot as the robot helper. Although NAO is known to have very high acceptance, it presented several physical limitations, including limited degrees of freedom in its arms. Hence, it is not possible to produce more complex social gestures with a NAO robot. Moreover, due to its static face, no facial expressions can be used for non-verbal social communication. A robot with an expressive face could, for instance, display a happy face while encouraging the human player after a successful flip. Besides, future research may examine the gender effect during HHI and HRI. For instance, previous research showed that participants trusted robots of the opposite “gender” more and exhibited more pro-social behaviours towards them [46].

Moreover, we note that a memory game has some limitations for the proposed study. Since it is a cognitive exercise in which participants need to remember the cards’ locations, in some cases they were more focused on the game itself rather than on interacting with the helper. We speculate that a different game in which memory is not a primary concern might be investigated to foster users’ collaboration and interaction with the robot.

Finally, this work, as briefly mentioned in Sect. 1, is framed in the context of deploying a robotic system capable of furnishing tailored assistance to people affected by cognitive impairment while they are carrying out cognitive exercises [5]. These findings will contribute to extending the framework presented in [4]. Specifically, the Cognitive Assistive Robotic Framework (CARF) will be integrated with a personality module, which will allow the caregiver to set, among all the preferences related to the specific user (e.g., mental and physical impairment and the robot’s interaction modalities), the robot personality that best suits the user. This will provide the robot with a wider range of possible behaviours, obtained by combining verbal and non-verbal social cues. We speculate that personality can contribute to enhancing the patient’s engagement and acceptance of the robot.