1 Introduction

Asimov’s “Three Laws of Robotics” [1] govern fictional robots’ behaviors, and these laws persist in contemporary imaginaries about how robots should behave: do not injure humans, obey humans, engage in self-protection. How do humans respond, though, when a robot behaves “badly” by breaking these or other moral norms? Popular and scientific discourse alike attend to the potential for machine agency—the ability to variably act according to self-regulating abilities and intentions [2]—to engender both anxiety and acceptance of social machines. However, current human–robot interaction scholarship generally engages morality as holistic “goodness” or “badness” or reduces it to singular exemplars; this contrasts with contemporary moral psychology’s increasing engagement of the construct as multidimensional [3]. Specifically, Moral Foundations Theory [4] parcels morality into five foundations (care, fairness, authority, loyalty, purity) and a sixth candidate foundation (liberty [5]). This study seeks to build on current understandings of how social judgments are impacted by robots’ (im)moral behaviors by assessing (a) how attributions of behavioral goodness and responsibility may vary by moral foundation and (b) whether foundation-specific attributions may differentially contribute to social evaluations of robots. Two studies (an online survey and a laboratory replication) indicate that judgments of (im)moral behavior may be agent-agnostic and that all moral foundations contribute to behavior and agent evaluations. However, physical presence and agent type play a role in the assignment of responsibility for those behaviors.

2 Review of Literature

Extant literatures suggest that fear, anxiety, and mistrust in robots sometimes manifest independent of particular behaviors (e.g., [6]); these negative dispositions could be a function of the robot’s cued ontological status (i.e., agent-category liminality) engendering a “wrong outside, wrong inside” heuristic [7, p. 44]. Human–robot interaction is governed by many of the same norms and expectations held for human–human interactions [8], but people hold ontological-class heuristics (cf. [9]) that introduce deviating expectations. For instance, people desire robots that are emotionally and socially warm and competent—more so than robots are generally seen as being [10]. When robots are unable to meet humans’ expectations, any behavior that does not meet both normative and desired expectations may be seen as generally “bad.” This perceived badness may erode trust (the acceptance of vulnerability and/or the expectation of reliability in the face of uncertainty [11]). Although people may have a “prevailing distrust” of robots, that distrust may be softened in situations where the robot exhibits efficient and accurate performances that are useful to humans [12, p. 649].

Such negative responses may be exacerbated when a robot is perceived to have behaved badly in overt and specific ways, as in the violation of a valued norm. Considerations of robot “badness” take at least two forms: functional and moral deviations (cf. [12]). Functional deviations are those in which the robot commits errors or behaves in ways that are technically or contextually inappropriate, such as forgetting information [13] or when a function-focused robot suddenly appears emotional [14]. The magnitude of robot errors is associated with the magnitude of lost trust [15]. Moral deviations—a focus of this investigation—are those in which the robot violates principles for right and good behaviors, as when a robot might harm humans [16] or become rebellious [17]. Less addressed in current literature are ways that robots may be perceived as morally good and may be assigned moral praise. Because robots must be perceived as morally competent if they are to be integrated into human society [18], more extensive exploration of human perception of robot behavior as both variably “good” and “bad” is warranted.

2.1 Moral Foundations as a Framework for Understanding Robots’ Behaviors

Humans variably ascribe moral status to other agents—including robots—based on exhibition of moral norms, vocabularies, cognitions, actions, and expressions [19]. Such ascription cannot be monolithic because morality is not homogeneous, so machine morality must be considered through a lens that accounts for individual, contextual, and cultural differences. A useful framework for undertaking that endeavor is Moral Foundations Theory (MFT [4]), which posits that moral evaluations of events or agents are a function of intuitive, structurally evolved reactions to at least five moral foundations (upholding and violating pairs: care/harm, fairness/cheating, authority/subversion, loyalty/betrayal, purity/degradation) and a sixth candidate foundation (liberty/oppression [5]). These foundations form a moral “matrix” by which moral leanings vary across time and culture and in individual valuations [2, p. 125]. MFT rejects perspectives that rely exclusively on moral reasoning (i.e., good and bad are rationalized, post hoc [21]) in favor of moral intuition [22]. In other words, humans have gut reactions to situations that—according to valuations of specific, discrete moral fields—lead them to interpret those situations as good or bad; these immediate moral intuitions may be followed by moral reasoning (see [23]). Thus, MFT is a suitable framework for examining both implicit and explicit moral evaluations.

Although MFT has been suggested as a framework to consider robots as ideal moral agents [24] and engaged in relation to humans as they consider machine agents (e.g., [25]), few studies have formally applied MFT to perception of machine agents. Most notably, evidence suggests that AI violation of fairness, purity, or liberty norms in actual news events resulted in reduced goodness evaluations [26]. Generally, however, investigations of perceived machine morality rely on canonical psychological vignettes such as the trolley problem [27], which are subject to individual differences in moral-foundation valuations of care and fairness [28]. Although some contend that all immoral events are interpreted primally as harm violations [29], pluralistic approaches are required to understand the messy complexities of lived moral experience [30]. Despite limited formal application, extant scholarship has engaged some of these domains discretely. All of the following definitions are grounded in MFT as outlined in foundational works [5, 20, 30], as are the associated moral virtues—socially constructed attributes that are acquired or learned in relation to foundations [31].

2.1.1 Care/Harm

The care/harm foundation is grounded in humans’ propensity for social attachment and linked to virtues of kindness, gentleness, and compassion. The potential for robots to harm or care for humans is perhaps the most widely studied moral foundation, potentially driven by imagined posthumanist and transhumanist futures (as a feared or hoped-for existential shift [32]). Harm by robots is linked to apprehension of machine agents in myriad populations (e.g., factory workers [33]), while other populations see value in the potential for robots to offer social and functional care (e.g., in the care of older adults [34]). Concern about harmful robots is conditional: people are willing to support harmful robots when they protect human ingroups [35].

2.1.2 Fairness/Cheating

The fairness/cheating foundation emerged from evolutionary propensities toward mutual altruism (i.e., ensuring everyone has a fair chance) and is linked to justice, equity, and trustworthiness virtues. Fairness has been considered in robot design [36] and humans commonly exhibit biases toward machines as more systematic and unbiased than humans [37]. A cheating robot is perceived as more agentic [38] and people may defend a bribing robot [39]. Notably, people may see machines as having lower moral authority over fairness, feeling less guilty when cheating in front of a robot versus a human [40].

2.1.3 Loyalty/Betrayal

Intuitions related to ingroup loyalty/betrayal are thought to have emerged through humans’ aggregation into coalitions and families, fostering construction of self-sacrifice, fanship, faithfulness, and patriotism virtues. Little research formally evaluates perceptions of dis/loyal behaviors by robots—a conspicuous absence given popular discourse related to robots’ potential rebellion against their makers [41]. It may be inferred, however, that robots could be subject to loyalty norms, given humans’ favoring of ingroup robots over outgroup humans [42].

2.1.4 Authority/Subversion

Humans’ social evolution is grounded in hierarchical dominance structures such that we may be intuitively deferent to institutions and superiors (e.g., in work, play, family, law); these tendencies gave rise to virtues of obedience and piety. The relation between robots and human authorities may exist along a four-point continuum, ranging from no human authority (machine defers if it decides to), to suggestive (machine negotiates and decides), to directed (machine suggests alternatives; human decides), to complete (machines obey humans) [43]. People trust robots more when they defer to and mirror human behaviors, compared to the inverse [44]. However, humans may sometimes welcome deference to robots as authorities, as when they outperform humans on complex tasks [45].

2.1.5 Purity/Degradation

Intuitions regarding purity (also called sanctity) are thought to have been shaped by aversions to contamination, including exposure to sickness, preceding social construction of temperance and chastity virtues. A review of literature revealed no empirical investigations into perceptions of im/pure robot behaviors. However, it may be that purity upholding/violation for robots is based on different criteria. For instance, people may interpret computer glitches, bugs, or viruses as machine impurities—corruptions of the mechanical body’s functions—such that a robot that has caught a virus may evoke empathy [46]. Alternatively, robots may be seen as inherently impure because they are made by humans “playing God” rather than being born (see [47]) and so lack the human-essential soul and heart; however, religious robots may be seen as somewhat sacred themselves [48].

2.1.6 Liberty/Oppression

The candidate foundation of liberty/oppression is grounded in reactance to agents or forces of control, including impositions of others’ moral codes [5], and is linked to virtues of individualism, nonconformity, and independence. Volumes have been dedicated to the question of whether robots have rights and patiency akin to those of humans (e.g., [49]); however, there is a paucity of work on human perceptions of robots upholding liberty (their own or others’) and committing oppression. One study suggests that a robot’s liberty-upholding appeals to humans resulted in stronger caring for and attraction to the robot than did a threat that robots might violate the authority of humans and harm them [50].

To understand the ways that event- or situation-specific impacts of im/morality may distinctly draw on these foundations (each differentially weighted by human interlocutors) as part of an integrative moral matrix, it is prudent to explore how particular foundations may influence agent judgments.

2.2 Agent-Class Influences on Domain-Specific Behavior Evaluations

People (pre)consciously categorize agents and objects into ontological classes—kinds of things—based on signaled properties; those classifications serve as implicit or explicit frames for making meaning about an agent’s status or behavior [51]. Evidence suggests that robots constitute a distinctive ontological category apart from humans or inanimate objects [9]. Such categorization prompts different expectations for agents as each class engages moral norms: robots are expected to sacrifice one person for the good of many, but humans are assigned more blame for the same action [52]. This blame imbalance is mirrored in evaluations of independent AIs [53]. Moral violations may be attributed to a machine agent independently [26] or in conjunction with blaming affiliated users, programmers, or institutions [54].

Behavior evaluations comprise at least two factors: moral judgment and blame judgment [55]. Moral judgments include evaluation of events (e.g., goodness or permissibility) that unfold against the backdrop of set and sustained norms (e.g., imperatives, priorities). Blame judgments include evaluation of agents (e.g., their action responsibility) that unfold as people judge what caused the event, whether action was intentional, and what the actor’s obligations were; in formulating blame, people may be more inclined to blame outgroup members (i.e., robots [56]). It is not yet well-understood whether or how moral and blame judgments may manifest differently across the moral matrix: (RQ1) (How) do evaluations of agent (a) goodness and (b) responsibility vary by moral foundation?

2.3 Contributions of Domain-Specific Moral Evaluations to Social Evaluations

Although MFT posits that people engage all six foundations, people with various worldviews may assign different weights to those foundations [20]. Further, each foundation has particular triggers that render moral intuitions accessible: care by signals of suffering or by nurturance-priming cuteness; fairness by pain from broken social contracts; loyalty by ingroup/outgroup signals; authority by behavior indicating rank; purity by disgust-inducing smells or sights [30]; liberty likely by signals of containment or restriction. Particular agent categories may variably convey these triggers due to heuristic expectations for those agents (cf. [37]). It follows that agent categories may influence social evaluations. For instance, a machine may be trusted as more fair than humans given its systematic and analytic nature [37], while a human may be trusted as more caring than robots given heuristics for warmth [57]. This potential raises the question of whether foundation-specific behaviors may contribute to differential social evaluations of robots and humans. In particular, the present study considers three evaluations—mental status, moral status, and trust—that may be associated in social cognitions (see [58]).

Perceived mental capacity, or mindedness in others, is implicitly and explicitly experienced and expressed. Implicit signals of mental-state ascription may be found in behavioral evidence as people preconsciously react to social cues [59], and indirect indications may include the rejection of agency (i.e., seeing machines as dependent upon programming or design; [2], cf. [57]). Humans infer mental states of robots as they do in humans so long as the robots’ social cues are similar [60] via “social attunement” [61]. More direct mind ascription (i.e., willful acknowledgement of agent mindedness) is distinct and often divergent from preconscious mentalizing, likely because it requires elaborative processing that invokes agent-category heuristics [60].

Moral status is not morally valenced—status is not dependent on inherent goodness or badness. Rather, it is perception of agents as having moral capacity and individual agency: the capacity to be and do good or bad [2]. Perceived moral status may be considered a form of social cognition that justifies and motivates social regulation [62]. People are more willing to engage in risky, trust-requisite behaviors with a partner after absorbing rich descriptions of that partner’s praiseworthy moral character (compared to negative/neutral characters [63]).

Trust is distinct from but related to perceived mental and moral status: an affective orientation comprising feelings of faith and reliance when facing uncertainty, core to how people both feel connected to others and whether they adopt technologies [64]. Trust emerges when one considers an agent’s behavior as appropriate in comparison to society’s moral norms [65]. Robot-performance factors (e.g., reliability, failure rates) are more impactful to trust in robots than are human or environmental factors [66], so it is useful to understand whether discrete foundation-related behaviors may differentially contribute to trust in social machines. Thus, this investigation explores: (RQ2) (How) do moral foundations discretely contribute to social evaluations of agents’ (a) mental capacities, (b) moral capacities, and (c) trust?

2.4 Research Approach

A two-study approach was adopted. In Study 1, an online survey captured responses to humans and robots delivering upholding and violating answers to foundation-specific moral dilemmas. Because people exhibit different responses to robots in media representations compared to live interactions [67], Study 2 was conducted in tandem, adapting and replicating the procedure with a convenience sample of individuals who experienced the moral-dilemma responses directly from a physically co-present robot. All survey, stimulus, procedure, data, and analysis files are available in this project’s supplementary materials: https://osf.io/y6d79/.

3 Study 1: Behavior and Agent Evaluations in Observed/Mediated Interactions

Participants (N = 402) were recruited via Qualtrics Panels, garnering a sample that was approximately representative of the United States [68] by age, sex, and political ideology (the latter corresponding with moral-foundation valuations [69]). Participants were 51.2% female and 48.8% male (none identifying as nonbinary), aged M = 46.57 years (SD = 17.19, range 18–90); 25.6% identified as liberal, 39.1% as moderate, and 35.3% as conservative.

3.1 Method

3.1.1 Procedures

Participants were randomly assigned to one of four conditions in a 2 (agent: human/robot) × 2 (valence: upholding/violation) between-subjects design. They first completed quota-sampling demographic questions and an audiovisual check to verify audibility and visibility of videos embedded in the survey. Those not correctly recounting simple aural/visual details from the video were removed and replaced (as they were either not paying attention, could not see and hear the stimulus videos, or were bots). Participants then responded to pre-stimulus questions regarding moral values and agent-category attitudes. They were then introduced in-text and by an off-screen narrator to the concept of a “moral dilemma” as “challenging decisions between two potentially right answers” and told they would see videos in which a robot or human would respond to such dilemmas. The assigned agent was introduced by name, agent category, and height/abilities, with no other narrative to avoid historical/social context that could confound scenario interpretations. Participants were randomly assigned to view either all foundation-upholding behaviors or all foundation-violating behaviors. The survey platform then presented (in random order) seven “moral dilemmas” (one each for the six moral foundations plus one non-moral norm) as a within-subjects treatment; the dilemmas were similarly read to the agent by an off-screen narrator. Video-presentation pages were timed to prevent passing over videos quickly. Each video was presented on a separate page, along with corresponding agent-evaluation questions. Following the seven stimuli, participants responded to items capturing social evaluations of the agent. Participants were paid by the panel service for their participation.

3.1.2 Stimulus videos

The stimulus robot was Robothespian (Engineered Arts, U.K.), equipped with white body shells, under-shell lighting, and the Socibot head using the Pris face and the Heather American-English voice. The robot was named “Ray” and addressed by name throughout the survey. The stimulus human was a young-adult Caucasian female, also named Ray. The human confederate recited responses in a tone and pace similar to the robot’s, but with some vocal inflection and slight hand gesturing to be believable as a novel response from the human. Videos of each agent were approximately equivalent in length, volume, and framing, and the robot’s pre-scripted behaviors were approximately aligned with the human confederate’s exhibited autonomy and social responsiveness. Lighting differences required for visibility of both the robot’s body and face were necessary inconsistencies (Fig. 1).

Fig. 1 Human and robot stimulus agents in Study 1 stimulus videos

Each of the stimulus videos depicted one of the six moral-foundation dilemmas or one non-moral dilemma; dilemmas and responses were designed by cross-referencing validated mini-vignettes [70, 71] and then adapted for face-valid interactions with both agents. The preliminary scripts were reviewed by two experts specializing in moral psychology in communication scenarios and, based on feedback, were adjusted to minimize conflation of foundations. Dilemma prompts were presented to agents via voice-over (without displaying the reader, to avoid introducing a visible second actor) and agents gave scripted responses. Responses included a clear statement of the agent’s likely action and a rationale containing a foundation-specific upholding/violation trigger; upholding and violation responses were parallel in length and syntax (summaries in Table 1; complete transcripts in supplements).

Table 1 Summary of domain-specific moral dilemma prompts and agent responses

3.1.3 Measures

All measures were presented as 7-point Likert-style or semantic-differential scales unless otherwise indicated; see supplements for complete descriptives. Pre-stimulus items captured pre-existing attitudes toward the randomly assigned agent category by giving exemplar images and using the five-item Godspeed likeability subscale [72] (human α = .953, robot α = .905), used here as a control. Moral-foundation valuations were captured using the 24-item Moral Foundations Sacredness Scale [73], an 8-point scale indicating foundation valuations by amount of money required for violation (e.g., $1 million to shoot/kill an endangered species). All dimensions met benchmarks for internal consistency: care α = .905, fairness α = .816, authority α = .785, loyalty α = .828, purity α = .770.
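
For readers who wish to verify internal-consistency values of this kind, the following is a minimal sketch for computing Cronbach's alpha from item-level responses. It is illustrative only, not the authors' analysis code; it assumes a pandas DataFrame with one column per scale item, and all column names are hypothetical.

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for one scale (rows = respondents, columns = items)."""
    items = items.dropna()
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()   # sum of individual item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of the summed scale
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

# Hypothetical usage: five Godspeed likeability items stored as like_1 ... like_5
# alpha_likeability = cronbach_alpha(df[[f"like_{i}" for i in range(1, 6)]])
```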

Following each video, participants first chose which foundation the video was most related to (multiple choice among six or none, as a within-subjects manipulation check); they then made moral judgments (bad to good) and blame judgments (not at all to entirely responsible) for depicted behaviors. After the videos, agent evaluations were captured in their indirect and direct forms (i.e., capturing implicit and explicit indicators). Implicit moral and mental capacity were measured using the six-item moral capacity (α = .950) and four-item dependency (i.e., non-mindedness; α = .731) dimensions of the Perceived Moral Agency Scale [2]. Explicit moral capacity was evaluated via a binary-response (no/yes) question: “Is Ray capable of morality or immorality?” Trust was evaluated using the 16-item Multidimensional Measure of Trust [11] with a two-dimensional structure [74]: reliability/capability (α = .931) and ethicalness/sincerity (α = .949). Explicit trust was captured via a binary response (no/yes) question: “Do you trust Ray?”

3.2 Results

Participant assignment of moral foundations to video scenarios was evaluated as a manipulation check. Foundation-specific events are known to elicit differing moral emotions due to heterogeneous evaluations [75], and foundation valuations may prime specific interpretations (e.g., escaping from jail may be perceived as a liberty upholding or an authority violation). Scenarios were therefore judged adequate representations of their foundations if a majority of participants assigned them to the expected foundation or to “none” (indicating no crossover to other domains) rather than to other foundations. All videos passed this check: care 80.4%; fairness 74.4%; loyalty 64.2%; authority 80.4%; purity 70.9%; liberty 63.9%; nonmoral [“none” only] 53.2%.
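
To make this adequacy criterion concrete, a short sketch follows; it assumes a long-format DataFrame of participant foundation assignments with hypothetical scenario and chosen_foundation columns, and is offered as an illustration rather than the authors' procedure.

```python
import pandas as pd

def check_scenario(responses: pd.DataFrame, scenario: str, expected: str) -> dict:
    """Apply the adequacy criterion for one dilemma video: a majority of
    participants must assign it to the expected foundation or to 'none'
    rather than to other foundations."""
    choices = responses.loc[responses["scenario"] == scenario, "chosen_foundation"]
    share = choices.isin([expected, "none"]).mean()
    return {
        "scenario": scenario,
        "share_expected_or_none": round(share, 3),
        "modal_choice": choices.mode().iat[0],
        "passes": share > 0.5,
    }

# Hypothetical usage: check_scenario(study1_responses, "care", "care")
```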

3.2.1 RQ1: Domain-Specific Moral and Blame Judgments

To address RQ1 (whether evaluations of agent-behavior goodness and responsibility vary by moral foundation), MANCOVAs were conducted individually for each moral foundation: alpha levels were Bonferroni-corrected to p ≤ .008 to account for multiple tests; experimental conditions were independent variables; the corresponding moral-foundation sacredness and existing agent attitudes were covariates; and behavior goodness/responsibility ratings were dependent variables. Multivariate and univariate test values are presented in Table 2, goodness means in Table 3, and complete descriptives in supplements.
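
As an illustration of this analytic setup (not the authors' exact code), the sketch below runs one foundation's MANCOVA using statsmodels; the variable names (goodness, responsibility, valence, agent, liking, sacredness) and the grouping column are hypothetical placeholders.

```python
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

ALPHA = 0.008  # Bonferroni-corrected threshold reported above

def foundation_mancova(sub: pd.DataFrame):
    """2 (valence) x 2 (agent) MANCOVA for one foundation, with goodness and
    responsibility ratings as dependent variables and agent-category liking
    and foundation sacredness as covariates."""
    model = MANOVA.from_formula(
        "goodness + responsibility ~ C(valence) * C(agent) + liking + sacredness",
        data=sub,
    )
    return model.mv_test()  # multivariate tests (Wilks' lambda, Pillai's trace, ...)

# Hypothetical usage: one MANCOVA per foundation, judged against ALPHA
# for foundation, sub in ratings.groupby("foundation"):
#     print(foundation, foundation_mancova(sub))
```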

Table 2 MANCOVA multivariate and univariate tests for foundation-specific goodness and responsibility ratings by moral valence, agent type, and valence/agent interaction (controlling for agent-category liking and moral foundation sacredness)
Table 3 Means and standard deviations for moral foundation goodness ratings across moral valence of behavior and agent type

Behaviors’ moral valence had a main effect on goodness ratings: foundation upholding was rated as more good than violating, across all foundations and the nonmoral norm. Additionally, there was a main effect of behavior valence on responsibility ratings for fairness and purity, a main effect of agent type on goodness and responsibility ratings for liberty, and an interaction effect for care; however, the effect sizes for those associations were negligible. Addressing RQ1 directly, bad behavior was seen as bad behavior (irrespective of the kind of agent performing it), and this pattern persisted across the entire moral matrix.

3.2.2 RQ2: Moral Foundation Contributions to Social Evaluations

To address RQ2 (whether moral foundations individually contribute to evaluations of agent mind, morality, and trust), the planned analysis was to conduct linear regressions separately for each mind, morality, and trust dependent variable. However, these variables were moderately to highly correlated (r range .413–.913). Thus, canonical correlation analysis [76] was performed (separately for each agent type), with implicit and explicit mind, morality, and trust measures entered in one variable set and foundation goodness and responsibility ratings entered in the second set. Results are summarized here; see supplements for complete outputs.
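
A sketch of this canonical-correlation setup is given below, using scikit-learn's iterative CCA as a stand-in for whatever package was actually used; the two data frames and their column names are hypothetical, the evaluation set is assumed to contain at least six measures so that six functions can be extracted, and the Wilks' lambda significance tests reported in the text are not reproduced here.

```python
import numpy as np
import pandas as pd
from sklearn.cross_decomposition import CCA

def canonical_solution(evals: pd.DataFrame, ratings: pd.DataFrame, n_functions: int = 6):
    """Canonical correlations and structure coefficients between an agent-evaluation
    set (mind, morality, trust measures) and a behavior-rating set (foundation
    goodness/responsibility ratings)."""
    cca = CCA(n_components=n_functions, scale=True)
    Xc, Yc = cca.fit_transform(evals, ratings)
    canonical_r = [np.corrcoef(Xc[:, i], Yc[:, i])[0, 1] for i in range(n_functions)]
    # Structure coefficients: correlations of each observed variable with its own
    # set's canonical variates (interpreted in the text at >= |.45|).
    struct_evals = pd.DataFrame(
        [[np.corrcoef(evals.iloc[:, j], Xc[:, i])[0, 1] for i in range(n_functions)]
         for j in range(evals.shape[1])],
        index=evals.columns,
    )
    struct_ratings = pd.DataFrame(
        [[np.corrcoef(ratings.iloc[:, j], Yc[:, i])[0, 1] for i in range(n_functions)]
         for j in range(ratings.shape[1])],
        index=ratings.columns,
    )
    return canonical_r, struct_evals, struct_ratings
```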

For humans, the multivariate model was significant, Wilks’ λ = .107, F(84, 1054.23) = 6.201, p < .001, explaining 89.3% of variance shared between variable sets. Analysis indicated six latent functions, two of which significantly explained variance in the model at p < .001 (R²c = 83.50) and p = .044 (R²c = 13.50), respectively. Structure coefficients ≥ |.45| were interpreted [76], except where a set’s largest coefficient had a smaller value, in which case the largest coefficient was interpreted. Function 1 indicated that when people interact with a human, reduced goodness ratings of agent behaviors comprehensively contributed to the reduction of nearly all scores for that human’s mind, morality, and trust (save explicit moral status). Function 2 indicated that goodness and responsibility for nonmoral action (but not any im/moral action) were associated with reduced likelihood to trust the agent (Table 4).

For robots, the multivariate model was significant, Wilks’ λ = .131, F(84, 976.2) = 5.118, p < .001, explaining 86.9% of variance shared between sets. Analysis revealed six latent functions, two of which were significant at p ≤ .001 (R²c = 78.16) and p = .012 (R²c = 22.09), respectively. Function 1 indicated that (similar to humans) when people interact with a robot, goodness ratings of agent behaviors were comprehensively associated with corresponding changes in all scores for that robot’s mind, morality, and trust. Function 2 indicated that seeing a robot’s nonmoral behavior as good was associated with an increase in reliability/capability trust (Table 4).

Table 4 Canonical solutions for agent evaluations predicting morality ratings in observed interactions

4 Study 2: Behavior and Agent Evaluations in Participatory/Live Interactions

In this tandem replication of Study 1, a convenience sample of individuals—residents of a southwestern U.S. city (N = 92)—were recruited via social-media, mailing-list, and community-board announcements. Announcements invited participation in a one-hour study on “morality of robots and humans” and offered entry into a drawing for a $150 gift card. Participants were 51.1% female, 45.7% male, 3.3% nonbinary, aged M = 41.60 years (SD = 15.57, range 18–76). They self-identified as 77.2% white, 15.2% Hispanic, and 5.5% other or mixed races. On a 1–7 liberal-to-conservative scale, political ideology averaged 3.80 (SD = 1.84). All materials for this study are available in the supplements.

4.1 Method

4.1.1 Procedure

Participants completed an online survey to measure demographics, agent attitudes, and moral-foundation valuations. They were then redirected to an online system to schedule a lab session, at which point they were purposively assigned to one of two agent conditions (human/robot). The large robot could not be feasibly [de]constructed and moved for each session, preventing randomization; instead, those in earlier sessions interacted with a robot and those in later sessions interacted with a human. Non-random agent assignment is acknowledged as a limitation of this study. Participants were randomly assigned a moral valence for the agent’s behaviors (upholding/violating, between subjects) and random order for the seven interaction prompts (within subjects).

Upon arrival to the lab, participants were greeted and led to the study environment. That large room was divided into segments by a tall room divider. One segment was a receiving area featuring comfortable chairs and a table used for informed consent and instructions; the other (not fully visible upon entry) was the interaction space. The interaction space was laid out with a bistro-style table and two stools (one for the experimenter, one for the participant) and the stimulus agent: either a standing robot or a confederate human seated on a tall stool to approximate the robot’s height. The participant and agent were seated approximately eight feet apart. On the table were seven cardboard-mounted moral-dilemma prompts (identical to prompt language in Study 1) and a clipboard with seven corresponding evaluation sheets (identical to survey questions in Study 1). See Fig. 2.

Fig. 2 Laboratory layout for Study 2 Wizard-of-Oz protocol

The experimenter offered minimal intervention to guide participants through procedures. Participants were first introduced to the agent—by name and agent type only—and given a definition of a moral dilemma (identical to Study 1). They were then asked to move through the seven prompts by (a) reading each prompt to the agent, (b) listening to Ray’s response, (c) reacting if they wished, (d) completing the paper response evaluation form, and (e) moving to the next dilemma until complete. Participants were then ushered back into the first area to complete the web-based post-interaction questionnaire via laptop.

4.1.2 Stimulus Agents and Measures

Stimulus agents and scripted responses were identical to those in Study 1. Because interactions were live, however, the behaviors were executed via Wizard of Oz procedure, with the scripted behaviors executed by a human controller (in a separate room) to maintain believability of the robot’s autonomy and social responsiveness. Additionally, because participants could react to the agent’s response in situ, the agent also improvised, as necessary, using a pre-determined set of responses designed to acknowledge participant reactions without deviating from the condition-specific moral valence (e.g., “I’m not sure. I would have to think about that.”; see supplements for agent scripts). All measures were identical to those used in Study 1.

4.2 Results

Perceptions of moral-dilemma scenarios were again checked for successful manipulation according to the Study 1 criteria. All scenarios passed this check: care 93.1%; fairness 89.0%; loyalty 54.3%; authority 87.3%; purity 69.0%; liberty 65.4%; nonmoral 78.9%. Due to the necessary nonrandom assignment, there was an imbalance in cell sizes between the human (n = 34) and robot (n = 58) conditions; thus, with careful attention to violations of variance-equality assumptions (see supplements for Box’s and Levene’s test values), the conservative Wilks’ lambda was interpreted throughout.
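
The variance-equality screening can be illustrated with a short sketch using SciPy's Levene test, run per dependent variable across the unequal cells; the DataFrame and column names are hypothetical, and Box's M (covariance homogeneity) is omitted because SciPy does not provide it.

```python
import pandas as pd
from scipy import stats

def levene_by_dv(df: pd.DataFrame, dvs: list[str]) -> pd.DataFrame:
    """Levene's test of variance equality across the human (n = 34) and
    robot (n = 58) cells, one test per dependent variable."""
    rows = []
    for dv in dvs:
        human = df.loc[df["agent"] == "human", dv].dropna()
        robot = df.loc[df["agent"] == "robot", dv].dropna()
        w, p = stats.levene(human, robot, center="median")
        rows.append({"dv": dv, "levene_W": w, "p": p})
    return pd.DataFrame(rows)

# Hypothetical usage:
# levene_by_dv(study2, ["care_goodness", "care_responsibility"])
```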

4.2.1 RQ1: Domain-Specific Moral and Blame Judgments

To address RQ1 (whether evaluations of agent-behavior goodness and responsibility vary by moral foundation), MANCOVAs were again conducted according to the same criteria, individually for each domain with associated Bonferroni-corrected significance level of p ≤ .008. Multivariate and univariate test values are presented in Table 5 and means in Table 6.

Table 5 MANCOVA multivariate and univariate tests for foundation-specific goodness and responsibility ratings by moral valence, agent type, and valence/agent interaction (controlling for agent attitudes and moral foundation sacredness)
Table 6 Means and SD for moral foundation goodness and responsibility ratings across moral valence of behavior and agent type

In approximate alignment with Study 1, behavior valence had a main effect on goodness ratings (upholding associated with higher goodness) across nearly all foundations (save for loyalty). Diverging from Study 1, however, analysis indicated a main effect of valence on responsibility ratings for those foundations (higher responsibility attributed to violating than to upholding). Additionally, there was a small valence × agent interaction effect for fairness-foundation goodness: robot behaviors were rated as more good when upholding (M = 6.333, SD = 1.301) and more bad when violating (M = 2.143, SD = 1.557), compared to humans who uphold (M = 4.941, SD = 2.331) and violate (M = 4.375, SD = 2.655).

Summarily addressing RQ1, patterns approximately replicated Study 1 findings: immoral behavior was rated as bad, regardless of agent type (except for the loyalty scenario, in which there was no effect of agent or moral valence). Interestingly, however, in this co-present interaction there was also a main effect of behavior valence on responsibility for the actions, such that upholding behaviors garnered lower responsibility than violating behaviors.

4.2.2 RQ2: Moral Foundation Contributions to Social Evaluations

To again explore RQ2 (whether moral foundations individually contribute to evaluations of agent mind, morality, and trust), canonical correlation analysis was performed separately for each agent type.

For humans, the multivariate model was significant, Wilks’ λ = .001, F(84, 78.83) = 2.243, p < .001, explaining 99.9% of variance shared between variable sets. Analysis showed six canonical functions, only the first of which significantly contributed to the model at p ≤ .001. Function 1 indicates that when a human is believed to have behaved badly across most domains and to be responsible for those actions, that belief is associated with reduced morality and trust evaluations (but with no associated change in mind evaluations). Neither fairness-related nor non-moral norm behaviors were associated with agent evaluations (Table 7).

Table 7 Canonical solution for agent evaluations predicting morality ratings in live interactions

For robots, the multivariate model was significant, Wilks’ λ = .026, F(84, 195) = 2.149, p < .001, explaining 97.4% of variance shared between variable sets. Six functions were identified, two of which significantly contributed to the model at p ≤ .001 and p = .001, respectively. Function 1 indicates that when a robot is thought to have behaved badly across most domains, there is an associated reduction in morality and trust ratings. This is consistent with patterns for humans (including the non-association of fairness and nonmoral behavior), except that responsibility for the action is not a factor. Function 2 indicates that for all moral foundations except loyalty, higher goodness paired with lower responsibility was associated with increases in implicit mind, morality, and trust, but not in explicit moral and mental status ascription.

To again address RQ2, considering co-present and interactive scenarios: for humans, goodness and responsibility ratings of behavior are positively associated with moral status and trust (though not with evaluations of minded agency) for all foundations except fairness, with loyalty again having a smaller impact than other foundations. For robots, two functions emerge in which (1) perceived goodness for all foundation behaviors (except fairness, and without the influence of perceived responsibility) is positively associated with moral status and trust (but not minded agency), and (2) diverging ratings for foundation-specific goodness (high) and responsibility (low) are associated with higher implicit measures of dependency (i.e., low mindedness), trustworthiness, and moral capacity.

5 General Discussion

This investigation reveals both convergent and divergent findings across two studies (summarized in Table 8). (RQ1) People judged agents’ behaviors to be similarly good or bad—regardless of the agent performing them. This pattern persisted across moral foundations, except for a small interaction effect in which robots are assigned more credit/blame than humans when they uphold/violate, respectively. When people interacted with the agent in person, moral valence of behaviors also influenced perceived agent responsibility (save for loyalty): upholding garners lower responsibility while violating garners higher responsibility. (RQ2) Nearly all discrete-foundation evaluations played a role in evaluations of mind, morality, and trust—in live interactions, however, loyalty behaviors had no influence on social evaluations of either agent.

Table 8 Summary of study 1 and study 2 findings

Overall, for both robots and humans and across both observed and live interactions, more negative behavior ratings were comprehensively associated with reduced morality and trust ratings. Of note, though, are some divergent patterns between observed and live interactions. For live interactions with humans, assigned responsibility (more blame for violating and credit for upholding) is combined with perceived goodness to impact morality and trust evaluations. For live interactions with robots, responsibility plays a different role: low responsibility paired with higher goodness promotes stronger implicit mind, morality, and trust.

Broadly, findings are interpreted to suggest that bad behavior is seen as an indicator of a bad actor regardless of the performing agent; perceived badness negatively influences perceived morality and trust, but plays little role in mind perception. For humans, there is a link between behavior responsibility and reduced social evaluations. For robots, responsibility is not a consideration in social evaluations such that they may bear a greater burden to behave morally, regardless of their credit- or blame-worthiness in a situation.

5.1 Moral Judgments Are (Usually) Agent-Agnostic …

The non-impact of manipulated agent-type on behavior evaluations indicates that bad behavior is bad behavior (and good is good), independent of the actor’s ontological class. This finding is in line with past scholarship showing that social/moral cognitions are similar between humans and robots so long as social cues are the same (e.g., [60, 77, 78]); however, it diverges from evidence that people impose different moral norms on robots than on humans [79]. It is possible that moral judgments are more heuristic and that discrete foundations are not of material importance, especially given evidence that once one moral foundation is violated people assume that all foundations will be violated, and that all violations are interpreted as kinds of harm violations [29]. The agent non-specific pattern in the present data is paired with near-absence of mindedness (signaled via low scores in dependency) in the observed models; it may be that because blame judgments integrate information about mental states (see [80]), mindedness is implied in moral action and therefore not explicitly evaluated.

There are a few exceptions to this pattern, related to the moral foundation of fairness, which is understood to be an individualizing moral foundation—one concerned with the rights and freedoms of individual persons, compared to binding foundations that preserve social institutions [20]. It may be that when people are prompted to think abstractly (i.e., to evaluate “goodness”), their core values become more salient and valuations of individualizing foundations are heightened [81], and evaluations of injustice may be even more salient than appraisals of harm [82]. Alternatively, fairness is associated with contemporary moral panics around worker displacement; such displacement was alluded to in the stimulus prompt and linked to potential power differentials between humans and machines that are also present in human–human relations [83].

Importantly, however, robots and humans bear different burdens in accounting for their behavior. Evaluations of humans combined goodness and responsibility (bad behavior and high blame contribute to reduced morality/trust); robots were usually evaluated on their behavior without consideration for their responsibility, except for the link between increased goodness and reduced responsibility toward higher trust. In other words, robots are generally not afforded the potential to be bad without blame—they may only be good without credit. This follows work suggesting that robots must explicitly, transparently, and comprehensively communicate and exhibit their goodness [18] and that some other actor (i.e., a developer or engineer) is a conspicuous-yet-absent driver of a robotic agent’s behavior [84].

5.2 … and Presence May Influence Perceived Moral Agency

Because (a) a main effect of behavior valence on responsibility ratings was exhibited in the live interaction but not in the observed interaction and (b) evaluations of agent mindedness were influenced by behavior evaluations in the observed interaction but not in the live interaction, social presence may play a role in promoting an actor’s perceived moral agency. Regarding the former, it is likely that feeling as though an actor is real and present through delivery of rich social cues [85] fosters an immediacy that renders perceptions of responsible agency salient. It may also be that the social presence inherent to the live interaction increased self-relevance of agent responses. Intimacy with a possible event is ego-centric: the self is the reference point such that the more direct the experience, the more concrete the construal of the event [80, 86]. Regarding the latter, it is possible that non-interactive observations permitted more conscious inferencing of mental status compared to the automatic social cognitions inherent to the live interaction, where immediacy and strong visual/vocal cueing may promote similar mind-attribution for both agents. In other words, viewing both agents via video may have prompted consideration of them as characters (cf. [87]) versus in-person as agents.

Notably, the present studies’ designs do not allow for disentangling whether co-location engendered social presence and/or afforded the opportunity to interact rather than merely observe; future research should tease out these potential influences. It is also prudent to acknowledge that cross-study differences in blame judgments may be a matter of sample differences. Study 1 drew on a U.S.-representative sample with varied demographics; Study 2 leveraged a convenience sample from a community that values rugged individualism and personal responsibility. Finally, because the researcher was co-present during the interaction in Study 2 (due to safety concerns), it is possible that experimenter effects were at play in promoting differences between the mediated and live interactions (either through mere presence or through potential pressure to answer in particular ways); future research should determine the extent to which human mere-presence effects may contribute to differential mental- and moral-capacity evaluations.

5.3 Limitations and Future Research

In addition to the aforementioned directions for future research, the present studies’ designs carry inherent limitations that should be addressed. Participants experienced agent observations or interactions that depicted entirely upholding or entirely violating behaviors, although moral foundations are understood to be variably weighted (and so variably exhibited) by individuals. The stimulus scenarios also contained content that may have been confounded with moral foundations: it is possible, for example, that effects for fairness were actually effects of discussing job retention, or that responses to liberty were a function of the severity of the relatively extreme human-trafficking exemplar. Finally, all moral dilemmas presented the agent or another as the target of the (im)moral behavior, such that people may react differently if the behavior is self-relevant—that is, if they are to benefit or suffer as a result of the behavior—or if some scenario actors are other robots rather than humans. Future research should build on this work by attending to these limitations: designs that consider moral and blame judgment effects on social-moral cognitions through mixed-valence behaviors, content-consistent foundation scenarios, and self-relevant scenarios. In tandem, future work should consider the potential for different robot morphologies to impact social evaluations.

6 Conclusion

The present research suggests that moral judgments of behavior are largely agent-agnostic, but agents bear different burdens with regard to social evaluations: to foster trust and moral status, humans must be seen as performing good behaviors and being responsible for those behaviors, while robots must be good but are not afforded credit for that goodness. Findings suggest the possibility of fostering social integration of robots through exhibitions of human-normative moral behavior: data suggest a link between comprehensively “good” behavior and trust and moral status. Trust fosters social commitments with machines through the perception of positive contributions to human life [88]. Indeed, some perspectives count the subjective experience of robot “heart” as emerging in the ostensible space between technological actualities and human possibilities [89]—that space may be the technical performance of human moral behaviors.