Introduction

Legal psychologists have studied deception from a variety of angles because deception and its detection are relevant in many (if not all) phases of the judicial process (Granhag & Strömwall, 2004). In the wake of disastrous and highly publicized terror attacks in the Western world, questions about reliability and credibility have also attracted the attention of national security experts from various domains (e.g., Ormerod & Dando, 2015). The present work documents our investigation of proximal (i.e., immediate) effects of existential threat on the process of lie detection. Specifically, we hypothesized that the ability to accurately classify true and false messages would be higher in a mortality salience (MS) condition than in a control condition. We present three studies addressing this hypothesis. None of the studies yielded a significant MS effect on lie detection accuracy. However, these null findings should not be overstated. Instead, the present contribution aims to reveal the theoretical and methodological challenges in properly testing proximal MS effects on lie detection accuracy. Thus, this work aims to inform improved future research rather than to provide conclusive evidence for or against the investigated idea.

Research on Lie Detection Accuracy

Although lying and accurately detecting lies have always been important social issues (e.g., Ekman, 1992), people’s ability to discriminate accurately between lies and truths is not particularly well developed. A comprehensive meta-analysis of more than 200 studies revealed that individuals achieved an accuracy rate of about 54% (Bond & DePaulo, 2006; for similar results, see Hartwig & Bond, 2011). Two factors likely contribute to these low accuracy rates. On the one hand, senders leak few actual cues of deception (Hartwig & Bond, 2011). On the other hand, many cues that laypeople consider relevant are of little diagnostic value (e.g., DePaulo et al., 2003; Global Deception Research Team, 2006; O’Sullivan, 2003); most of these falsely believed cues refer to nonverbal behavior. Therefore, a focus on verbal cues, for example on logical structure and plausibility, results in better lie detection accuracy (e.g., Bond & DePaulo, 2006).

Drawing on dual-process models of persuasion that highlight the role of motivation and resources in message processing (e.g., Chen & Chaiken, 1999), recent research on deception detection has shown that factors assumed to be associated with systematic information processing increase classification accuracy, mainly due to an intensified use of content-related verbal cues instead of reliance on more stereotype-based nonverbal cues. Reinhard (2010), for example, provided empirical evidence that Need for Cognition (i.e., the tendency to think carefully about new information) is positively associated with the use of verbal cues (Studies 1 and 2), as well as classification accuracy (Studies 3 and 4). Other works showed that accuracy rates increased when participants were familiar with the situation (Reinhard et al., 2011) or after unconscious thinking (Reinhard et al., 2013). Especially important for the present work is a series of experiments that showed increased classification accuracy for participants who were in a state of negative affect (Reinhard & Schwarz, 2012). All studies cited here have in common that they assumed the factor of interest (i.e., need for cognition, familiarity with the situation, unconscious thinking, negative affect) to be linked to more systematic information processing, leading to an intensified use of more diagnostic verbal instead of nonverbal cues and to an increased accuracy in detecting lies.

Proximal Effects of Existential Threat

Effects of existential threat have been investigated in the frame of Terror Management Theory (TMT; Greenberg et al., 1986), specifically addressing the role of death as a key factor for people’s need for self-esteem and people’s motivation to uphold and fight for their worldviews (Pyszczynski et al., 2015). TMT distinguishes between proximal (i.e., immediate) and distal (i.e., delayed) reactions to being confronted with one’s mortality (Arndt et al., 2002; Pyszczynski et al., 1999), and most research addresses the latter reaction. Distal defenses are assumed to occur when death thoughts are outside of focal attention and refer to unconscious self-esteem and worldview defenses that suppress anxiety by providing a sense of symbolic or literal immortality. Experimental studies of distal defenses typically include one or several distraction tasks after the mortality salience (MS) manipulation to ensure a temporal delay (Burke et al., 2010). Proximal defenses are characterized by logical, rationally comprehensible, threat-focused efforts to push the problem of death into the future or to suppress it entirely, for example, by denying one’s vulnerability or exaggerating one’s health hardiness (e.g., Goldenberg & Arndt, 2008; Greenberg et al., 2000; Pyszczynski et al., 1999). Nevertheless, empirical knowledge about specific proximal reactions remains vague, aligning with the statement of Pyszczynski et al. (2015) that “further research on the role of affect and arousal in MS effects is surely warranted” (p. 20).

To derive possible effects of existential threat on lie detection accuracy, and in line with the existing literature, we identified two potential mechanisms: heightened attentional vigilance and negative affect. We want to make transparent that we initiated this line of research by focusing on attentional vigilance and developed and addressed the idea for negative affect only after the first two studies failed to show any significant effects.

Heightened Attentional Vigilance

Jonas et al. (2014) proposed a general process model of threat and defense, providing a synthesis of different theories on threat. They defined threat as the experience of a discrepancy between a desire or expectation and actual circumstances. By referring to previous research on social exclusion (DeWall et al., 2009; Gardner et al., 2000) and meaning-threats (Proulx & Heine, 2009; Randles et al., 2011), the model posits heightened attentional vigilance as one proximal reaction to the experience of a threat. Heightened attentional vigilance is described as increased selective sensitivity to certain cues that can provide order and structure in a given situation (Jonas et al., 2014). The model predicts that, after a threat is perceived, cues that help counteract it are processed more deeply, implying that threats benefit the systematic processing of certain information (Pittman, 1998). However, little is known about the specific information that is processed more deeply. Regarding lie detection, Eck et al. (2020) found that the experience of social exclusion increases the ability to accurately discriminate between truths and lies. They argued that ostracism fosters the careful processing of affiliation-relevant cues, because being excluded increases the need to form an accurate impression of the environment to enhance the chances of finding appropriate affiliation partners. In addition, findings of two studies provide evidence for the idea that social exclusion makes people better at correctly categorizing a target person’s smile as real or fake (Bernstein et al., 2008; Schindler & Trede, 2021). In line with these works, it seems plausible to assume that threat does not equally increase attention toward all information in the environment but especially toward those stimuli that are related to the threat and relevant for the threatened need.

Following this reasoning, one could assume that after MS, death-related content is given more attention; in particular, messages that deal with death should be processed more systematically and more accurately classified as lies or truths. However, proximal reactions are also claimed to involve “efforts to suppress or distract and distance oneself from identified anxious thoughts and circumstances” (Jonas et al., 2014, p. 230). This would speak against an increased motivation to systematically process death-related content and instead for an increased aversion to dealing with it. Another possibility is that being confronted with potential lies automatically constitutes a potential violation of one of the most important aspects of most people’s worldviews, namely honesty. In other words, in a situation where potential liars should be detected, honesty is threatened. Given that TMT posits MS to lead people to defend and bolster their worldviews, such as honesty (Schindler et al., 2019), it follows that in a situation where potential liars should be detected, MS increases vigilance for deceptive cues independent of the specific content of the lie. From this perspective, MS should increase the ability to correctly classify true and false messages. It should be noted that according to TMT, worldview-related reactions after MS are typically understood as distal reactions, not as proximal reactions. In fact, initial evidence has already shown that MS affects veracity judgments (Schindler & Reinhard, 2015a, b); however, these works referred to distal reactions. Nevertheless, MS-induced vigilance might work as a mechanism that increases lie detection accuracy. We started our investigations by focusing on this perspective.

Negative Affect

In classical TMT research, no effect of MS on negative affect is documented. However, recent work that specifically addressed the role of affect in TMT research has questioned this affect-free claim. In a series of different experiments, Lambert et al. (2014) showed that participants reported increased negative affect (and especially subjectively experienced fear) as an immediate response to MS. Lambert et al. argued that the majority of TMT studies compared MS with aversive control conditions (e.g., dental pain), so that a fair test for changes in experienced negative affect is not provided. Additionally, they argued that MS studies focused on overall negative affect, whereas affect should be assessed more specifically. In line with this reasoning, Harmon-Jones et al. (2016, Study 3) found that the typical MS manipulation (compared with a non-aversive control condition) immediately led to increased anxiety, sadness, and fear. Taken together, there are good theoretical and empirical arguments for why MS should immediately induce negative affect such as fear or sadness. Given the evidence that negative affect increased classification accuracy (Reinhard & Schwarz, 2012), MS can be assumed to increase classification accuracy of lies and truths.

The Present Research

While several works have focused on distal effects of MS on veracity judgments (Schindler & Reinhard, 2015a, b), no work thus far has addressed proximal effects of MS on veracity judgments, in particular classification accuracy. This work specifically addresses the idea that mortality salience threat increases the ability to accurately classify true and false messages.

Based on the available literature, we identified two potential mechanisms that could improve lie detection accuracy after MS: heightened attentional vigilance and negative affect. Both states can be assumed to lead to more systematic information processing that in turn promotes the ability to successfully distinguish between false and true messages, mainly due to an intensified use of more diagnostic verbal instead of nonverbal cues (e.g., Reinhard, 2010; Reinhard et al., 2011, 2013). At the beginning of this research, we focused on vigilance and addressed this mechanism by including items in Studies 1 and 2 to capture this process. Due to knowledge gained during the research process, in Study 3 we focused only on affect as the central mechanism and included items on the actual affective state; vigilance as a possible mechanism was no longer addressed.

Data and the material for all studies (except the video material) are available on the Open Science Framework (OSF; see: https://osf.io/vmdcg). Study 3 was preregistered (https://aspredicted.org/ea4j6.pdf), while Studies 1 and 2 were not.

Study 1

Method

Subjects and Design

Study 1 was conducted in the lab and was not preregistered. Recruiting took place on campus at a German university. The required sample size was computed using G*Power 3.1 (Faul et al., 2009). Results of a meta-analysis on MS effects (Burke et al., 2010) revealed a medium to large effect size of f = 0.37. However, Yen and Cheng (2013) suggest that the effect size might not be quite that large, especially for those beyond the core group of TMT researchers. Taking this into account, we assumed a more conservative effect size of f = 0.25. The Type I error rate was set at α = .05 and power at 80% (Cohen, 1988). An a priori power analysis for an ANOVA (with fixed effects, omnibus, one-way; number of groups = 2) indicated a required sample size of 128 participants. The initial sample included 138 students. Data from 18 participants were excluded because they reported technical problems with watching the videos during the study. The final sample included 120 students (59.2% female, 39.2% male, 0.8% diverse) aged 18 to 42 (M = 23.13, SD = 4.52). Participants were randomly assigned to one of two between-subjects conditions (MS vs. dental pain control condition).
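The a priori power analysis can be roughly cross-checked without G*Power. The following is a minimal sketch, not the computation used in the study: it relies on a normal approximation for a two-sided comparison of two equal-sized groups (where Cohen's d = 2f), whereas G*Power uses the exact noncentral F distribution. The function name `approx_total_n` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def approx_total_n(f: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate total N for a two-group design (normal approximation)."""
    d = 2 * f  # assumption: two equal-sized groups, so Cohen's d = 2f
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    n_per_group = ceil(2 * ((z_alpha + z_power) / d) ** 2)
    return 2 * n_per_group


# Study 1 inputs as reported: f = 0.25, alpha = .05, power = 80%
study1_n = approx_total_n(f=0.25)
print(study1_n)  # 126 under this approximation; the exact G*Power result reported above is 128
```

The approximation slightly underestimates the exact result because it ignores the heavier tails of the t and F distributions at finite sample sizes.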

Procedure and Measures

All participants were seated in front of a computer and started by reading general instructions that explained the processes of the experiment and requested they put on the enclosed headphones. Next, they received the typical MS (or dental pain control) induction, consisting of two open-ended, short-answer questions. In the MS condition, participants were asked to write about the emotions that the thought of their own death arouses in them (“Please briefly describe the emotions the thought of your own death arouses in you.”) and to jot down what they think would happen to them as they physically die (“Jot down, as specifically as you can, what you think will happen to you as you physically die and once you are physically dead.”). This manipulation has been successfully applied in many experiments in TMT research (Burke et al., 2010). Parallel to previous research, participants in the control condition answered the same questions regarding dental pain (“Please briefly describe the emotions the thought of dental pain arouses in you.”; “Jot down, as specifically as you can, what you think will physically happen to you as you experience dental pain.”).

Next, all participants were instructed to watch ten videos in which ten persons were interrogated and suspected of having stolen 20 Euros from another person’s wallet (see below). Participants were also told that some of the messages were false (i.e., the interviewed person denied the theft but is actually guilty), and that other messages were true (i.e., the interviewed person denied the theft and is actually not guilty). After each video, participants had to judge the truthfulness of the message on a binary scale (false vs. true). Additionally, on a scale ranging from 50% (guessing) to 100% (absolutely sure), they indicated how certain they were about each of their veracity judgments (Reinhard et al., 2013). To increase reliability when measuring lie detection accuracy, participants were randomly assigned to one of two different sets of messages (cf. Reinhard et al., 2011, Experiment 1).
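To make the dependent measures concrete, the sketch below scores one participant's binary veracity judgments against the actual truth status of the messages. It is illustrative only: the function name `score_judgments`, the response pattern, and the five-truths/five-lies split within a set are assumptions for the example, not data or details taken from the study.

```python
def score_judgments(judged_true, actually_true):
    """Return percent judged true and percent correct (overall, truths, lies)."""
    pairs = list(zip(judged_true, actually_true))
    truths = [judged for judged, actual in pairs if actual]
    lies = [judged for judged, actual in pairs if not actual]
    return {
        # Share of messages judged as true (values above 50% suggest a truth bias)
        "judged_true_pct": 100 * sum(judged_true) / len(judged_true),
        # Share of all judgments that match the actual truth status
        "accuracy_pct": 100 * sum(j == a for j, a in pairs) / len(pairs),
        # Accuracy for true messages (judged true) and false messages (judged false)
        "truth_accuracy_pct": 100 * sum(truths) / len(truths),
        "lie_accuracy_pct": 100 * sum(not j for j in lies) / len(lies),
    }


# Hypothetical participant: ten videos, assumed to be five truths followed by five lies
actually_true = [True] * 5 + [False] * 5
judged_true = [True, True, True, True, False, True, True, False, False, True]
scores = score_judgments(judged_true, actually_true)
print(scores)
```

Separating accuracy for true and false messages matters because a participant who judges most messages as true will score well on truths and poorly on lies without discriminating between them.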

We additionally measured several self-report variables. First, we assessed to what extent participants used verbal (“I have used verbal information [such as the conclusiveness of the narrative] for my judgments”) and nonverbal cues for their judgments (“I have used non-verbal information [such as body language, facial expressions, or appearance] for my judgments”), as well as how attentive, vigilant, motivated, and careful they had been during the lie detection task. We further assessed their experience regarding lie detection and their general well-being. Self-reported variables were assessed with one item each. Then, participants were asked if there were any technical problems while watching the videos. Finally, we assessed demographic variables (i.e., sex, age, native language, field of study), asked them to guess the study topic, and provided them the opportunity to give feedback. After participation, they received five Euro or course credit.

Video Material

In this study, we used the true and deceptive messages created by Reinhard et al. (2011, Experiment 1). Twenty male students from a German university were recruited for a study on “communication and small talk” and randomly assigned to the truth or lie condition. In the truth condition, targets were introduced to their game partner (actually a confederate) and were instructed to play Backgammon together (all targets were familiar with the rules). Then, the experimenter left the room. During the game, there were three interruptions: First, the experimenter entered the room and asked if everything was working well; second, the confederate left the room because he received a phone call; third, another confederate entered the room, searched for her wallet, and when she found it claimed that 20 Euros were taken from it. Following that, targets were accompanied to another room where the interview about the theft took place. At that point, they had already received 10 Euros for their participation but were offered an additional 20 Euros if able to convince the interviewer of their innocence. Targets in the lie condition were instructed to take the money from the wallet and to deny the theft during the interview. They did not play Backgammon with a partner. To keep conditions constant, targets in the lie condition were given the same background information so that they were familiar with the event they should describe without having experienced it. During the interview, all targets were asked the same questions (e.g., “You are suspected of having taken 20 Euros from the woman’s wallet. Have you taken the money from the wallet?”). The final video material consisted of 20 videos containing ten actual lies and ten actual truths. The average duration of an interview was approximately 2 min and 30 s. Following Reinhard et al. (2011), we used two sets, each consisting of ten videos.

Results

Messages Judged as True

Participants classified 59.08% (SD = 15.98) of the messages as true. This value is significantly different from 50%, t(119) = 6.23, p < .001, indicating a truth bias. Messages judged as true were not significantly affected by the MS manipulation, F(1, 118) = 0.08, p = .785.
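The truth-bias test can be approximately reproduced from the reported summary statistics alone. The sketch below computes a one-sample t statistic against the 50% chance level; the helper name `one_sample_t` is hypothetical, and because the inputs are the rounded values reported above, the result agrees with the reported t only up to rounding.

```python
from math import sqrt


def one_sample_t(mean, sd, n, mu0=50.0):
    """t statistic and degrees of freedom for a one-sample t-test from summary data."""
    standard_error = sd / sqrt(n)
    return (mean - mu0) / standard_error, n - 1


# Study 1: 59.08% of messages judged as true (SD = 15.98), N = 120
t_value, df = one_sample_t(mean=59.08, sd=15.98, n=120)
print(f"t({df}) = {t_value:.2f}")
```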

Classification Accuracy

The overall accuracy rate was 49.42% (SD = 15.57), which did not significantly differ from 50%, t(119) = -0.41, p = .682. To test our hypothesis, we conducted a 2 × 2 mixed-model ANOVA, using classification accuracy as dependent variable, MS (vs. dental pain) as between-subjects factor, and truth status of messages (true vs. false) as within-participants factor. Given that our hypothesis refers to the ability to accurately discriminate between true and false messages, we expected a main effect of MS but no interaction effect between MS and truth status of message. Classification accuracies for all groups are displayed in Table 1. Results revealed no significant main effect of MS, F(1, 118) = 2.15, p = .145, ηp² = .02. Classification accuracy was not significantly lower than 50% in the MS condition, t(60) = -1.31, p = .194, or in the dental pain control condition, t(58) = 0.76, p = .450.

Table 1 Means and standard deviations of accuracy of lie/truth classifications (in %) as a function of MS in Study 1

Overall, and corresponding to the truth bias, participants correctly classified more true messages as true (M = 58.50, SD = 24.35) than they correctly classified false messages as false (M = 40.33, SD = 20.08), F(1, 119) = 38.76, p < .001, ηp² = .25. The interaction between MS and truth status was not significant, F(1, 118) = 0.08, p = .785, ηp² = .00. The proportion of messages judged as true correlated significantly with classification accuracy, r = 0.19, p = .037. Controlling the ANOVA for video set did not change results in terms of levels of significance.
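The partial eta squared values follow directly from the F statistics and their degrees of freedom, via the standard identity partial eta squared = (F × df_effect) / (F × df_effect + df_error). A short sketch, using the values exactly as reported and a hypothetical helper name:

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Study 1, degrees of freedom as reported in the text
truth_status_effect = partial_eta_squared(38.76, 1, 119)  # within-participants effect
ms_effect = partial_eta_squared(2.15, 1, 118)  # between-subjects MS effect
print(round(truth_status_effect, 2), round(ms_effect, 2))  # 0.25 0.02
```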

Further Analyses

The mean judgment certainty was 75.26% (SD = 8.68). Judgment certainty correlated significantly with classification accuracy, r = 0.25, p = .007, but was not significantly affected by the MS manipulation, F(1, 118) = 0.92, p = .341. Furthermore, MS had no significant effect on the cues used (verbal cues: F[1, 118] = 0.55, p = .462; non-verbal cues: F[1, 118] = 0.00, p = .999) and no significant effect on participants’ self-reported state while watching the videos (attentive: F[1, 118] = 0.68, p = .413; vigilant: F[1, 118] = 0.05, p = .824; motivated: F[1, 118] = 0.00, p = .959; careful: F[1, 118] = 0.11, p = .740). Moreover, classification accuracy was not significantly correlated with any of the self-reported measures (verbal cues: r = 0.03, p = .732; non-verbal cues: r = -0.06, p = .533; attentive: r = 0.02, p = .796; vigilant: r = -0.01, p = .956; motivated: r = 0.10, p = .283; careful: r = -0.01, p = .893; experience: r = -0.00, p = .962; well-being: r = -0.05, p = .578).

Discussion

Study 1 addressed the idea that MS proximally increases ability to correctly classify true and false messages. Results showed no significant effect of MS. We also found no effect of MS on self-reported use of verbal and non-verbal cues and self-reported state while watching the videos (i.e., attentive, vigilant, motivated, careful).

In this study, we included dental pain as a typical control condition in TMT research (e.g., Burke et al., 2010) to allow conclusions about the unique effect of MS beyond other negative experiences. However, the general process model of threat and defense (Jonas et al., 2014) posits that any kind of threat potentially triggers a state of heightened attentional vigilance. In this sense, dental pain can be regarded as threat-related and thus might trigger the same processes, possibly explaining the null finding in this first study. Therefore, we decided to include a non-threat-related topic in Study 2 that has also been used in TMT research, namely watching TV (e.g., Schindler & Reinhard, 2015a, b).

Study 2

Method

Subjects and Design

Study 2 was conducted in the lab and was not preregistered. Recruiting took place on campus at a German university. Based on the a priori power analysis in Study 1, the initial sample included 126 students, but data from 17 participants were excluded because they reported technical problems with watching the videos during the study. The final sample included 109 students (51.4% female, 48.6% male) aged 18 to 46 (M = 23.05, SD = 4.07). Participants were randomly assigned to one of two between-subjects conditions (MS vs. TV control condition).

Procedure and Measures

Procedure and measures were the same as in Study 1, except that participants in the control condition answered the two open-ended questions with regard to their emotions and thoughts about watching TV (“Please briefly describe the emotions the thought of TV arouses in you.”; “Jot down, as specifically as you can, what you think will physically happen to you as you watch TV.”).

Results

Messages Judged as True

Participants classified 58.53% (SD = 14.77) of the messages as true. This value is significantly different from 50%, t(108) = 6.03, p < .001, indicating a truth bias. Messages judged as true were not significantly affected by the MS manipulation, F(1, 107) = 0.00, p = .962.

Classification Accuracy

The overall accuracy rate was 46.88% (SD = 17.20), which did not significantly differ from 50%, t(108) = -1.89, p = .061. To test our hypothesis, we conducted a 2 × 2 mixed-model ANOVA, using classification accuracy as dependent variable, MS (vs. TV) as between-subjects factor, and truth status of messages (true vs. false) as within-participants factor. Classification accuracies of Study 2 for all groups are displayed in Table 2. Although the means were in the predicted direction, results revealed no significant main effect of MS, F(1, 107) = 1.19, p = .277, ηp² = .01. Classification accuracy was not significantly lower than 50% in the MS condition, t(56) = -0.56, p = .578; however, it was significantly lower than 50% in the TV control condition, t(51) = -2.40, p = .020.

Table 2 Means and standard deviations of accuracy of lie/truth classifications (in %) as a function of MS in Study 2

Overall, and corresponding to the truth bias, participants correctly classified more true messages as true (M = 55.41, SD = 23.51) than they correctly classified false messages as false (M = 38.35, SD = 21.80), F(1, 107) = 35.93, p < .001, ηp² = .25. The interaction between MS and truth status was not significant, F(1, 107) = 0.00, p = .962, ηp² = .00. The proportion of messages judged as true did not significantly correlate with classification accuracy, r = 0.08, p = .429. Controlling the ANOVA for video set did not noticeably change results.

Further Analyses

The mean judgment certainty was 75.28% (SD = 8.05). Judgment certainty did not significantly correlate with classification accuracy, r = -0.01, p = .946, and was not significantly affected by the MS manipulation, F(1, 107) = 0.10, p = .752. Furthermore, MS had no significant effect on the used cues (verbal cues: F[1, 107] = 3.90, p = .091; non-verbal cues: F[1, 107] = 0.61, p = .437) and no significant effect on participants’ self-reported state while watching the videos (attentive: F[1, 107] = 0.25, p = .617; vigilant: F[1, 107] = 0.01, p = .922; motivated: F[1, 107] = 1.76, p = .187; careful: F[1, 107] = 0.03, p = .874). Classification accuracy was not significantly correlated with any of the self-reported measures (verbal cues: r = -0.05, p = .602; non-verbal cues: r = 0.13, p = .168; attentive: r = -0.12, p = .208; vigilant: r = -0.14, p = .144; motivated: r = -0.11, p = .270; careful: r = -0.07, p = .456; experience: r = 0.09, p = .337; well-being: r = -0.01, p = .911).

Discussion

Results of this study do not support that MS proximally enhances the ability to accurately classify true and false messages. However, some partial results of Study 2 can be interpreted as initial evidence: The descriptive trend was as predicted in the way that classification accuracy in the MS condition was higher than in the TV control condition. Additionally, accuracy in the TV control condition was significantly below 50%, but this was not the case in the MS condition (note that absolute classification accuracy levels strongly depend on the used material).

No effects were found on self-reported use of verbal and non-verbal cues or on participants’ self-reported state of vigilance while watching the videos, this time in comparison to a non-threat-related control condition. It is debatable whether such explicit measures can capture the assumed, rather subtle process; thus, these results should be interpreted with caution.

Several aspects must be considered when interpreting the null findings of Studies 1 and 2. First, with the present sample sizes, we were able to detect a significant effect of f = 0.26 in Study 1 and an effect of f = 0.27 in Study 2 with sufficient power (80%). In a recently published work, Schindler et al. (2021) suggest that distal MS effects are very small. In this regard, the first two studies are underpowered. Second, another issue refers to the length of the proximal MS effect. To our knowledge, no work has ever systematically addressed the specific length of the distraction phase between the MS manipulation and the distal reaction. However, many distal effects of MS have been found after only one short distraction task. In 47.7% of all studies (Burke et al., 2010), the distraction tasks consisted of the Positive and Negative Affective Schedule (PANAS; Watson et al., 1988) or its expanded form (PANAS-X; Watson & Clark, 1992). Completion of these scales typically does not take longer than a few minutes. Thus, proximal reactions are unlikely to last while watching ten videos (i.e., about 30 min), as in our studies, so measuring veracity judgments likely exceeded the time frame in which proximal reactions can be assumed to occur. Presenting participants shorter videos within a shorter time frame seems more adequate. These issues were addressed in Study 3.
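The sensitivity figures for Studies 1 and 2 can be approximated from the final sample sizes. The sketch below inverts the normal approximation for a two-sided, two-group comparison, which reduces to f ≈ (z for α/2 + z for power) / √N when groups are assumed to be of equal size. This is an assumption-laden cross-check with a hypothetical function name, not the exact noncentral-F computation that a tool such as G*Power performs.

```python
from math import sqrt
from statistics import NormalDist


def approx_detectable_f(total_n, alpha=0.05, power=0.80):
    """Smallest effect size f detectable with the given power (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    # Assumes two equal-sized groups, so that N = ((z_alpha + z_power) / f) ** 2
    return (z_alpha + z_power) / sqrt(total_n)


study1_f = approx_detectable_f(120)  # final N in Study 1
study2_f = approx_detectable_f(109)  # final N in Study 2
print(round(study1_f, 2), round(study2_f, 2))  # 0.26 0.27
```

Both values are far above a small effect of f = 0.10, which is consistent with the concern that the first two studies were underpowered for small effects.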

Because the first two studies probably suffer from low power, we decided to considerably increase the sample size in Study 3. Given previous evidence showing that MS immediately increased negative affect (Harmon-Jones et al., 2016; Lambert et al., 2014), and given that negative affect increased classification accuracy (Reinhard & Schwarz, 2012), we tested negative affect as a possible underlying process for potential MS effects on lie detection accuracy.

Study 3

Based on what we learned from Studies 1 and 2, we applied some methodological modifications in Study 3. First, to reduce the overall time frame of the assessment of our dependent measure, we used different video material with a shorter average length of each video; in addition, we did not measure judgment certainty after each video. In doing so, we increased the probability of measuring proximal rather than distal reactions while participants watched the videos. Second, we massively increased statistical power. Third, we decided to collect data online, because this makes it possible to obtain larger sample sizes than recruiting on campus. Moreover, data collection in the lab is currently impossible due to the ongoing COVID-19 pandemic. One could argue that MS effects are sensitive to context and thus should only be studied in the lab (i.e., a highly controlled setting) to keep error variance low. However, producing preregistered MS effects online is in fact possible when taking measures to ensure data quality (e.g., Schindler et al., 2019; Vail et al., 2019; see also Arias et al., 2020).

Fourth, considering the null findings in Studies 1 and 2, we excluded self-report measures on vigilance as well as on deceptive cues. Given the previous evidence that revealed MS to immediately increase negative affect (Harmon-Jones et al., 2016; Lambert et al., 2014), and given that negative affect increased classification accuracy (Reinhard & Schwarz, 2012), we included two self-report items on the actual affective state.

Fifth, we changed the topic of the control condition to avoid potential threat-related thoughts about the COVID-19 pandemic when thinking about watching TV. In line with the argumentation of Lambert et al. (2014), a non-threat-related control topic is necessary for a fair test of the affect hypothesis. Therefore, we decided to ask participants to recall a situation in which they felt certain (certainty control condition).

Study 3 was preregistered (https://aspredicted.org/ea4j6.pdf).

Method

Subjects and Design

The study was conducted in English in November 2020. Recruiting took place via Amazon Mechanical Turk. We conducted an a priori power analysis using G*Power (Faul et al., 2009). With a power of 90%, a Type I error rate of α = .05, and an assumed small effect size of f = 0.10, the power analysis for ANOVA (fixed effects, omnibus, one-way) indicated a minimum sample size of N = 1054 to detect a significant effect (given there is a true effect). To allow for potential exclusions, we collected data from 1376 individuals.

In line with our preregistration, the survey ended prematurely for participants who (a) did not give their informed consent for participation; (b) indicated they were younger than 18 years old; (c) indicated they were not a native English speaker; (d) indicated they did not currently live in the United States; (e) were unable to play the test video or incorrectly answered the question regarding the test video; or (f) failed the bot check. Of the resulting 1376 participants, we subsequently excluded n = 38 who explicitly indicated that we should not use their data due to a lack of attention and random responding. Further, n = 44 participants were excluded because at the end of the study, they reported technical problems with watching the videos. The final sample included N = 1294 participants (52% female, 47.7% male, 0.3% diverse), ranging in age from 18 to 79 (M = 36.96, SD = 12.26). Most participants indicated having a bachelor’s degree (43.5%), followed by a master’s degree (21.6%), no degree (13.8%), associate degree (8.4%), high school degree (8.3%), professional degree (2.2%), doctoral degree (1.5%), and 0.7% indicated they had achieved less than a high school diploma. Most participants reported being employed full time (60.4%). Participants were randomly assigned to one of two between-subjects conditions (MS vs. certainty control condition).

Procedure and Measures

Participants were instructed to play a test video to check whether their browser played the video and audio files. Then, they were randomly assigned to the typical MS induction or the certainty control induction. Participants in the MS condition answered the same two open-ended questions as described above. Participants in the control condition answered parallel questions about feeling certain (“What emotions does the thought of you being certain about yourself arouse in you?”; “What will happen physically to you as you feel certain about yourself?”).

Next, all participants were instructed to watch eight videos of eight persons telling truths or lies about a person they know. Participants were also told that some of the messages were false (i.e., the interviewed persons talk about other persons they like [dislike] as if they dislike [like] them), while other messages were true (i.e., the interviewed persons talk about other persons they like or dislike according to their true feelings). After each video, participants judged the truthfulness of the message on a binary scale (false vs. true). To increase the reliability of the lie detection accuracy measure, participants were randomly assigned to one of four different sets of messages.

After they had judged all messages, participants rated two items on their current affective state (α = 0.90; Bless & Burger, 2017). The scale ranged from 1 (very bad/sad) to 9 (very good/happy). We then assessed demographic variables (i.e., age, gender, education, employment status) and presented two self-formulated items on the ongoing Corona pandemic. The first item asked whether participants had thought about the Corona pandemic during the study (yes vs. no). The second item asked to what extent participants personally felt threatened by the Corona pandemic in general. The scale ranged from 1 (not at all) to 7 (extreme). Participants then completed the bot check and, finally, indicated whether they had experienced technical problems while watching the videos and whether they had paid enough attention when responding to prompts in the study. After participation, they received $0.30.

Video Material

In this study, we used the true and deceptive messages of the Miami University Deception Detection Database (MU3D; Lloyd et al., 2018), in which targets were instructed to speak either honestly or dishonestly about their social relationships. The original database contains 320 videos of Black and White targets, female and male, telling truths and lies. To avoid triggering in- and out-group processes through the MS manipulation (e.g., Castano et al., 2002), we generated four new sets of eight videos each, in which we included only White targets. To this end, we randomly selected 32 videos from the pool of White targets. Status of message (lie vs. truth), target person’s gender (male vs. female), and valence of the message (positive vs. negative) were fully crossed across the sets. The average duration of a recording was approximately 30 s; thus, watching eight videos took approximately 4 min on average.
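One way to obtain such sets can be sketched as follows. The sketch assumes that each of the four sets contains exactly one video per cell of the 2 (status) × 2 (gender) × 2 (valence) design, which is one reading of "fully crossed across the sets". The video pool, its identifiers, and the function name are hypothetical placeholders, not the actual MU3D file list or the selection script that was used.

```python
import random
from itertools import product

STATUS = ("truth", "lie")
GENDER = ("female", "male")
VALENCE = ("positive", "negative")
N_SETS = 4


def build_sets(pool, rng):
    """Draw N_SETS videos per design cell and give each set one video per cell."""
    sets = [[] for _ in range(N_SETS)]
    for cell in product(STATUS, GENDER, VALENCE):
        candidates = [video for video in pool if video["cell"] == cell]
        chosen = rng.sample(candidates, N_SETS)  # sampling without replacement
        for video_set, video in zip(sets, chosen):
            video_set.append(video)
    return sets


# Hypothetical pool: 10 placeholder videos per cell (not real MU3D entries).
pool = [
    {"id": f"{s}_{g}_{v}_{i}", "cell": (s, g, v)}
    for s, g, v in product(STATUS, GENDER, VALENCE)
    for i in range(10)
]

video_sets = build_sets(pool, random.Random(2020))
for number, video_set in enumerate(video_sets, start=1):
    print(number, [video["id"] for video in video_set])
```

Under this assumption, every participant sees four truths and four lies regardless of the assigned set, so a set-level difference in accuracy cannot stem from a different base rate of lies.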

Results

Messages Judged as True

Participants classified 64.38% (SD = 18.86) of the messages as true. This value was significantly different from 50%, t(1293) = 27.43, p < .001, indicating a truth bias. Messages judged as true were not significantly affected by the MS manipulation, F(1, 1292) = 0.22, p = .566.

Classification Accuracy

The overall accuracy rate was 52.44% (SD = 16.42). This was significantly different from 50%, t(1293) = 5.35, p < .001. To test our hypothesis, we conducted a 2 × 2 mixed ANOVA, with classification accuracy as the dependent variable, MS (vs. certainty) as the between-subjects factor, and truth status of messages (true vs. false) as the within-subjects factor. Classification accuracies for all groups are displayed in Table 3. Results revealed no significant main effect of MS, F(1, 1292) = 2.16, p = .142, η2p = .00. Classification accuracy was significantly higher than 50% in the MS condition, t(652) = 2.68, p = .008; however, this was also the case in the certainty control condition, t(640) = 5.00, p < .001.
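The one-sample tests against the 50% chance level can be retraced from the reported summary statistics alone. The sketch below computes t = (M − 50) / (SD / √N) for the share of messages judged as true and for overall accuracy; because the inputs are rounded values from the text, the results can only be expected to match the reported statistics up to rounding. The function name is illustrative.

```python
from math import sqrt


def one_sample_t(mean: float, sd: float, n: int, mu0: float = 50.0) -> float:
    """t statistic for testing a sample mean against mu0, from summary statistics."""
    standard_error = sd / sqrt(n)
    return (mean - mu0) / standard_error


t_truth_bias = one_sample_t(64.38, 18.86, 1294)  # share of messages judged true
t_accuracy = one_sample_t(52.44, 16.42, 1294)    # overall classification accuracy
print(round(t_truth_bias, 2), round(t_accuracy, 2))
```

Both values agree with the reported t(1293) = 27.43 and t(1293) = 5.35 to two decimals.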

Table 3 Means and standard deviations of accuracy of lie/truth classifications (in %) as a function of MS in Study 3

Overall, and corresponding to the truth bias, participants correctly classified more true messages as true (M = 66.83, SD = 23.89) than they correctly classified false messages as false (M = 38.06, SD = 26.08), F(1, 1293) = 752.62, p < .001, η2p = .37. The interaction between MS and truth status was not significant, F(1, 1292) = 0.33, p = .566, η2p = .00. Messages judged as true and classification accuracy were significantly negatively correlated, r = -0.08, p = .001. Controlling for video set in the ANOVA did not change the results in terms of levels of significance.
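The reported effect sizes can be related to the F statistics through the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the three effects of the mixed ANOVA, taking F values and degrees of freedom exactly as printed in the text; the function and dictionary names are illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error) as reported in the text.
reported_effects = {
    "MS": (2.16, 1, 1292),
    "truth status": (752.62, 1, 1293),
    "MS x truth status": (0.33, 1, 1292),
}

effect_sizes = {
    name: round(partial_eta_squared(*stats), 2)
    for name, stats in reported_effects.items()
}
print(effect_sizes)
```

The recovered values (.00, .37, and .00) match the reported ones, which illustrates how strongly the truth-status effect dominates the two effects involving MS.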

Further Analyses

Participants reported a mean value of 6.66 (SD = 1.56) regarding their current affective state, and this did not significantly differ between the MS condition (M = 6.66, SD = 1.53) and the certainty control condition (M = 6.66, SD = 1.60), F(1, 1292) = 0.01, p = .930. There was no significant correlation between the affective state and classification accuracy (r = -0.03, p = .325), but the affective state was significantly correlated with messages judged as true (r = 0.21, p < .001), indicating that participants judged more messages as true when they were in a positive affective state. As preregistered, we further applied Model 4 of the PROCESS macro (Hayes, 2013), using MS as the predictor variable, affective state as the mediator, and classification accuracy as the dependent variable. None of the paths of the mediation model reached significance (all ps ≥ .142), and bootstrapping the indirect effect (based on 5,000 resamples) revealed that the 95% confidence interval included zero [-0.07, 0.09], indicating no mediation effect of the affective state.
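The logic of a bootstrapped indirect effect in a simple mediation model can be sketched as follows. This is a minimal illustration and not the PROCESS macro: it estimates the a path (predictor to mediator) and the b path (mediator to outcome, controlling for the predictor) by ordinary least squares and forms a percentile interval, which may differ from the interval type used in the reported analysis. The data are simulated placeholders generated under a null model, with fewer cases and resamples than in the study to keep the run short.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cross(a, b):
    """Sum of cross-products of deviations from the means."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))


def indirect_effect(x, m, y):
    """a*b from M = i1 + a*X and Y = i2 + c'*X + b*M (ordinary least squares)."""
    s_xx, s_mm = cross(x, x), cross(m, m)
    s_xm, s_xy, s_my = cross(x, m), cross(x, y), cross(m, y)
    a = s_xm / s_xx
    b = (s_my * s_xx - s_xy * s_xm) / (s_mm * s_xx - s_xm ** 2)
    return a * b


def bootstrap_ci(x, m, y, resamples, rng, level=0.95):
    """Percentile bootstrap confidence interval for the indirect effect."""
    n = len(x)
    estimates = []
    for _ in range(resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * resamples)]
    upper = estimates[int((1 + level) / 2 * resamples) - 1]
    return lower, upper


# Simulated placeholder data (NOT the study data): condition is unrelated to
# affect, and affect is unrelated to accuracy.
rng = random.Random(1)
x = [i % 2 for i in range(200)]            # 0 = certainty control, 1 = MS
m = [rng.gauss(6.66, 1.56) for _ in x]     # affective state
y = [rng.gauss(52.44, 16.42) for _ in x]   # classification accuracy in %
print(bootstrap_ci(x, m, y, resamples=1000, rng=rng))
```

An interval that includes zero is read as no evidence for mediation, which is the pattern reported above.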

Discussion

Study 3 provided no support for the idea that MS leads to increased classification accuracy for true and false messages. We massively increased the sample size compared with both prior studies, and we decreased the number of videos participants had to watch in order to approximate the time span in which proximal effects are assumed to occur. While the dependent measure in Studies 1 and 2 was collected over a span of 30 min, the detection task in Study 3 took only about 4 min. In this regard, Study 3 provides a more appropriate test of proximal reactions after MS; however, because there is a lack of reliable findings about the time span in which proximal reactions change to distal ones, it cannot be completely ruled out that distal reactions still played a role here. We further included several response quality screening techniques (i.e., attention checks, bot checks), so overall data quality can be considered adequate (e.g., Schindler et al., 2019; Vail et al., 2019; see also, Arias et al., 2020).

We found no significant MS effect on the affective state. This is in line with classical TMT research that suggests MS effects are affect free (e.g., Pyszczynski et al., 1999, 2015), but it does not align with recently published work questioning the affect-free claim of TMT (Harmon-Jones et al., 2016; Lambert et al., 2014). In this study, to follow the methodological recommendations of Lambert et al. (2014), we included a certainty manipulation as a non-threat-related control condition. However, asking people to think about being certain of themselves might also trigger memories of a time or a situation in which they actually felt uncertain. Thus, this “neutral” control condition might not be completely free of threat, tension, conflict, or negative affect (see also Schindler & Trede, 2021). To check this possibility, we analyzed the content of the answers using the Linguistic Inquiry and Word Count (LIWC; Pennebaker et al., 2015), a computerized text analysis program. LIWC is a natural language processing tool that measures the relative occurrence of words from an embedded dictionary in a given text input. For our analysis, we treated negative emotion as the most important LIWC dimension. In addition, we examined positive emotion, certainty, and death as further dimensions. For this analysis, we collapsed the answers to the two open-ended questions of the MS manipulation. A LIWC score is the number of words from a specific dictionary (e.g., negative emotion) in an answer divided by the total number of words in that answer. Results indicated that the mean LIWC score for negative emotion was significantly higher in the MS condition (M = 6.01, SD = 4.35) than in the certainty control condition (M = 2.39, SD = 5.40), t(1292) = 9.77, p < .001, d = 0.54. Regarding positive emotion, the mean LIWC score was significantly lower in the MS condition (M = 5.86, SD = 15.60) than in the certainty control condition (M = 19.30, SD = 19.30), t(1292) = 13.50, p < .001, d = 0.75.
Regarding certainty, the mean LIWC score was significantly lower in the MS condition (M = 1.70, SD = 5.25) than in the certainty control condition (M = 6.63, SD = 10.07), t(1292) = 11.07, p < .001, d = 0.62. Regarding death, the mean LIWC score was significantly higher in the MS condition (M = 3.13, SD = 3.61) than in the certainty control condition (M = 0.07, SD = 0.56), t(1292) = 21.27, p < .001, d = 1.18. In sum, these results suggest that at least the content of the answers in the MS condition was clearly and substantially more strongly related to negative emotions, uncertainty, and death than in the certainty control condition. However, this difference was not mirrored in the self-report measure of the affective state, which raises the question of whether the rather subtle MS manipulation was strong enough to induce lasting negative affect. To properly test the affect-free claim, research should include at least three conditions: one group receiving the classical MS induction, one group receiving the aversive dental pain induction, and one group receiving an affect-neutral manipulation.
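The dictionary-based scoring rule described in this paragraph can be illustrated with a small sketch. The word lists below are toy examples invented for illustration; the actual LIWC dictionaries are proprietary, far larger, and also match word stems. The score is scaled to a percentage of all words, which is an assumption based on the magnitude of the reported means. Function and variable names are hypothetical.

```python
import re

# Toy word lists for illustration only; NOT the real LIWC dictionaries.
TOY_DICTIONARY = {
    "negative_emotion": {"afraid", "sad", "fear", "scared", "anxious"},
    "death": {"death", "die", "dead", "dying", "grave"},
}


def liwc_style_score(text: str, category: str) -> float:
    """Percentage of words in `text` that belong to the given category."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0  # avoid division by zero for empty answers
    hits = sum(word in TOY_DICTIONARY[category] for word in words)
    return 100 * hits / len(words)


answer = "I feel sad and afraid when I think about my own death"
negative_score = liwc_style_score(answer, "negative_emotion")
death_score = liwc_style_score(answer, "death")
print(negative_score, death_score)
```

In this 12-word example, two words fall into the toy negative emotion list and one into the toy death list. Because the score is a proportion of the answer length, short answers can produce large and highly variable scores, which may contribute to the large standard deviations reported above.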

Because data for Study 3 were collected in November 2020 during the ongoing Corona pandemic, there may have been problematic context effects, potentially compromising the non-death-related control condition. However, the above-mentioned results of the LIWC analysis on the death dimension suggest otherwise. Moreover, we argue that particularly during a time of crisis like this, thinking about death (vs. a control topic) might produce even stronger effects than in normal times, when people are safe and can easily cope with a threat like MS, especially given that the MS manipulation can be seen as a rather subtle threat induction (cf. Pyszczynski et al., 2015). We further argue that in times of crisis, people are probably more sensitive to death and mortality, so that reactions to the subtle MS manipulation can be triggered more easily. Even after the pandemic (whenever that may be), people will be confronted in the news with threatening information about death, war, and violence. From this perspective, we can never be sure that a control group is truly non-threatening.

General Discussion

The present studies addressed beneficial proximal effects of MS on the ability to correctly classify true and false messages. We reasoned that MS could trigger (a) a state of heightened attentional vigilance and/or (b) a state of negative affect, with both states leading to more systematic information processing and thus an increase in classification accuracy. In all three studies, the effect of MS on lie detection accuracy was not significant. Regarding the assumed underlying processes, in Studies 1 and 2, we asked participants to indicate their level of vigilance and the extent to which they relied on verbal and non-verbal cues when classifying the messages. No significant MS effects occurred on any of these self-report measures. In Study 3, we then focused on negative affect as a potential explanation, but again, no significant MS effect was found on the included items on the affective state.

The present findings clearly do not provide any evidence for the proposed idea. However, with this work, we want to document the research process and the challenges in addressing this idea rather than provide conclusive evidence. So, what can be learned from this research? In the following we discuss methodological as well as theoretical aspects.

Methodological Aspects

Statistical Power

The most comprehensive meta-analysis to date yielded a moderate to strong effect of MS (f = 0.37; Burke et al., 2010). Although we based our power analysis for Studies 1 and 2 on a more conservative effect (f = 0.26) taken from recent work (Schindler et al., 2021), this effect size is still likely to be overestimated. Studies 1 and 2 were therefore underpowered regarding small effects, so the null findings have little evidential value. In Study 3, statistical power was high enough to detect even small effects (f = 0.10). Still, no significant effect of MS on lie detection accuracy was found. As the classical MS manipulation is seen as a rather subtle threat induction (cf. Pyszczynski et al., 2015), future research should explore the idea by using stronger and more clearly validated threat manipulations.
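To make the power argument for Study 3 concrete, the sketch below approximates the achieved power of a two-group comparison at the final sample size. It assumes group sizes of 653 (MS) and 641 (control), inferred from the degrees of freedom of the condition-wise t tests in the Results section, and it uses a normal approximation with d = 2f rather than the noncentral F distribution, so the result is only indicative. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(f: float, n1: int, n2: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power for a two-group mean difference."""
    d = 2 * f  # two-group conversion from Cohen's f to Cohen's d
    noncentrality = d * sqrt(n1 * n2 / (n1 + n2))
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    upper_tail = 1 - normal.cdf(z_crit - noncentrality)
    lower_tail = normal.cdf(-z_crit - noncentrality)
    return upper_tail + lower_tail


power_small = approx_power(0.10, 653, 641)
print(round(power_small, 3))
```

Under these assumptions, the final sample retains roughly 95% power for f = 0.10 even after exclusions, which is consistent with the claim that Study 3 was well powered for small effects.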

Proximal Versus Distal Reactions

We based our prediction on proximal instead of distal reactions after MS. However, little is known with any certainty about the length of the proximal stage. It remains speculative how long the detection task may last for responses to still count as proximal and not distal reactions. In hindsight, the detection task duration of approximately 30 min in Studies 1 and 2 seems too long, given that most TMT studies found distal reactions after only a few minutes. Therefore, we reduced the number of videos in Study 3 (leading to a task duration of approximately 4 min). Future research should systematically investigate the time span in which proximal reactions change to distal ones. Importantly, the present research has no implications for the empirical validity of the MS hypothesis, as this hypothesis explicitly refers to distal reactions.

Attentional Vigilance and Negative Affect

Neither the self-report measures of attentional vigilance (Studies 1 and 2) nor those of negative affect (Study 3) were significantly affected by our MS manipulation. Regarding the mechanism of heightened attentional vigilance, explicit assessments as in Studies 1 and 2 may prove problematic given that such a state may be unconscious and not captured by introspection. We recommend that future research rely on physiological measures (e.g., Klackl & Jonas, 2019).

Analysis of the answers to the two open-ended questions of the MS manipulation in Study 3 revealed stronger negative emotions in the MS condition compared with the certainty control condition. Although this supports the assumption that MS induces negative affect, this effect still might not have been strong enough to actually influence information processing. Further research should thus address and validate the threat manipulation (and its control group) regarding the processes of vigilance and negative affect.

Theoretical Aspects

The idea that existential threat proximally leads to better classification accuracy of potentially false messages is based on several theoretical assumptions. First, MS was assumed to lead to heightened attentional vigilance and negative affect. Second, both states were assumed to lead to more elaborate information processing, in our case the processing of potentially false messages. Third, this should lead to better classification accuracy of the messages. Each of these assumptions stands (more or less) on shaky empirical grounds. It is unclear whether MS as the applied threat is actually linked to a state of attentional vigilance or whether the threat of death predominantly induces the motivation to suppress or distract oneself from the threat. It seems more plausible that threat only increases vigilance towards threat-related stimuli. In this case, the content of the videos we used would be irrelevant to a potentially MS-induced state of heightened vigilance, as none of the videos dealt with threat. While assumptions on the vigilance-based mechanism seem speculative, the affect-based mechanism seems more straightforward. However, in contrast to previous research (but in line with most existing MS studies), we found no effect of MS on self-reported affect; thus, it can be questioned whether MS is an adequate threat induction to properly test an affect-based prediction. While the assumption that negative affect is linked to more elaborate information processing can be (so far) regarded as established (Schwarz, 2012), evidence for a beneficial effect of negative affect on lie detection accuracy was only provided in three studies (Reinhard & Schwarz, 2012) and thus can still be considered preliminary (especially given that lie detection effects are sensitive to the material used; Levine et al., 2022). To further investigate the idea of the present research, we recommend focusing on threat manipulations that can clearly be related to negative affect.
Only with clearly validated materials can research ideas be properly tested. So far, the theoretical elaboration and justification remain speculative.

Conclusion

The present work documents our investigation of proximal effects of existential threat on the process of lie detection. None of the three studies provided any significant MS effect on lie detection accuracy. However, these null findings should not be overstated. Instead, the present contribution aims to reveal the theoretical and methodological challenges in properly testing proximal MS effects on lie detection accuracy. Thus, this work aims to be informative for conducting improved future research rather than to provide conclusive evidence against or in favor of the investigated idea.