1 Introduction

Digital health interventions are increasingly used to persuade people to change health-related behaviors. They can be deployed to a large population at a reduced cost and make it more convenient for users to access health information [17]. However, digital health interventions, such as text-based interventions, lack the positive attributes of interpersonal communication during face-to-face interaction. The use of interpersonal communication skills, such as verbal and non-verbal behaviors, increases the effectiveness of information [39]. The need for interpersonal communication in digital health interventions has motivated researchers to use interactive Virtual Humans (VHs). VHs are computer-generated 3-D characters that look and behave like humans (Fig. 1). VHs use verbal and non-verbal behaviors, such as empathic dialogs [30] and hand gestures [44], for effective communication.

Fig. 1 Examples of virtual humans delivering the intervention using different gestures. The virtual humans differ in gender and race: A: White Male, B: Black Female, C: White Female, D: Black Male

The evidence from prior work suggests that VHs can positively influence health-related user behaviors, such as weight loss [33], increased physical activity [13, 44], etc. To develop such effective VHs in healthcare, VH developers are required to make several design decisions, such as VH’s appearance, role, language, etc. To help VH developers make design decisions, there is a need for evidence-based design guidelines. The design guidelines can help VH developers to design applications that can positively influence user health-related behaviors (Fig. 1).

To recommend design guidelines for VH developers, prior work has studied the influence of visual design on user perceptions of the VH [37, 43]. The user perceptions of the VH, such as likability, friendliness, trustworthiness, etc. are affected by a change in VH’s visual design [43]. In serious task domains, such as medicine, the users perceive visually realistic VHs [43] with role appropriate attire [37] as more suitable for delivering medical information compared to less realistic VHs (e.g., cartoony). The prior work has largely speculated that the user perceptions of VH’s visual design in the medical domain are influenced by users’ mental models of medical professionals [43]. The current research aims to understand users’ mental models that affect their perceptions of a VH’s visual design in healthcare interventions.

To identify the influence of the information medium in the context of healthcare interventions, prior work has compared VHs with the existing standard of text [33, 44, 49]. However, the results of previous studies are inconsistent. A few studies have found no differences between VH and text on user health attitudes [33, 44]. However, there is also evidence that text was more effective than a VH in influencing user health attitudes [49]. The current research aims to further examine the influence of the information medium in the context of promoting colorectal cancer (CRC) screening.

The language used in health communication can affect health outcomes, such as adherence among patients [28]. Therefore, it is important to analyze and improve the language used in the VH communication. To improve the language used in VH communication, a user-centered iterative design process was used [16]. The current research aims to quantitatively analyze how the linguistic characteristics of VH communication improved during the user-centered iterative design process.

The current paper aims to recommend design guidelines by examining the influence of VH’s visual design, the influence of the information medium, and the linguistic characteristics of the VH communication in the context of delivering information related to CRC. CRC is the second leading cause of cancer deaths among men and women in the United States [41]. However, CRC mortality rates can be halved by ensuring patients’ adherence to the current screening guidelines [12]. Adherence to the current screening guidelines can be increased by addressing several barriers, including but not limited to: the lack of doctor recommendation to get screened, the public’s lack of awareness about screening options, the fear of cancer, embarrassment, and stigma [24].

To address CRC screening-related barriers, the research team designed a VH-based intervention called Agent Leveraging Empathy for eXams (ALEX). ALEX uses VHs with verbal and non-verbal behaviors to deliver CRC screening information. The paper discusses the testing of ALEX in two different studies. In study 1, ALEX was initially deployed to users in a focus group setting during the prototype development phase of the intervention design. In study 2, ALEX was tested with users drawn from an online research panel. As a result of these two studies, guidelines were developed for future researchers and developers to use when designing VH-based health interventions. The work presented here is part of a multi-year research effort that aims to use a VH-based intervention to increase CRC screening among underserved patients, including racial and ethnic minorities and those living in rural communities [16].

This work contributes the following to research on VH-based interventions: (1) insights into users’ mental models that affect their perceptions of a VH’s visual design, based on a qualitative analysis of focus group transcripts; (2) an empirical user study comparing the influence of the information medium (animated VH and static VH with text) on user intentions to pursue the health topic further, which provides evidence that an animated VH can be used to positively influence users’ intentions to pursue the health topic further outside of a controlled lab setting; and (3) a linguistic analysis showing that the involvement of community members improved the readability and emotional tone of the language used in VH-based interventions.

In the remainder of the paper, we review related work and describe our system design before presenting two studies, linguistic analysis of VH communication, and a discussion of results from our analyses.

2 Related work

This work builds upon existing research in evidence-based guidelines on the visual design of VHs, motivational VH-based interventions in healthcare, linguistic characteristics of physician communication, and current ways to deliver healthcare interventions for increasing CRC screening rates.

2.1 Visual design guidelines

Several studies have been conducted to understand the effects of a VH’s visual design on user perceptions of the VH [35, 37, 43]. For example, Ring et al. evaluated the effects of a VH’s appearance and application domain on user perceptions of the VH [43]. The authors compared a realistic VH (e.g., human-like) with a less realistic VH (e.g., cartoony) across two domains: social and medical. The VH in the medical condition discussed cancer-related topics, and the VH in the social condition discussed the user’s favorite books and movies. The authors suggest that realistic VHs are more appropriate for a medical domain than less realistic VHs. However, the use of less realistic VHs was recommended for a social domain. The authors speculate that the user’s mental model of healthcare professionals could have influenced user perceptions of the VH in the medical domain. Similar prior work by Parmar et al. studied the effect of a VH’s attire and virtual environment on user perceptions of the VH health counselor [37]. The authors compared different types of VH attire (e.g., white coat, scrubs, formal, casual, etc.) with and without stethoscopes. The authors also compared two different virtual environment settings: an empty room and a full doctor’s office with a bench, medical equipment, and a sink. The authors found that VHs with role-appropriate attire (e.g., white coat and stethoscope) were perceived as trustworthy, appropriate for the job, professional, and persuasive when discussing a medical topic. The authors did not find an effect of the virtual environment on user perceptions of the VH. The authors suggest that the effect of physician attire on patient perception carries over to virtual physicians too. The current paper aims to contribute evidence towards existing literature on visual design guidelines for developing VHs in the healthcare context.

2.2 Virtual humans in healthcare

Several studies have compared a VH-based intervention to the existing standard of text-based intervention to understand the influence of information delivery medium on user health-related attitudes and have found inconsistent results [13, 33, 44, 49].

A few studies have found no difference between VH and text medium on user health-related attitudes [13, 33, 44]. For example, Schulman et al. used a VH with synthesized speech and nonverbal behaviors to increase the user’s positive attitudes towards exercise [44]. The authors were successful in influencing users’ positive attitudes towards exercise using persuasive dialogue with a VH. However, the authors found no significant differences in persuasiveness between the text-based interface and the VH. Similar results were observed in the study conducted by Manuvinakurike et al. to change user attitudes towards weight loss using behavioral change stories from the internet [33]. The authors compared two different mediums of delivering the behavioral change stories: text and animated VH. The animated VH verbally presented the behavioral change stories using synthesized speech and non-verbal behaviors. The authors report that users in the animated VH condition rated the understandability of the stories higher than users in the text condition. However, users reported greater enjoyment and self-identification when they read the stories via text. Also, the authors found no effect of medium on user attitudes towards weight loss. Similarly, Friederichs et al. used an Internet-based intervention to discuss the benefits of being physically active with text or text and a VH image [13]. The authors found that both the text-only intervention and the text intervention with a VH image significantly increased self-reported physical activity after one month compared to users who did not receive the intervention. However, there was no difference in self-reported physical activity between the text-only intervention and the text intervention with a VH image.

In research by Tielman et al., the authors compared two different modalities to deliver a mental health intervention on expressive writing: a VH image with synthesized speech, and text [49]. The authors measured user adherence to the VH’s recommendation of describing a negative memory in detail. Unlike prior work, which found no difference between VH and text-based intervention, the intervention via text resulted in higher adherence to the recommendation than a VH image with synthesized speech.

There also exist comparison studies where the effect of VH and text-based intervention on user health attitude was not measured [4, 30]. For example, in the context of handheld computer agents, Bickmore et al. used a VH interface to deliver health information on exercise and diet promotion using four different output modalities - text only, a static image with text, animated VH, and animated VH with nonverbal speech (e.g., “uh-huh” and “oh”) [4]. The authors found that users rated health information as more credible when delivered via an animated VH. However, the effect of health information on user attitude towards exercise and diet promotion was not measured. In a similar study by Lisetti et al., an Internet-based VH intervention was used to reduce alcohol consumption [30]. The authors found that an empathic VH with nonverbal behaviors improved users’ attitudes towards the VH compared to the text-based intervention. However, the authors did not compare the effect of different intervention modes on the outcome or the change in the user’s attitude towards alcohol consumption.

The current paper aims to contribute evidence towards existing literature on the influence of information delivery medium on user health-related attitudes.

2.3 Linguistic characteristics of healthcare interactions

In the context of healthcare, the linguistic characteristics of physician communication are correlated with health outcomes, such as patient adherence [9]. Linguistic characteristics such as first-person pronouns, positive emotion words, and present tense words are more common in healthcare interactions than plural pronouns, negative emotion words, and past or future tense words, respectively [9]. To understand the linguistic characteristics of physician communication, prior work has used the Linguistic Inquiry and Word Count (LIWC) software [48]. LIWC provides a method to analyze textual content across multiple predefined word categories such as positive and negative emotions, analytical thinking, etc. [48]. LIWC has been extensively used in analyzing word usage between texts across different domains [25, 36]. In the context of physician communication, Falkenstein et al. identified the linguistic characteristics of physician communication using LIWC [9]. The authors analyzed physicians’ conversations with patients seen for a variety of medical reasons, such as colon surgery, breast surgery, etc. The authors analyzed patient conversations with six different physicians across nine dimensions: Total words, Singular first-person pronouns, Plural first-person pronouns, Past tense words, Present tense words, Future tense words, Positive emotion words, Negative emotion words, and Cognitive process words. The authors found that physicians were more liked by patients when they used fewer negative emotion words (e.g., worry, risk, serious). Also, patients were less likely to adhere to physician recommendations when physicians used more singular first-person pronouns (e.g., I, me, mine).
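To make the idea of dictionary-based word counting concrete, the following minimal Python sketch reports the percentage of words in a text that fall into each category. It is not LIWC: the LIWC dictionaries are proprietary and far larger. The category names, the tiny word lists (seeded only with the example words mentioned above), and the function name `category_percentages` are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical toy dictionaries for illustration; these are NOT the LIWC dictionaries.
TOY_CATEGORIES = {
    "first_person_singular": {"i", "me", "mine"},
    "negative_emotion": {"worry", "risk", "serious"},
}


def category_percentages(text, categories=TOY_CATEGORIES):
    """Return (total word count, percent of words falling in each category)."""
    # Lowercase the text and keep alphabetic tokens (apostrophes allowed).
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)

    counts = Counter()
    for word in words:
        for name, vocabulary in categories.items():
            if word in vocabulary:
                counts[name] += 1

    # Express each category as a percentage of all words; 0.0 for empty text.
    percentages = {
        name: (100.0 * counts[name] / total if total else 0.0)
        for name in categories
    }
    return total, percentages


if __name__ == "__main__":
    total, pct = category_percentages("I worry that the risk is serious for me")
    print(total, pct)
```

Reporting percentages rather than raw counts keeps texts of different lengths comparable, which is the general approach this kind of word-category analysis relies on.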

The user comprehension of health-related information is dependent on the readability of the information [26]. To evaluate the readability of the CRC information available online, prior work has used the Flesch-Kincaid Grade Level test [2]. The Flesch-Kincaid Grade Level test provides a score as a United States school-grade level [11]. The Flesch-Kincaid Grade Level test score identifies the school-grade level in which an average student can understand the given text. The US Department of Health and Human Services (USDHHS) categorizes sixth-grade reading level as easy to read, seventh-to-ninth grade reading level as average difficulty, and anything above tenth-grade reading level as difficult [1]. The readability of the CRC information available online is above 10th grade, which is considered as difficult to read [2].
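As an illustration of how such a score is obtained, the sketch below applies the standard Flesch-Kincaid Grade Level formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The syllable counter is a rough vowel-group heuristic, so its scores may differ from those of dedicated readability tools. The function names and the exact cut-offs in `difficulty_band` are assumptions: the bands only loosely follow the USDHHS categories described above, which do not specify how fractional grades between bands should be treated.

```python
import re


def fk_grade_from_counts(n_words, n_sentences, n_syllables):
    """Flesch-Kincaid Grade Level computed from raw counts."""
    return 0.39 * (n_words / n_sentences) + 11.8 * (n_syllables / n_words) - 15.59


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per group of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fk_grade(text):
    """Estimate the grade level of a text using simple tokenization."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade_from_counts(len(words), len(sentences), syllables)


def difficulty_band(grade):
    """Map a grade to a band; the boundary handling is an assumption."""
    if grade < 7:
        return "easy"
    if grade < 10:
        return "average"
    return "difficult"


if __name__ == "__main__":
    sample = "Colorectal cancer screening can find cancer early. Talk to your doctor."
    grade = fk_grade(sample)
    print(round(grade, 2), difficulty_band(grade))
```

Separating `fk_grade_from_counts` from the tokenization makes it easy to swap in a more accurate syllable counter without touching the formula.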

Since VH communication in health interventions is similar to physician communication and can affect health outcomes, it is important to evaluate the linguistic characteristics of VH communication. Therefore, to evaluate VH communication, the current paper uses LIWC and the Flesch-Kincaid Grade Level test.

2.4 Colorectal cancer screening and barriers

Several barriers contribute to lower CRC screening rates. The major barrier that affects CRC screening is the lack of doctor recommendation to be screened [24]. Although doctors are aware of the benefits of recommending CRC screening, time constraints when meeting with patients limit doctors’ ability to recommend all preventive services to patients [24]. Also, previous research shows that people’s lack of awareness about the benefits of screening and the availability of screening options is another significant barrier affecting screening rates [24]. People also avoid CRC screening due to disgust and social stigma [24]. The stigma associated with performing stool or rectal examination during the CRC screening often leads to screening avoidance [40]. Other barriers include fear of cancer, embarrassment, failure to schedule screening tests, difficulty in achieving proper shared decision-making at the point of care, and fatalism among underrepresented minorities.

Several decision aids (e.g., pamphlets, digital media, and one-to-one interventions) have been developed to address barriers related to CRC screening [15]. Conventional online text-based interventions, pamphlets, and video clips can address the barrier of lack of patient awareness about CRC. However, conventional intervention methods cannot address the lack of physician recommendation. In contrast, one-to-one interventions with experts (doctors) can address most of the barriers, but the time constraints on doctors limit the number of people who can receive the intervention.

In our work, the screening barriers addressed are: (1) the lack of physician recommendation, (2) the lack of patient awareness about CRC and CRC screening, and (3) users’ fear and embarrassment of discussing CRC. Our approach to reducing these barriers includes: (1) presenting the intervention as a one-on-one conversation with a “virtual doctor”, (2) informing patients about CRC and screening-related information, and (3) keeping the VH’s dialog with the user non-judgmental [31] while explaining CRC topics.

3 System design

The intervention was created using the Unity3D game engine [6]. To enable access to the intervention on both mobile and personal computers, the intervention was designed as a web application. The web interface consisted of four components: the VH, text captions, a conversation log, and the interaction buttons. The on-screen interaction buttons were used to take user inputs. After testing with focus groups, the conversation log was eliminated. The text captions with the recommended font size [51] were included to assist users with low hearing abilities (refer to Fig. 2a for user interface).

Demographic discordance between a patient and a healthcare provider reduces patient satisfaction with healthcare and can lead to lower adherence rates to health recommendations [47]. Therefore, four versions of the VH were developed so that users could be matched with a VH of the same race and gender: (1) Black Female, (2) Black Male, (3) White Female, and (4) White Male, as shown in Fig. 2b. The VHs were created using Adobe Fuse by the experimenters. The VHs’ voices were recorded by professional voice talent. Voice talent of the same race and gender as the VH was chosen to match appearance and voice identity. VHs did not use non-verbal behaviors during the interaction in study 1. In study 1, the VH was in a sitting pose during the intervention and had an idle breathing animation. After the feedback from users in study 1, non-verbal behaviors were added to the VH in study 2. Non-verbal behaviors were captured using a Vicon Motion Capture System with human actors. A 3D model of a clinical exam room was used as the virtual environment, chosen for its resemblance to local clinical rooms.

The research team designed the content for VH communication in consultation with physicians, communication scientists, and community members. The team included representatives from computer science, oncology, and family medicine. The team also included members of the project’s community advisory board, which comprised patient groups, advocates, and medical professionals. The community members provided valuable feedback on the VH communication initially developed by physicians and communication scientists to reduce the complexity of content for those with low health literacy. The VH communication is based on ten factors communication scientists have demonstrated to impact cancer screening: message source [20], severity [5], risk probability [14], susceptibility [5], framing [10], benefits [7], response efficacy [8], barriers [18], narrative persuasion [22], and self-efficacy [23]. The research team developed the content for VH communication iteratively by reviewing each draft with various stakeholders and incorporating their feedback. These iterations helped improve medical accuracy and user acceptability.

Table 1 Demographic information for two studies

4 Study 1

A series of focus groups was conducted as part of an ongoing larger research effort that aims to use ALEX to increase CRC screening. The focus groups were conducted to aid in the initial development of ALEX before deploying it outside of a lab setting. The process involved users interacting with ALEX, filling out a survey, and then participating in a focus group. To understand user perceptions of ALEX, the current work analyzes user feedback from the initial thirteen focus groups conducted during the prototype testing phase of the development (during Fall 2017).

4.1 Participants

To ensure that all users would be within the recommended age range (50-75 years) for screening [12] during the entire duration of the study, ALEX was tested with potential target users between 50 and 73 years of age. To conduct a demographically targeted intervention, the research team needed users who could be demographically matched to the VHs. Therefore, seventy-three participants (21 males, 51 females, 1 not specified), who self-identified as either White or Black, were recruited to participate in the focus groups (refer to Table 1 for demographic information). The users were recruited through an online program that collects contact information of patients at a local health institution who have agreed to be contacted for qualifying research studies, and through a local senior center. The senior center also served as the location for user testing because it was familiar and readily accessible to the participant pool. The research team also partnered with a university program, which seeks to engage the public in research and healthcare efforts.

4.2 Measures

In our analysis, measures were chosen to align with two CRC barriers: the lack of physician recommendation and the lack of patient awareness (refer to Sect. 2.4 for the description of barriers). A 5-point Likert scale (1 = Strongly disagree to 5 = Strongly agree) was used to measure users’ intention to discuss screening options with a doctor and learn more about CRC. The measures include: (1) This application made me want to discuss colorectal screening options with my doctor. (2) This application made me want to learn more about my risk for colorectal cancer. Other self-reported measures were also collected. For the scope of this paper, the analysis focuses on measures related to the influence of ALEX on users’ intention to discuss CRC options with a doctor and learn more about CRC.

4.3 Procedure

The goal of the focus groups was to understand user perceptions of ALEX and to discuss the home stool testing recommended by ALEX. Because home stool testing might be considered a sensitive topic for conversation, focus groups were conducted separately for male and female participants. Each session was therefore grouped by user gender and had a moderator and co-moderator who matched the gender of the users.

The moderators collected informed consent from the users and explained the procedures. The focus group participants began by filling out the first part of a questionnaire with items measuring attitudes towards CRC and screening and demographics. After users answered these questions, the users interacted with gender-matched ALEX on a mobile phone provided by the research team. During the intervention, users saw ALEX verbally recite the intervention content for approximately 12 minutes. Users provided information to the VH by selecting between multiple-choice responses during the intervention. For example, users responded to “Do you eat red meat?” by selecting from on-screen options, such as “Yes” and “No”. All users were recommended to get screened for CRC using the Fecal Immunochemical Test (FIT). The FIT was demonstrated by an animated video. The FIT was chosen because the test can be carried out at home without the need for professional assistance; is less expensive than other home stool tests; and is usually covered by insurance. The users were recommended to discuss the CRC screening test with their doctor. After completing the interaction, they responded to additional questions regarding their opinion of the intervention and the VH. Then, they engaged in a group discussion about the intervention and screening. The focus group involved discussions about user perceptions of: (1) VH’s appearance, (2) VH’s voice, (3) VH’s animations, (4) information provided in the intervention, (5) accessing the health information online, and (6) overall application. The focus group also involved discussion about the users’ intentions after using the intervention.

4.4 Quantitative results

To understand the influence of ALEX on users’ intention (a) to discuss CRC screening options with a doctor and (b) to learn more about CRC screening options, questionnaire data from focus group participants were analyzed. To analyze the non-parametric Likert scale data, the One-Sample Wilcoxon Signed Rank test from SPSS was used. Since some users chose not to complete some measures, pairwise deletion was used to handle the missing data.
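The analysis described above was run in SPSS. As a rough illustration of the kind of test involved, the following standard-library Python sketch compares a set of Likert ratings against a hypothesized median of 3, using the large-sample normal approximation with a tie correction and no continuity correction. It is a sketch under those assumptions and is not claimed to reproduce SPSS output exactly. Missing responses are represented as `None` and dropped for the measure being analyzed, which mirrors pairwise deletion. The function name, the returned keys, and the example ratings are hypothetical.

```python
import math
from collections import Counter


def wilcoxon_one_sample(ratings, hypothesized_median=3.0):
    """One-sample Wilcoxon signed-rank test (normal approximation)."""
    # Pairwise deletion: use every response available for this measure.
    observed = [r for r in ratings if r is not None]

    # Differences from the hypothesized median; zero differences are dropped.
    diffs = [r - hypothesized_median for r in observed if r != hypothesized_median]
    n = len(diffs)
    if n == 0:
        raise ValueError("all ratings equal the hypothesized median")

    # Rank the absolute differences, giving tied values their average rank.
    abs_sorted = sorted(abs(d) for d in diffs)
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and abs_sorted[j] == abs_sorted[i]:
            j += 1
        rank_of[abs_sorted[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j

    # Test statistic: sum of the ranks belonging to positive differences.
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)

    # Mean and tie-corrected variance of W+ under the null hypothesis.
    mean_w = n * (n + 1) / 4.0
    tie_sizes = Counter(abs(d) for d in diffs).values()
    var_w = n * (n + 1) * (2 * n + 1) / 24.0 - sum(t**3 - t for t in tie_sizes) / 48.0

    z = (w_plus - mean_w) / math.sqrt(var_w)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return {"n": len(observed), "w_plus": w_plus, "z": z, "p": p_two_sided}


if __name__ == "__main__":
    # Hypothetical ratings on the 5-point scale; None marks a skipped item.
    example = [4, 5, 4, None, 3, 2, 5, 4]
    print(wilcoxon_one_sample(example))
```

Because Likert data contain many tied values, the tie correction in the variance matters; omitting it would understate the standardized statistic.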

4.4.1 User intentions to discuss with the doctor

A One-Sample Wilcoxon signed-rank test showed that users reported high intentions to discuss CRC screening options with their doctors (n = 72, median user rating = 4) after going through the intervention, and their scores were significantly higher than a score of 3 (Z = 6.087, p < 0.01).

4.4.2 User intentions to learn more

A One-Sample Wilcoxon signed-rank test showed that users also reported high intentions to learn more about CRC screening options (n = 71, median user rating = 4) after going through the intervention, and their scores were significantly higher than a score of 3 (Z = 5.333, p < 0.01).

In addition to examining self-reported intentions, we also examined the focus group transcripts and found that users also verbalized these intentions in the group discussions (refer to Sect. 4.5.4 for more details).

Fig. 2 (a) VH user interface with text captions for study 2. (b) Upper row: improved high-fidelity VHs used for study 2. Bottom row: VHs used for study 1. Four versions: (1) White Male (A & E), (2) Black Female (B & F), (3) White Female (C & G), (4) Black Male (D & H)

4.5 Qualitative results

The visual design of the VH can affect user perceptions of the intervention [37]. To understand the influence of the visual design of ALEX on user perceptions, a qualitative analysis of the focus group transcripts (n = 13 focus groups with 73 users) was conducted. A thematic analysis was conducted using an open coding method to identify codes and cluster them into common themes. The major themes that emerged centered around trust, user expectations of the VH role, and the influence of the VH’s realism.

4.5.1 Role of organizational branding

Consumer health portals use organizational branding to build trust with users [32]. Likewise, ALEX used the logo of a local healthcare provider to build trust with users. When users were asked in the focus groups what made the information trustworthy, users (in 9 out of 13 focus groups) reported that the association with the local healthcare provider made the information trustworthy. This theme is illustrated by the following quotes: “..., I think we tend to maybe, stupidly believe that it’s [app] gonna be correct, but uh, I have no reason to doubt [name of local healthcare provider].” [P71]. And “Yeah, it’s coming from a credible source, which is [name of local healthcare provider], which we’re all familiar with. Yeah, I think that’s the main reason it’s trustworthy. You know why would they do something that was gonna harm us.” [P72]. This observation was not seen across all demographic groups, and additional research is necessary to explore how best to use organizational branding.

4.5.2 Expectations associated with the VH role

Prior work has shown that introducing agents as a specialist enhances user perceptions of information credibility [27]. Therefore, the VH was introduced as a virtual doctor to increase information credibility. However, introducing the VH as a doctor set user expectations for VH to look and behave like a doctor (in 7 out of 13 focus groups). This theme is evident in the following quotes: “I hadn’t seen many doctors dressed that way. He had his shirt tail out, ...” [P19]. And, “Doctors are professionals. They dress appropriately. Doesn’t matter what the age.” [P12]. When these visual expectations were not met by the VH, it affected users’ perceptions of the VH. Overall, the results indicate the influence of users’ real-world experiences with healthcare professionals on their perceptions of VH.

Table 2 Overview of different study 2 conditions

4.5.3 Influence of VH realism

Users in 11 out of the 13 focus groups commented on the VH’s visual realism. For example, “I thought that the person in the app wasn’t very realistic. It looked plastic.” [P57]. Users also discussed abnormalities in animations. For example, “it was clearly somewhat awkward, herky-jerky.” [P2] and “Mouth movement fit pretty well with what was going on, but it was not as crisp as it could be.” [P71].

Some users compared the VH to characters from commercial games. This comparison is illustrated by the following quote: “...it lacked clarity that modern graphics could provide” [P71]. When asked about initial impressions of the application, most users focused their attention on VH’s visual design rather than the CRC information provided from the intervention.

4.5.4 Information usefulness

When asked what they would do after the intervention, users in 9 out of the 13 focus groups felt that the intervention was helpful and expressed the intention to discuss CRC screening options with a doctor. For example, “Well, I didn’t have any background, this is the first time I’ve heard this. ... I’m familiar with the colonoscopy and the blood test, I’d like to talk to the doctor about these three options [including FIT test].” [P6]. And, “It [app] had a lot of information on it that opened my eyes to some things to ask my doctor.” [P51]. Also, some participants expressed their desire to consult their doctor before making a decision. For example, “I would think that that would be a good recommendation to not rely strictly on the information here but go and talk to your doctor about this as an alternative.” [P5].

4.6 Limitations of study 1

Study 1 has two limitations. (1) The focus groups were held with users who were already enrolled in medical research registries or who responded to an invitation to participate in a focus group regarding medical issues. It is possible that such participants have higher levels of health literacy or are simply more interested in health topics than the general population. (2) Although users reported their intentions to discuss CRC screening options with a doctor, the research team did not conduct a longitudinal follow-up with users to verify whether they were screened for CRC. However, this limitation is addressed in our ongoing clinical study, where the research team is distributing the intervention via an existing healthcare portal. The clinical study is designed to allow users to order screening tests after the intervention.

5 Study 2

Based on the focus group feedback, the design of ALEX was iterated for study 2. To address user concerns with realism, the VHs were redesigned to increase their visual and behavioral realism. Visual realism was improved using high-fidelity characters, as shown in Fig. 2b. Behavioral realism was improved using non-verbal hand gestures captured using a Vicon Motion Capture System with human actors. The VH used pre-recorded voice and animations to deliver the information. To manage user expectations about the VH’s role, the VH was introduced as a virtual healthcare assistant instead of a virtual doctor. The VH attire was changed to match user recommendations and the healthcare assistant role by adding a lab coat and an identification badge.

The analysis for study 2 focused on two different components: (1) understanding the influence of information medium on user intentions to pursue the CRC health topic further and (2) understanding the effect of visual framing on user intentions to pursue the CRC health topic further.

Table 3 Descriptive statistics for different conditions in study 2

5.1 Influence of information medium

Developing a VH-based intervention requires more resources than other media of information delivery, such as text. However, if users are more influenced by information from an animated VH, then the additional investment could be justified. To understand the influence of different media of information delivery, the animated VH condition (Anim-VH) was compared with a condition that presented text with a static image of a VH (StaticVH-Txt). Users in the Anim-VH condition saw an animated VH verbally going through the script developed for the intervention (see Sect. 3 for details on the script development). Users in the StaticVH-Txt condition saw a textual presentation of the same script. Each dialog was accompanied by a static image of a VH with a different hand gesture, as shown in Fig. 2b. In both conditions, on-screen buttons were used to take user inputs.

The research team also included a control condition (CTRL) that contained neither the VH nor the script developed for the intervention. Users in the CTRL condition saw an image of a cartoon computer with a stethoscope alongside different textual content: information about nutrition and cancer prevention based on the American Cancer Society’s guidelines [52]. Table 2 provides an overview of the study conditions.

5.2 Effect of visual framing

A secondary hypothesis examined whether the VH’s visual framing influenced users’ intentions to pursue the CRC health topic further. A VH is a human-like actor delivering information in front of a camera. Visual framing of the VH conveys different information to the viewer based on the content visible in a frame [21]. For example, a medium shot (showing a VH’s upper body, arms, and head) approximates how close someone would be in a real-world conversation. A medium shot can be used to convey a feeling of intimacy and engage with users on a personal level. A medium-long shot (showing the VH from head to knees, along with the surroundings) can instead be used to show the physical setting around the VH and convey contextual information. The information conveyed to users may affect user perceptions [21]. To understand which visual framing is appropriate for our intervention, two additional conditions were added as an exploratory variable: near and far. The near and far conditions were based on medium and medium-long shots, respectively. Users in the near condition saw a VH sitting closer to the camera (showing the VH’s upper body, arms, and head; Fig. 2a), whereas users in the far condition saw a VH sitting farther away from the camera (showing the VH from head to knees, along with the surroundings; Fig. 2b).

5.3 Participants

Based on the recommended age range for getting screened for CRC [12], users (n = 1400, 700 males and 700 females) aged 50-73 years were recruited during Fall 2018. The users were recruited from an online participant pool provided by the Qualtrics online survey platform [42]. The study was conducted online to match the intended real-world use case. The demographic information of users is shown in Table 1.

5.4 Measures

The analysis uses the same measures as study 1, which align with two barriers to CRC screening: the lack of physician recommendation and the lack of patient awareness (see Sect. 2.4 for a description of the barriers). A 5-point Likert scale (1 = Strongly disagree to 5 = Strongly agree) was used to measure users’ intention to discuss screening options with a doctor and to learn more about CRC. The measures were: (1) This application made me want to discuss colorectal screening options with my doctor. (2) This application made me want to learn more about my risk for colorectal cancer.

To understand overall user perceptions of the intervention, the users were also asked an open-ended question at the end—“In the space below, please provide some thoughts on the virtual appointment you had and the questionnaire you filled out.”

5.5 Procedure

Users were randomly assigned to one of four conditions: Anim-VH near, Anim-VH far, StaticVH-Txt, or CTRL. Users in the Anim-VH (both near and far) and StaticVH-Txt conditions were shown a gender-concordant character based on pre-screening questions. In these conditions, users were randomly shown either a race-concordant or a race-discordant character. The study was designed to match the ideal use case of our intervention: people would receive the intervention through an online link outside of a lab setting. During the intervention, users were advised to get screened for CRC using the FIT and to discuss the CRC screening test with their doctor. After the intervention, users completed a post-intervention questionnaire that contained the measures of ALEX’s influence on users’ intentions and an open-ended question about user perceptions of ALEX.
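The assignment procedure can be sketched in code. The following Python sketch is illustrative only: it assumes equal assignment probabilities across the four conditions and a 50/50 split between race-concordant and race-discordant characters, neither of which is stated above, and the function and field names are hypothetical.

```python
import random

CONDITIONS = ["Anim-VH near", "Anim-VH far", "StaticVH-Txt", "CTRL"]


def assign_user(user_gender, rng):
    """Return a hypothetical assignment record for one user."""
    condition = rng.choice(CONDITIONS)  # assumption: equal probabilities
    if condition == "CTRL":
        # The control condition shows no VH, so character attributes do not apply.
        return {"condition": condition, "vh_gender": None, "race_concordant": None}
    return {
        "condition": condition,
        "vh_gender": user_gender,               # gender-concordant character
        "race_concordant": rng.random() < 0.5,  # assumption: 50/50 split
    }


rng = random.Random(2018)  # fixed seed so the sketch is reproducible
assignments = [assign_user(gender, rng) for gender in ["male", "female"] * 50]
```

Passing the random number generator in explicitly keeps the assignment reproducible, which is useful when an assignment log has to be audited later.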

5.6 Results

The main goal of this analysis was to understand the influence of information medium on user intentions to pursue health information further. Because the Likert-scale data are ordinal, the non-parametric Kruskal-Wallis H and Mann-Whitney U tests in SPSS were used. The Qualtrics data from fifty participants were excluded because they completed the survey without going through the full intervention. The descriptive statistics for the different conditions in study 2 are reported in Table 3. The effect of racial concordance or discordance on study outcomes is outside the scope of the current analysis and is not reported.
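For readers who want to see what the Kruskal-Wallis H statistic computes, the following Python sketch implements the textbook formula with average ranks for ties and the standard tie correction. It is not the SPSS procedure used in the study, and the ratings in the example are made-up values, not study data.

```python
from collections import Counter
from itertools import chain


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H for two or more groups of ratings."""
    pooled = list(chain.from_iterable(groups))
    if len(set(pooled)) < 2:
        raise ValueError("H is undefined when all observations are identical")
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted_sum = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_sum += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * weighted_sum - 3 * (n + 1)
    tie_sizes = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    return h / correction


# Hypothetical 5-point ratings for three conditions (not study data).
ratings_a = [4, 5, 3, 4, 5, 4]
ratings_b = [3, 4, 3, 3, 4, 2]
ratings_c = [3, 2, 3, 3, 2, 4]
h = kruskal_wallis_h(ratings_a, ratings_b, ratings_c)
print(f"H = {h:.3f}")
```

The tie correction matters for Likert data, where a 5-point scale guarantees many tied ranks.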

5.6.1 User intentions to learn more

A Kruskal-Wallis H test showed a statistically significant difference in users’ self-rated intentions to learn more about CRC screening options between the Anim-VH, StaticVH-Txt, and CTRL conditions, H = 12.411, p = 0.002. Post-hoc tests (Dunn’s test) carried out on each pair of conditions revealed a significant difference between the Anim-VH condition and the CTRL condition, Z = 3.335, p = 0.003 (adjusted using Bonferroni correction for multiple tests). The effect size for this comparison (d = 0.107) was small, well below Cohen’s convention for a large effect (d = 0.80). The results indicate that users in the Anim-VH condition reported greater intention to learn more about CRC screening options after the intervention than users in the CTRL condition. Differences between the other pairs of conditions were not significant.

5.6.2 User intentions to discuss with the doctor

A Kruskal-Wallis H test showed a statistically significant difference in users’ self-rated intentions to discuss CRC screening options with a doctor between the Anim-VH, StaticVH-Txt, and CTRL conditions, H = 18.052, p < 0.001. Post-hoc tests (Dunn’s test) carried out on each pair of conditions revealed a significant difference between the Anim-VH condition and the CTRL condition, Z = 3.973, p < 0.001 (adjusted using Bonferroni correction for multiple tests). The effect size for this comparison (d = 0.128) was small, well below Cohen’s convention for a large effect (d = 0.80). There was also a significant difference between the StaticVH-Txt and the Anim-VH conditions, Z = -2.505, p = 0.037 (adjusted using Bonferroni correction for multiple tests). The effect size for this comparison (d = 0.074) was very small relative to Cohen’s convention for a large effect (d = 0.80). These results indicate that users in the Anim-VH condition reported greater intention to discuss CRC screening options with a doctor after the intervention than users in the StaticVH-Txt and CTRL conditions.
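The reported p-values can be related to the test statistics with a short calculation. The Python sketch below assumes that H is referred to a chi-square distribution with 2 degrees of freedom (three groups), whose upper-tail probability is exactly exp(-H/2), that each Dunn’s Z is evaluated two-sided against the standard normal distribution, and that the Bonferroni correction multiplies each p-value by the three pairwise comparisons. These are the usual conventions rather than details confirmed by the study’s SPSS output, and the function names are hypothetical.

```python
import math


def kruskal_p_three_groups(h):
    """Upper-tail chi-square probability for df = 2 (three groups)."""
    return math.exp(-h / 2.0)


def bonferroni_p_from_z(z, n_comparisons=3):
    """Two-sided normal p-value for z, multiplied by the number of comparisons."""
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return min(1.0, p * n_comparisons)


# Test statistics as reported in Sects. 5.6.1 and 5.6.2.
for label, h in [("learn more", 12.411), ("discuss with doctor", 18.052)]:
    print(f"{label}: H = {h}, p = {kruskal_p_three_groups(h):.4f}")

pairs = [
    ("learn more, Anim-VH vs CTRL", 3.335),
    ("discuss, Anim-VH vs CTRL", 3.973),
    ("discuss, StaticVH-Txt vs Anim-VH", -2.505),
]
for label, z in pairs:
    print(f"{label}: Z = {z}, adjusted p = {bonferroni_p_from_z(z):.4f}")
```

Under these assumptions the calculation gives values that round to the reported p = 0.002 for H = 12.411 and to the reported adjusted p = 0.003 and p = 0.037 for Z = 3.335 and Z = -2.505.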

To validate our approach of comparing with a score of 3 in the study 1 analysis (see Sect. 4.4), the median score for users’ self-reported intentions in the CTRL condition of study 2 was analyzed. The CTRL condition had a median score of 3 for self-rated intentions to discuss CRC screening options with a doctor. This finding suggests that users who did not go through the VH intervention gave a median rating of 3, which supports comparing the scores of users who went through the VH intervention against a score of 3 in the study 1 analysis.

Table 4 Comparison of LIWC results for VH communication with physician communication

5.6.3 Effect of visual framing on user intentions

The analysis also evaluated the effect of visual framing on user intentions. A Mann-Whitney U test showed no statistically significant difference (p > 0.05) between the near and far conditions in users’ self-rated intentions to learn more about CRC screening options or to discuss CRC screening options with a doctor.

A preliminary analysis was conducted to understand the influence of the design iterations on user perceptions of the VH’s visual design. In the open-ended responses from users assigned to the VH conditions, only 10.8% of the 748 user comments raised negative concerns about the VH’s appearance. Thus, when asked for their thoughts on the virtual interaction, only a minority of users chose to comment negatively on the VH’s appearance.

5.7 Limitations for study 2

The limitations of the analysis include: (1) Users were chosen from a Qualtrics user pool that fit the demographics of our target population. These participants are often hired to complete similar surveys and may be more familiar with the online questionnaire format than general users. (2) The research team did not follow up with users to verify if they discussed CRC screening with a healthcare provider. This limitation is addressed in our ongoing clinical study, where the research team is distributing the intervention via an existing healthcare portal. The clinical study is designed to allow users to order screening tests after the intervention.

6 Linguistic characteristics of the VH communication

In our analysis, VH communication refers to the VH dialogs in the scripted conversation that delivers the intervention. Developing VH communication for health interventions often involves physicians so that health-related information is presented accurately. Therefore, the initial version of the VH communication was developed in collaboration with communication scientists and physicians. The initial version was then refined iteratively with input from physicians, communication scientists, and community members to improve its flow and understandability [16].

Since experts from multiple disciplines and community members were involved in the iterative development of VH communication, the research team wanted to understand the differences in linguistic characteristics of VH communication before and after the user-centered iterative design process. Comparing linguistic characteristics of VH communication before and after the iterative development process can help us quantitatively evaluate the adapted intervention developed using feedback from multi-disciplinary experts and community members. Based on our analysis, VH developers can choose to involve additional members other than physicians, such as community members, to improve different aspects of VH communication. To quantitatively understand the linguistic characteristics of our VH communication, LIWC and readability analyses were performed on two VH communication scripts: (1) before adaptation and (2) after adaptation.

6.1 Method

To understand the linguistic characteristics of VH communication, a linguistic analysis was performed using the Linguistic Inquiry and Word Count (LIWC) software [48]. As explained in Table 4, different LIWC dimensions are linked to outcomes such as trust in messages, adherence to health recommendations, and health information recall. Therefore, the research team wanted to compare and contrast the VH communication with physician communication across different LIWC dimensions. LIWC analysis was conducted across nine dimensions: Total words, Singular first-person pronouns, Plural first-person pronouns, Past tense words, Present tense words, Future tense words, Positive emotion words, Negative emotion words, and Cognitive process words. The chosen dimensions are based on the work of Falkenstein et al., who used LIWC to identify the linguistic characteristics of physician communication [9]. Falkenstein et al. analyzed physician-patient conversations focusing on specific procedures, such as colon surgery and breast surgery. Thus, the physician conversations were similar to the focused conversation facilitated in our intervention.

To evaluate the readability of the language used in VH communication, a readability analysis was conducted using the Flesch-Kincaid Grade Level test. The Flesch-Kincaid Grade Level test was chosen to compare our results with the readability of online CRC information from prior work [2] and recommended guidelines from the USDHHS [1].
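The Flesch-Kincaid Grade Level is computed from average sentence length and average syllables per word. The Python sketch below states the standard formula; it takes the three counts as inputs because the tool used to count words, sentences, and syllables in the study is not specified here, and the counts in the example are hypothetical.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Standard Flesch-Kincaid Grade Level formula."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical counts for a short passage (not taken from the VH script).
grade = flesch_kincaid_grade(total_words=100, total_sentences=10, total_syllables=150)
print(f"Grade level: {grade:.2f}")
```

A lower score corresponds to text that is readable at a lower United States school grade, so shorter sentences and shorter words both reduce the score.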

The VH communication also contained conversational elements that were not part of the intervention content, such as “First, did you receive this invitation in your MyChart account, or are you a caregiver accessing MyChart for someone else?”. Such dialogs were removed from the analysis.

6.2 Results

The results from the LIWC analysis across 9 chosen dimensions were compared with the results of prior work in physician communication [9]. The summary of results from prior work and our results across 9 LIWC dimensions are presented in Table 4.

6.2.1 Similarities and differences in VH and physician communication

Similar to overall patterns observed in physician communication [9], our analyses revealed that VH communications (both before and after adaptation) were generally in the present tense with a positive tone, and the use of cognitive process words was common. For most LIWC categories, the percentages of words in VH communications were within one standard deviation of the physicians’ mean percentages. Large differences were observed in total words and in the use of cognitive process words. Cognitive process words made up a smaller percentage of words in VH communications (11.21% before adaptation and 12.02% after adaptation) than in physician communication (17.57 ± 4.02%). The total number of words was higher in VH communications (1186 before adaptation and 1364 after adaptation) than in physician communication (499.45 ± 378). Also, the percentage of plural first-person pronouns was considerably lower in VH communications (1.01% before adaptation and 0.81% after adaptation) than in physician communication (2.23 ± 1.62%).
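The comparison against physician communication can be made explicit with a simple check of whether each VH value lies within one standard deviation of the physician mean. The Python sketch below uses only the three dimensions whose values are quoted in this section, and the one-standard-deviation criterion is our reading of the comparison rather than a formal test.

```python
def within_one_sd(value, mean, sd):
    """True if value lies in the interval [mean - sd, mean + sd]."""
    return abs(value - mean) <= sd


# Physician mean and standard deviation from prior work [9], as quoted above.
physician = {
    "Cognitive process words (%)": (17.57, 4.02),
    "Total words": (499.45, 378.0),
    "Plural first-person pronouns (%)": (2.23, 1.62),
}
# VH values as (before adaptation, after adaptation), as quoted above.
vh = {
    "Cognitive process words (%)": (11.21, 12.02),
    "Total words": (1186, 1364),
    "Plural first-person pronouns (%)": (1.01, 0.81),
}

flags = {}
for name, (mean, sd) in physician.items():
    before, after = vh[name]
    flags[name] = (within_one_sd(before, mean, sd), within_one_sd(after, mean, sd))
    print(f"{name}: before within 1 SD = {flags[name][0]}, after within 1 SD = {flags[name][1]}")
```

With the values quoted above, cognitive process words and total words fall outside one standard deviation of the physician mean both before and after adaptation, while plural first-person pronouns fall inside it despite being lower than the physician mean.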

The similarity of VH communication to physician communication across the majority of LIWC dimensions could be because physicians were involved in developing the VH communication. The finding does not necessarily mean that VH developers should aim to make VH communication similar to physician communication. Rather, VH developers should use prior work on the linguistic analysis of physician communication to understand how the linguistic characteristics of VH communication may influence health-related outcomes. For example, the usage of plural first-person pronouns was lower in VH communication than in physician communication. Since the use of plural first-person pronouns is known to build rapport in conversations [45], there is potential to increase the use of plural first-person pronouns in VH communication to build rapport.

6.2.2 Effects of the user-centered design process on VH communication

The linguistic analysis highlighted a difference in the use of positive emotion words before and after the adaptation of VH communication. The use of positive emotion words increased by more than one percentage point, from 2.61% before adaptation to 3.67% after adaptation. The increase in positive emotion was also reflected in another LIWC dimension called Emotional Tone. The Emotional Tone score increased from 52.67 before adaptation to 74.86 after adaptation. A high Emotional Tone score is associated with a more positive tone, and a low score reveals greater sadness. A score of around 50 suggests either a lack of emotionality or mixed emotions [38]. In other characteristics, the scripts before and after adaptation were similar to each other.

The Flesch-Kincaid Grade Level score for the VH communication decreased from 8.039 before adaptation to 7.196 after adaptation. The scores indicate that the readability of the VH communication improved after adaptation and that the adapted script could be understood by an average seventh-grade student in the United States. The user-centered iterative development thus led to a more positive emotional tone and improved readability of the VH communication.

7 General discussion

The results from study 1, study 2, and the linguistic analysis of VH communication were used to derive the following design guidelines to inform developers of VH-based health interventions.

7.1 Use organization branding for building trust

To increase the trustworthiness of online health information, consumer health portals use measures like source disclosure, ownership disclosure, third-party seals, and branding to build trust with users [32]. Similar to consumer health portals, ALEX used the logo of a local healthcare provider to build trust with users. As expected, users found information from ALEX credible and trustworthy because it was associated with a local healthcare provider (see Sect. 4.5.1 for results). Based on our observation and the practices followed by consumer health portals, we recommend the use of third-party seals and the branding of a credible source to establish the credibility of the information provided by a VH. Employing branding measures would benefit VH-based interventions that are deployed in the general population outside of a controlled lab setting.

7.2 Manage expectations associated with the VH role

The analysis of focus group transcripts suggests that users expected the virtual doctor to look like a real doctor (see Sect. 4.5.2 for results). The finding aligns with recent prior work [37], where users favored a VH with role-appropriate attire when discussing medical information. Our finding extends the quantitative evidence from the prior work by providing qualitative evidence on why users prefer role-appropriate attire for VHs. Based on our observations, users’ expectations for a virtual doctor to dress in role-appropriate attire originated from their prior experiences of observing doctors in professional attire. Given this observation, we recommend that VH developers understand users’ prior experiences with real-world counterparts to manage expectations associated with the VH role. This understanding can be captured through focus groups and interviews with potential users.

7.3 Use focus groups to identify how to make VH messages personally relevant to users

The focus groups identified how to structure the VH’s message such that users intended to discuss CRC screening with a doctor. Users’ reactions, along with the quantitative results (see Sect. 4.4), highlight that users need to identify the VH’s information as personally relevant for it to influence their behavior.

7.4 Delivering health information as a conversation with an animated VH increased the users’ intention to pursue the health topic further

Users reported a higher intention to discuss CRC screening with a doctor in the Anim-VH condition than in both the StaticVH-Txt and CTRL conditions. Users also reported a higher intention to learn more in the Anim-VH condition than in the CTRL condition. Thus, if the goal of an application is to have users pursue more information on the topic, we recommend using an animated VH.

There are multiple factors, such as animations, audio, and interactivity, that differ between the Anim-VH, StaticVH-Txt, and the CTRL condition. Each individual factor could have influenced the user intentions in our analysis. As suggested by prior literature, content-related animations and voice can focus people’s attention on information [29] and influence user motivation [3]. However, the impact of individual factors cannot be individually accounted for using the current analysis.

7.5 Include feedback from community members during the development of VH communication

The process of iterative development with feedback from multi-disciplinary experts and community members led to a change in the emotional tone of the VH communication. VH communication was more positive in emotional tone after the feedback. The use of positive emotion words is known to increase patient trust in physicians [34], improve recall of information [19], and increase adherence to recommendations [28].

The feedback also improved the readability of VH communication from an eighth-grade level to a seventh-grade level. The VH communication was at an average-difficulty reading level, whereas existing online CRC information is largely at a high-difficulty reading level [2]. Based on our findings, we recommend that VH developers include multi-disciplinary experts and community members, along with physicians, during the development of VH communication.

Further, VH developers can use linguistic analysis during early phases of development to improve VH communication at the level of individual word use that can influence the outcome of the intervention.

8 Conclusion and future work

The design guidelines from the focus group and online study can be used by future researchers who intend to use a VH-based intervention to influence users’ intentions.

The analysis of the focus group transcripts from 73 users yields the following guidelines for the visual design of a VH-based intervention: manage expectations of the VH by testing different VH roles, use organizational branding to build user trust, and use focus groups to identify how messages can be most relevant to users.

The analysis of the post-experience survey of the 1400 online users yields the following guideline: use animated VHs to influence users’ intention to pursue the health topic further. After the design guidelines from the focus groups were implemented, very few users commented on the VH’s appearance when asked for their thoughts about the experience.

The research team has integrated the application in the healthcare system to evaluate the VH-based intervention. The VH-based intervention is currently being evaluated in a randomized clinical trial.