1 Introduction

Assistive technologies (AT) of various kinds and complexities are used for rehabilitation in many fields and settings. As the global geriatric population continues to grow (United Nations and Social Affairs 2019), so do the market size and the need for new and innovative rehabilitation equipment and devices. However, according to Schroeder (2007), health-related outcomes are determined to a greater extent by human behavior than by the number of technical breakthroughs. Thus, user-centered design strategies that explore the differing preferences of a vastly heterogeneous population are as vital as the technical innovations that constitute the methods of delivery.

Virtual reality (VR)-based rehabilitation is a field that has seen increasing interest since 1996 (Keshner et al. 2019), and since the arrival of commercially available and affordable equipment, more rehabilitation interventions are being designed specifically to embed therapy within basic principles of game design, using VR of various degrees of immersion (Lohse et al. 2014; Proffitt and Lange 2015). However, immersive VR brings its own challenges. Most clinical VR applications to date are single-player applications, whether they are developed for motor skills learning (Levac et al. 2019), pain distraction (Hoffman et al. 2000) or simply to offer a change in context for repetitive exercises (Bruun-Pedersen et al. 2016). The geriatric population is the largest group with rehabilitation needs, and older adults are usually identified as a vulnerable group when it comes to loneliness and social isolation (Cacioppo and Cacioppo 2014). Therefore, there may be unintended consequences of using a technology that arguably encapsulates the user in a highly secluded experience and may, conversely, contribute to a subjective experience of loneliness. Yet, while many studies have incorporated social aspects in rehabilitation to motivate higher adherence to training programs, the majority have relied on competitiveness rather than collaboration, cooperation or coaction game modes during sessions using serious or exertion games (exergames). Through several studies, Goršič et al. (2019a) demonstrated how both cooperation and competition may increase physical performance and motivation in rehabilitation programs (Goršič et al. 2017a). Meanwhile, competitiveness in games has been shown to decrease intrinsic motivation (Song et al. 2013), lead to feelings of stress (Goršič et al. 2017a) and increase aggressive behavior (Dolgov et al. 2014; Pereira et al. 2019).
Competition can also be confronting and challenging for people undergoing rehabilitation, who may still be coming to terms with their injury or impairment. The limited evidence for social VR-based rehabilitation (Nguyen et al. 2017), as well as the potential negative consequences of competitiveness in serious games or exergames, supports the need for further research into the effects of collaboration and positive social presence in VR-based rehabilitation applications. In this paper, we present a mixed-methods usability and user experience study, with end-user feedback on a collaborative VR application in which players are encouraged to work together to complete a scenic ride on a tandem bike in VR.

2 Background

While many digital exercise-oriented applications and games have been mediated with non-immersive digital technology, more studies are now embracing immersive VR-systems that use stereoscopic head-mounted displays (HMD) (Tieri et al. 2018). Although systematic reviews of evidence support the use of VR interventions in a range of neurological and orthopedic populations (Lohse et al. 2014; Fluet and Deutsch 2013; Laver et al. 2017; Moreira et al. 2013), very few studies have identified the active ingredients that clearly contribute to the efficacy of VR-based interventions (Levac et al. 2012; Skjæret et al. 2016).

In the context of rehabilitative interventions, outcomes and recovery are often contingent on the patient’s motivation. This can be challenging, since rehabilitation often involves controlled, repetitive and sometimes painful treatments (Maclean et al. 2000; Burdea 2003). Indeed, highly motivated individuals, who recognize the importance of exercise programs that seek to improve function or quality of life (QoL), are much more likely to align themselves with the aims and methods proposed by the therapists (Maclean et al. 2000). Consequently, amotivation has been identified as a challenge within many rehabilitation fields, including pulmonary rehabilitation to treat chronic obstructive pulmonary disease (COPD) (Bourbeau and Bartlett 2008; Salinas et al. 2011) and acute stroke (Maclean et al. 2000). In other domains, amotivation is a challenge when seeking to counteract physical inactivity and sedentary lifestyles (Teixeira et al. 2012).

Many theories on motivation exist, each with roots in contemporary perspectives on human behavior, as well as psychological and physiological needs. While earlier operant theories assumed self-regulated behavior could only be achieved through external reward contingencies (Skinner 1965), later theories, such as self-efficacy theory (SET) (Bandura 1997) and self-determination theory (SDT) (Ryan and Deci 2000), align themselves on the basic premise that humans are agents of their own actions. Furthermore, SDT argues that humans and animals oftentimes engage in activities that have no apparent reward except the intrinsic reward of the activity itself (Deci 1971; Ryan and Deci 2020). Thus, SDT proposes a continuum, ranging from nonself-determined, i.e., amotivation (no intention to act), through a stage of extrinsic motivation, i.e., self-regulation due to external factors (and in some cases internal factors), toward a stage of self-determined, i.e., intrinsic motivation (acting for the perceived interest, enjoyment and inherent satisfaction) (Ryan and Deci 2000). In the context of sports and rehabilitation, a subtheory of SDT known as cognitive evaluation theory (CET) seeks to understand how humans experience motivation related to exercise and explain how external factors (e.g., rewarding stimuli, punishment and competition) affect intrinsic motivation (Ryan and Deci 2000). Moreover, SDT argues that human beings have three fundamental needs: (1) Autonomy, which is the sense of control over the consequences of one’s own behavior, i.e., an internal locus of control. Autonomy is undermined by the feeling of being externally controlled. (2) Competence, which concerns the sense of mastery of an activity that is both optimally challenging and recognized by others. (3) Relatedness, which is the desire to feel a sense of belonging and connection to other people (Ryan and Deci 2020). If any of these basic needs are undermined, it may be detrimental to motivation and wellness. 
It is primarily on the benefits of relatedness that this paper focuses.

2.1 The effect of social presence on self-regulation

The need for interpersonal relationships is a recurring theme in theories on human motivation. Maslow’s hierarchy of needs suggests that humans require loving and affectionate relationships (Maslow 1981), Baumeister and Leary proposed that belongingness is a fundamental human need (Baumeister and Leary 1995) and again, there is Deci and Ryan’s concept of relatedness (Ryan and Deci 2000).

While autonomy and competence arguably focus on the relationship between the individual and a task, relatedness introduces the interpersonal dimension. Older adults, in particular, are sensitive to social isolation and loneliness, which may lead to severe health risks, faster cognitive decline, and increased risk of Alzheimer’s disease (Cacioppo and Cacioppo 2014). Furthermore, the injuries or trauma that require rehabilitation can affect existing social networks, for example following a stroke (Bernhardt et al. 2004). Although this is a known challenge, many inpatients report unmet social needs during hospital admission (Gill et al. 2016). Even if a patient experiences a sense of autonomy and competence, such sensations depend on recognition from others. According to Peiris et al. (2012), this need for relatedness, i.e., to socialize with other patients, therapists or relatives, can instill motivation and may even be more important than the content or amount of therapy. But how does that affect the intended outcome of the therapy?

That performance is regulated by the mere presence of others is well-established, and arguably one of the oldest topics within the field of social psychology (Guerin and Innes 1993). In 1898, Norman Triplett conducted a series of experiments on children doing the same bicycle task, which showed that they were more inclined to perform at a higher intensity, when in the presence of another child coactor as opposed to riding alone (Triplett 1898). The effect, which has since become known as social facilitation (Allport 1924), has been extensively studied by behaviorists and cognitive psychologists, each trying to demystify the effect (Zajonc 1965; Guerin and Innes 1993). However, it has still not been explained what exactly social facilitation is, or what causes it. It is an effect that stipulates that performance is decreased or increased in the presence of another person. It is not a higher-order cognitive process, as it is even observable in mammals and insects (Zajonc 1965), and at its core, social facilitation is the minimal condition for social behavior, i.e., the difference between performing an activity alone, or performing it with, or in the presence of another individual (Guerin and Innes 1993). Furthermore, previous studies have also demonstrated that the effect can increase exercise effort, when in the presence of virtual avatars that offer verbal encouragement (audience effect) (Edwards et al. 2018), as well as when competing against virtual avatars (Anderson-Hanley et al. 2011).

2.2 Social immersive exergaming

Games can be roughly considered to fall into core categories of entertainment, learning or serious games (designed for behavioral change, including training games) based on function, purpose, impact and outcome (Connolly et al. 2012). Exergames may be defined as a digital gaming experience where outcomes are achieved through physical exertion (Mueller et al. 2008). Exergaming has been applied to many rehabilitation contexts, due to its functionality of combining physical activities with game-inspired design mechanics.

According to several systematic reviews, exergames have been utilized to improve patient motivation and adherence (Kappen et al. 2019; Larsen et al. 2013; Reis et al. 2019), and also for elderly users, to maintain physical function, health, and independence (Skjæret et al. 2016). Specifically, it has been applied to motivate exercise adherence in nursing home residents (Bruun-Pedersen et al. 2014); improve the QoL of elderly people living with dementia (van Santen et al. 2018), and has been used to demonstrate improvement in balance performance (Bateni 2012); increase muscle strength (Larsen et al. 2013) and functional mobility as well as motivation (Smeddinck et al. 2015; Huang et al. 2019).

Social interaction and support have proved beneficial for conventional rehabilitation, motivation, well-being, program adherence, increased performance (Resnick et al. 2002; Hamari and Koivisto 2015; Feltz et al. 2011) and enjoyment (Ballester et al. 2012; Kaos et al. 2019; Goršič et al. 2017b). However, a recent systematic review of the effects of serious games on health, well-being and motivation for older adults (identified in the review as people over the age of 50), found only three (of 24) articles with a focus on social elements (Nguyen et al. 2017). Therefore, more research is required to investigate the impact of social interaction as a main contributing factor in exergames (Marker and Staiano 2015; Edwards et al. 2018).

Social games may be divided into categories affording coactive, cooperative, collaborative or competitive modes of play (Pereira et al. 2019). In a coactive mode, players perform tasks independently of each other; in a cooperative mode, players work together to complete tasks in different roles; in a collaborative mode, players work together to complete a mutual task while being in the same role; and in a competitive mode, victory in the game is contingent upon performing better than the other player(s) (Mace et al. 2017). However, this terminology is used inconsistently, especially regarding the difference between collaboration and cooperation (Pereira et al. 2019). Studies investigating the social elements of collaboration, coaction and competition have demonstrated that both collaboration and competition have the capacity to increase physical performance and motivation in rehabilitation programs (Goršič et al. 2017a; Kaos et al. 2019; Goršič and Novak 2016; Goršič et al. 2017b, 2019a; Mace et al. 2017; Pereira et al. 2019; Anderson-Hanley et al. 2011). Yet, Deci and colleagues argue that competition, under circumstances where the conditions are perceived as controlling, may decrease intrinsic motivation and may in fact undermine the feeling of relatedness (Deci et al. 1981; Reeve and Deci 1996). Attempting to beat another player for the sake of the end result is inherently extrinsic and has been shown to decrease intrinsic motivation for non-competitive individuals (Song et al. 2013). In line with this, competitiveness in games has been shown to lead to stress and aggressive behavior, which are highly undesirable in a rehabilitation context. Moreover, perceived competence can also be undermined if a player constantly tries and fails to best another player (Przybylski et al. 2014). Dolgov et al. (2014) showed that the type of game mode affects post-game social interaction. In their study, they compared cooperative and competitive games and showed that cooperative gaming leads to a significantly higher level of subsequent, spontaneous helpful behavior (Dolgov et al. 2014). Similarly, Pereira et al. demonstrated that the collaborative mode promoted empathy toward the co-player to a greater extent than competitive or coactive modes (Pereira et al. 2019).

In relation to serious games or exergames for older adults, research suggests that competitiveness peaks around the age of 50, and then steadily declines (Mayr et al. 2012). Indeed, Gajadhar et al. (2010) found that the interaction of older adults engaged in coactive activities is centered more on helping and supporting each other, in contrast to younger participants who tend to favor competition (Gajadhar et al. 2008), suggesting that games for older adults should refrain from competition altogether, to increase enjoyment and obviate a fear of failure (Nap et al. 2009; Gajadhar et al. 2010).

2.3 Collaborative exergaming for virtual rehabilitation

VR-based rehabilitation applications that use immersive technologies such as VR headsets or hand controllers tend to construct the training sessions around one or more of the following: placing the user in novel, interesting and enjoyable VR environments (de Bruin et al. 2010; Laver et al. 2017); providing the user with specific but meaningful tasks in the virtual domain, tailored to produce certain real-world movements (Bisson et al. 2007; Rizzo et al. 2014); allowing regulation of task difficulty to suit the user’s individual capabilities or desired performance; or providing real-time multisensory feedback on user performance (Laver et al. 2017). Combining these aspects in VR-based rehabilitation may provide a dynamic and (compared to the real-world rehabilitation setting) extraordinary rehabilitation or exercise environment, and change users’ experience of the activity from negative to positive (de Bruin et al. 2010). A rehabilitation-oriented exergame design should achieve a meaningful confluence between the virtual environment’s contribution to the experience of the exercise and the desired real-world behavior of the user. Regarding social interaction, it should be possible to succeed with designs following both cooperative (shared goal, different tasks with different roles) and collaborative (shared goal, mutual tasks and same role) game modes (Pereira et al. 2019). A cooperative game mode may inherently highlight inter-player differences. Collaboration arguably focuses more naturally on united effort, which is key for older adults (accounting for individual differences between players caused by different skill levels and motor and/or cognitive functioning) (Gajadhar et al. 2010). A collaborative design may also minimize explicit consequences of noticeable differences in performance between players. When both players contribute the same type of input to a shared pool, the shared sum determines their joint performance, instead of one player reaching a goal and the other not reaching it.

The study described in this article was conducted to investigate a virtual rehabilitation experience that aimed to affect the users’ motivation, engagement and performance through a collaborative task. The exergame was designed as a virtual tandem bike experience, where the shared task was the joint workload required to complete a biking challenge together. We investigated this collaborative VR-system with clients and carers or family members attending outpatient therapy to explore motivation, engagement, usability and client perceptions of the collaborative aspects of the VR application.

3 Methods

This study introduced a novel way to facilitate social interaction for VR-based rehabilitation purposes, in the context of immersive VR-based biking. The fundamental game design was a two-user biking challenge that placed users together on a shared virtual tandem bike, situated in a high-altitude mountainous environment on a gravel path (see Fig. 1). The task for the users was to travel the virtual gravel path, which formed a looping circuit. The circuit comprised a collection of elevations over its mountain top area, all differing in height and length. Depending on the slope angle of an ascent, the virtual tandem bike decreased its forward speed per real-world (bike) cadence, which simulated the typical biking situation of ’gearing’ down to better manage a steep ascent. The challenge was manageable for all participants, as the physical resistance on the exercise bikes was adjusted to accommodate individuals’ fitness levels. The physical bike resistance was not changed during a session.
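The slope-dependent ’gearing’ can be illustrated with a minimal sketch. The exact mapping used in the application is not reported in this section, so the linear attenuation and every constant below (`kmh_per_rpm`, `loss_per_degree`, `min_gain`) are hypothetical placeholders. The sketch only shows the general idea: both riders’ cadence feeds one shared speed, and that speed drops per unit of cadence as the ascent steepens.

```python
def tandem_speed_kmh(cadence_a_rpm, cadence_b_rpm, slope_deg,
                     kmh_per_rpm=0.1, loss_per_degree=0.03, min_gain=0.3):
    """Map two riders' cadences and the current slope to one shared speed.

    All constants are illustrative assumptions, not values from the study.
    """
    # Both riders feed one shared pool; individual shares are not exposed.
    pooled_cadence = cadence_a_rpm + cadence_b_rpm
    # Only ascents (positive slope) reduce the speed gained per pedal stroke.
    ascent = max(slope_deg, 0.0)
    # Clamp the gain so the bike never stalls completely on a steep hill.
    gain = max(min_gain, 1.0 - loss_per_degree * ascent)
    return kmh_per_rpm * pooled_cadence * gain


if __name__ == "__main__":
    print(tandem_speed_kmh(60, 60, slope_deg=0))
    print(tandem_speed_kmh(60, 60, slope_deg=10))
```

Under these assumed constants, riders must raise their pooled cadence on an ascent to hold the same virtual speed, and how the cadence is split between the two riders does not change the result.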

The mountain environment was chosen for its dramatic scenic qualities, and the dynamics of its (also dramatically) steep hills were hypothesized to provide accessible conversation topics for users unfamiliar with each other. The environment had rich scenery with flora (grass, flowers, rocks, wooden logs) and fauna (various birds), which users experienced together. Additionally, the elevations of the bike ride appeared quite steep and seemed both impressive and imposing. Users shared the experience of anticipating these ascents, conquering them together, and enjoying the reward afterward (the easy descent). Given the required effort and the considerable size of many elevations, overcoming them successfully would presumably provide users with a sensation of joint accomplishment, further marking the ride as a shared collaborative effort.

Fig. 1 The VE, path and tandem bike in the foreground, about to descend a hill. Participant avatars are visible as two floating bicycle helmets

The idea of the virtual tandem bike was produced in a co-creation workshop with physiotherapists (Høeg et al. 2020). The tandem bike is recognizable to most users and encourages a shared goal in the game. Offering individual virtual bikes would have permitted users to compete with each other, which would undermine the core concept of collaboration. Each individual performance contributed to one shared (tandem) speed, and by design, it was not made explicit how much each user contributed comparatively, as that was not the goal. In addition, seating both users in a row on the same bike allowed them to share the experience in close proximity to each other in the same virtual domain.

Users of VR-technology sometimes experience side effects, which are important to address when designing interventions. Discomfort arising from VR is commonly referred to as cybersickness, VR-sickness or virtual reality-induced symptoms and effects (VRISE) (Cobb et al. 1999; Sharples et al. 2008). In this study we introduced a bike-based VE, and it was presumed that the forward momentum would cause the user to experience vection (i.e., the illusion of self-motion). Additionally, frequent exposure to changes in velocity was likely to lead to visual-vestibular sensory mismatch and thus a higher risk of cybersickness (LaViola 2000). It was therefore necessary to use a high-fidelity system with low position tracking error, low latency and a high display frame rate (LaViola 2000). Other studies have also reduced cybersickness by manipulating perception, e.g., by using rest frames, i.e., visually stationary references that offer inertial cues (LaViola 2000). Examples include simple geometry (Nguyen-Vo et al. 2018) or more complex objects, such as a virtual nose (Whittinghill et al. 2015).

3.1 Study design

The study was approved by the Southern Adelaide Clinical Human Research Ethics Committee (application ID: BL02609). Informed written consent was gained from all study participants prior to data collection. The study was designed as a concurrent mixed methods usability study in which current or discharged outpatients from a public rehabilitation outpatient service in Adelaide, Australia, were asked to participate. Outpatients meeting the inclusion criteria were identified and invited to participate in the study by a clinician. The primary aim was to evaluate the usability and client perceptions of the collaborative VR-application, used in conjunction with an upright or recumbent exercise bike. As such, feedback on the VR-system was gathered from participants and their biking buddies (not from members of the research team). Another main interest was the collaborative VR experience: how useful patients found exercising together in a shared virtual space, how they perceived themselves and the other person in VR, how they understood the interaction between users, and what levels of motivation and engagement they experienced from the collaborative interaction. Also of interest were the perceptions and views of participants using the VR biking system, and their reactions to the VE design and avatar representation, including self-perceived embodiment and the perception of the co-player. Additionally, participants were asked whether the appearance and ’gearing’ effect of the virtual hills encouraged them to bike harder. In terms of feasibility, participants were asked about perceived barriers and facilitators to using the VR biking system and whether they experienced any side-effects (cybersickness).

3.2 Materials and apparatus

The VE was created using Unity3D version 2019.3.0f5, and the EasyRoads3D asset tool was used to create virtual paths (Unity Asset Store 2019a). The tool allowed for quick creation of a road loop that was malleable and easy to adapt. The road measured a total of 1930 m, so that one lap would take approximately 10 min to complete at a speed of 12 km/h. The participants’ pedaling rate was measured with an MPU9250 Inertial Measurement Unit (IMU) on an ESP32 TTGo TTAudio 1.6 development board that was placed together with a 3.7V 2000 mAh battery in a custom enclosure under the pedal. The IMU offers 9 degrees of freedom (DOF), measuring acceleration, orientation and the magnetic field of the earth. The sensors were connected to Unity via the Bluetooth Classic protocol (Unity Asset Store 2018b), broadcasting seven separate measurements: 3-axis acceleration, 3-axis orientation and the instantaneous acceleration in 3D space. For the VE application, only the 3-axis orientation was used to determine the rotation of the pedal in real time. Within Unity, the received data was passed through a Kalman filter (Kalman 1960) to remove noise and artifacts. The system transformed the participant’s cadence into forward momentum in the VE, with respect to the steepness of the road, so that the participant would have to increase cadence to maintain velocity during positive inclinations. This was done to examine to what degree audiovisual cues alone are able to make up for the absence of rolling resistance. Sounds of road friction (AudioJungle 2019) and wind-whistling were added as continuously looping sounds whose amplitude increased with biking speed. To simulate the gradient of the road, and the resistance exerted on the bike due to gravity, sound effects were added in the form of a gear change and increased tension on the chain and chain ring (Unity Asset Store 2018c). If the downhill speed of the bike was high and the angular velocity of the pedaling was close to zero (i.e., the participant was not pedaling), the familiar clicking sound of the freewheel pawls would commence. The two computers were connected via Wi-Fi using Photon Unity Networking 2 (PUN2) (Unity Asset Store 2018d), and the avatars were visualized as bicycle helmets (see Fig. 1), together with a tandem bike 3D model (Unity Asset Store 2018a, 2019b).
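To illustrate the kind of filtering mentioned above, the following is a minimal sketch of a scalar Kalman filter applied to a noisy stream of readings. It is not the implementation used in the application: the constant-state model, the class and function names, and the variances `process_var` and `measurement_var` are all assumptions chosen for illustration, and wrap-around of orientation angles at 360° is not handled.

```python
class ScalarKalmanFilter:
    """One-dimensional Kalman filter with a constant-state model (assumed)."""

    def __init__(self, process_var, measurement_var,
                 initial_estimate=0.0, initial_var=1.0):
        self.q = process_var        # how much the true value may drift per step
        self.r = measurement_var    # how noisy each sensor reading is
        self.x = initial_estimate   # current state estimate
        self.p = initial_var        # current estimate uncertainty

    def update(self, measurement):
        # Predict: the state is assumed unchanged, uncertainty grows by q.
        self.p += self.q
        # Correct: blend prediction and measurement using the Kalman gain.
        gain = self.p / (self.p + self.r)
        self.x += gain * (measurement - self.x)
        self.p *= (1.0 - gain)
        return self.x


def denoise(samples, process_var=1e-3, measurement_var=1.0):
    """Filter a sequence of readings, seeding the filter with the first one."""
    if not samples:
        return []
    kf = ScalarKalmanFilter(process_var, measurement_var,
                            initial_estimate=samples[0])
    return [kf.update(z) for z in samples]


if __name__ == "__main__":
    noisy = [9.0, 11.0] * 100   # hypothetical readings jittering around 10
    print(denoise(noisy)[-1])
```

A larger `measurement_var` relative to `process_var` gives a smoother but slower-reacting estimate, which is the usual trade-off when filtering pedal sensor data in real time.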

The VR-equipment included two Oculus Rift Consumer Version 1 (CV1) head-mounted displays (HMDs), connected to two high-end gaming laptops. One of the computers was a Dell Alienware with 16 GB RAM, an i7-7700HQ 2.80 GHz processor and an NVIDIA GeForce GTX 1060 GPU; the other was an MSI GS65 Stealth with 16 GB RAM, an i7-9750H 2.60 GHz processor and an NVIDIA GeForce RTX 2070 Max-Q GPU. The exercise bikes used in the experiment were recumbent bikes (Motomed Viva 2) and regular upright exercise bicycles (SCIFIT™ ISO7000).

3.3 Measurements

3.3.1 Demographic questionnaire

The demographic questionnaire provided information about participants, such as gender, age, highest education level, medical history, medication use, work experience, current or most recent occupation, and thoughts on technology use in therapies (including advantages, problems and specific concerns).

3.3.2 Questionnaire for use of electronic devices and gaming habits

This questionnaire consists of seven questions and was developed by members of the research team to gather information about the use of and experience with commonly available electronic devices and common computer related activities. Participants were also asked to indicate the approximate number of hours spent using these devices and performing the computer-related activities each week. The questionnaire was used to gauge participants’ experience with computers and video games in order to ascertain whether extensive computer experience is a prerequisite for perceived usefulness and satisfaction with VR in therapy.

3.3.3 Simulator sickness questionnaire (SSQ)

Developed by Kennedy et al. (1993), the SSQ is a brief measure comprising 16 questions used to determine the severity of simulator sickness. Although originally intended to measure motion sickness in military and aviation simulators, the SSQ is a popular tool to evaluate side-effects of VR-technology. The SSQ asks users to rate symptoms related to nausea (N), oculomotor disturbances (O) and disorientation (D) while using VR-systems, with ratings ranging from none (0), to slight (1), moderate (2) and severe (3) (Kennedy et al. 1993). Symptoms covered by the SSQ include general discomfort, fatigue, headache, eyestrain, difficulty focusing, increased salivation, sweating, nausea, difficulty concentrating, blurred vision, dizzy eyes open, dizzy eyes closed, vertigo and stomach awareness (Kennedy et al. 1993). In this study, we used the SSQ to assess symptoms pre- and post-exposure. Aggregated subfactors (N, O, D) and total score (TS) are presented as pre-, post- and change scores. SSQ total scores can be categorized as no symptoms (0), negligible symptoms (<5), minimal symptoms (5–10), significant symptoms (10–15) and concerning symptoms (15–20). A simulator that produces a score higher than 20 is considered a bad simulator (Stanney et al. 1997).
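As a hedged illustration of how such scores can be derived, the sketch below applies the subscale weights commonly reported for the SSQ (9.54 for N, 7.58 for O, 13.92 for D and 3.74 for TS) to raw subscale sums and then maps a total score to the categories listed above. The function names are hypothetical, the assignment of individual items to subscales is left to the caller, and because the published ranges share their boundary values, treating each range as half-open is an assumption of this sketch.

```python
# Weights as commonly reported for the SSQ (Kennedy et al. 1993).
N_WEIGHT, O_WEIGHT, D_WEIGHT, TOTAL_WEIGHT = 9.54, 7.58, 13.92, 3.74


def ssq_scores(raw_n, raw_o, raw_d):
    """Weighted scores from raw subscale sums of 0-3 item ratings."""
    return {
        "N": raw_n * N_WEIGHT,
        "O": raw_o * O_WEIGHT,
        "D": raw_d * D_WEIGHT,
        "TS": (raw_n + raw_o + raw_d) * TOTAL_WEIGHT,
    }


def categorize_total(total_score):
    """Map a total score to a symptom category (boundary handling assumed)."""
    if total_score == 0:
        return "none"
    if total_score < 5:
        return "negligible"
    if total_score < 10:
        return "minimal"
    if total_score < 15:
        return "significant"
    if total_score <= 20:
        return "concerning"
    return "bad simulator"


def change_scores(pre, post):
    """Post-exposure minus pre-exposure score for every subscale."""
    return {key: post[key] - pre[key] for key in pre}


if __name__ == "__main__":
    pre = ssq_scores(0, 0, 0)
    post = ssq_scores(1, 2, 0)   # hypothetical raw subscale sums
    print(change_scores(pre, post), categorize_total(post["TS"]))
```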

3.3.4 System usability scale (SUS)

The SUS is a 10-item Likert scale designed by John Brooke in 1989 as a simple measurement tool to provide a subjective assessment of usability (Brooke et al. 1996; Lewis and Sauro 2009). The SUS has been found to be a valid and reliable measurement of AT in clinical and research settings (Meldrum et al. 2012). The SUS comprises various questions relating to the effectiveness, efficiency and satisfaction of a system, whereby participants are asked to rate their satisfaction with the system on a scale from one (“strongly disagree”) to five (“strongly agree”) (Brooke et al. 1996).
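For readers unfamiliar with how the ten ratings become a single number, the sketch below follows the standard SUS scoring procedure: odd-numbered (positively worded) items contribute the rating minus one, even-numbered (negatively worded) items contribute five minus the rating, and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name and input format are assumptions for illustration.

```python
def sus_score(responses):
    """Compute a 0-100 SUS score from ten ratings given in item order (1-5)."""
    if len(responses) != 10:
        raise ValueError("the SUS has exactly 10 items")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("ratings must lie between 1 and 5")
        if item_number % 2 == 1:
            total += rating - 1      # positively worded item
        else:
            total += 5 - rating      # negatively worded item
    return total * 2.5


if __name__ == "__main__":
    print(sus_score([4, 2, 5, 1, 4, 2, 4, 1, 5, 2]))  # hypothetical ratings
```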

3.3.5 Motivation

Motivation was measured using Likert items from the intrinsic motivation inventory (IMI) designed for post-experimental use (4 items) and the subject impressions questionnaire (7 items). The questionnaire contained 10 items within the subscales of interest/enjoyment (4 items), effort/importance (2 items) and relatedness (4 items) (Ryan and Deci 2000). The statements were ordered randomly, and participants indicated agreement on a 7-point scale ranging from “not at all true” (1), “partially true” (4) to “very true” (7). The interest/enjoyment subscale is the only subscale measuring intrinsic motivation directly, while relatedness is used for studies with interpersonal interaction. An additional Likert item (“Performing this activity with another person made it easier”) was added to the questionnaire to inquire about the participants’ perception of performing the exercise activity with somebody else. The item was not from the IMI and therefore not included in the composite scoring.

3.3.6 Virtual embodiment questionnaire (VEQ)

The virtual embodiment questionnaire (VEQ), developed by Roth and Latoschik (2020), measures the user’s sense of embodiment based on three distinct factors: (1) acceptance of the virtual body (sense of ownership over body parts and acceptance of human likeness), (2) agency over the virtual body (control of movements) and (3) change in the perceived body schema (appearance, weight, size and width) (Roth and Latoschik 2020). The 12 items measure the level of agreement on a 7-point Likert scale between 1 (strongly disagree) and 7 (strongly agree).

3.3.7 Interpersonal interaction (IPI)

To measure the amount and quality of the interaction between the co-players, we used the 8-item interpersonal interaction questionnaire for observers (IPIQ-O), developed and validated by Goršič et al. (2019b). The measurement tool is an observer rating scale which measures the (verbal, non-verbal and para-verbal) interaction between two players in terms of amount of conversation, balance of the conversation, valence (positive or negative emotions), game relatedness and overall mood on a rating scale ranging from 1 to 5 in various continuums (see Table 2) (Goršič et al. 2019b).

3.4 Cadence

To measure whether the virtual hills caused an increase in performance, the sensor’s angular velocity \(\omega\) (deg/s) was logged every frame along with the current slope (measured in degrees). Since large interpersonal differences were observed, we included participant age and gender as predictors in an ordinary least squares (OLS) regression. Due to signal noise in the raw sensor data, we used Brown’s simple exponential smoothing with a smoothing factor of 0.2 to denoise the input signal. The script updated at a fixed rate of 50 frames per second (fps), so that cadence, measured in revolutions per minute (RPM), could be approximated as \(RPM \approx (\omega \times \text{fps} \times 60)/360\).
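The smoothing and the cadence conversion can be sketched as follows. This is an illustrative reading, not the study’s script: because the formula multiplies \(\omega\) by the frame rate, the sketch assumes that each logged value is the angular change per frame (in degrees), and the function names and sample values are hypothetical.

```python
def exponential_smoothing(samples, alpha=0.2):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    smoothed = []
    for value in samples:
        if not smoothed:
            smoothed.append(value)          # seed with the first observation
        else:
            smoothed.append(alpha * value + (1.0 - alpha) * smoothed[-1])
    return smoothed


def rpm_from_frame_rotation(omega_deg_per_frame, fps=50):
    """Convert per-frame pedal rotation (degrees, assumed) to revolutions per minute."""
    degrees_per_minute = omega_deg_per_frame * fps * 60
    return degrees_per_minute / 360.0


if __name__ == "__main__":
    raw = [7.0, 7.6, 6.9, 7.4, 7.1]         # hypothetical per-frame readings
    for omega in exponential_smoothing(raw):
        print(round(rpm_from_frame_rotation(omega), 1))
```

With a smoothing factor of 0.2, each new reading moves the estimate only one fifth of the way toward the new value, which suppresses frame-to-frame jitter at the cost of a short lag.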

3.5 Data analysis

The data were analyzed using MATLAB R2018B and Microsoft Excel 2016 (v16.0), mostly for descriptive statistics including central tendencies and dispersion. The OLS regression analysis was performed on the automatically captured data to examine whether the relationship between hill slope and the participants’ cadence was statistically significant. Demographic information (age and gender) was used as predictor variables. Figures were generated in MATLAB with the MATLAB Toolbox from the Institute of Sound Recording (IoSR) (Hummersone and Prätzlich 2016).

3.6 Participants and procedure

Potential participants were invited to bring a friend or family member to participate in the study with them. Participants were given the opportunity to use the VR-system and interact in the VE with a co-located co-player (a friend, family member or another patient). Both participants were asked to complete a series of questionnaires and give feedback about their experience. The goal of this user-centered methodological approach was to capture feedback on the user experience, toward further improvement of the application, in preparation for the use of this technology as an intervention in a pilot trial. The eligibility criteria for the buddy-participant were that they had to be over the age of 18 years, caring for or related to a client receiving outpatient services, and able to undertake 10–15 min of cycling activity. If the participant declined interest or was otherwise unable to invite a buddy of their own choice, a member of the research team was assigned as their buddy. Potential client participants as well as potential buddies were excluded if they had a medical history with contraindications to using VR, e.g., a diagnosis of epilepsy, a recent history of seizures or severe visual impairments (blindness, partial blindness and visual perceptual problems). Furthermore, participants were excluded if they could not understand verbal or written English, were unable to follow instructions or had physical limitations that prevented them from using an exercise bike, e.g., due to cardio-respiratory complications, lower extremity amputations or recent surgery. Participants who had problems with balance were assigned to use a recumbent bike. Other participants would use an upright bike.

Out of 37 potential patient participants, 30 were eligible to participate, and 23 patients were approached to determine initial interest in participating (Fig. 2).

Fig. 2 Flowchart of the recruitment process from screening to final inclusion. Four participants biked with a member of the research team, three participants biked with a friend or a family member, two participants biked with each other, and one participant dropped out after initially agreeing to participate

Participants were provided with a short introduction of the purpose of the study and their involvement, as well as potential side effects they might experience during VR-exposure. They were instructed to inform the researchers if they experienced any unpleasant physical symptoms or psychological discomfort. After providing informed consent, the participants answered a short demographic questionnaire, as well as a questionnaire on technology use and gaming habits. Before getting on the bike, the participants answered the SSQ questionnaire to establish a baseline measure (pre). Disposable hygiene covers were attached to the HMD before fitting straps and adjusting inter-pupillary distance (IPD) to provide the best experience for the participant. When both participants had signaled that they were ready, the application was started (Fig. 3).

Fig. 3 A participant biking with a member of the research team on recumbent bikes

4 Results

A sample of eight patient participants and three buddy participants consented to partake in the study. Participant characteristics are summarized in Table 1, which shows that participants were predominantly male (64%), with a mean (M) age of 60 years (±11 years) and various principal diagnoses and different levels of education reported as International Standard Classification of Education (ISCED-11).

Table 1 Participant demographic information

4.1 Participant demographics

Forty percent of the participants rated their computer skills as very experienced (\(n=4\)) and 20 percent as somewhat experienced (\(n=2\)). Thirty percent of the participants rated their computer skills as neutral (\(n=3\)) and 10 percent rated themselves as having no experience with computers at all (\(n=1\)). Two participants reported using a desktop computer for personal activities (1.6±3.5 h per week). Five participants used a laptop computer for personal activities (5.5±6.9 h per week).

Ninety percent (\(n=9\)) of the participants reported using a smartphone for personal activities for between 2 and 30 h per week (11.4±10.1 h per week). Four participants reported using a tablet device between 1 and 10 h per week (3.0±4.4 h per week). The most common activities included emails (\(n=8\)), online shopping (\(n=8\)), online banking (\(n=8\)), general internet browsing (\(n=8\)), entertainment (\(n=7\)) and social networking (\(n=6\)). Other activities undertaken by participants included online chat interactions (\(n=4\)), looking up health information (\(n=5\)), listening to podcasts (\(n=5\)), reading news (\(n=5\)), education/training (\(n=3\)), video chat (\(n=3\)), video games (\(n=3\)) and smartphone games (\(n=3\)).

Only two participants were currently working, and they spent an average of 3 and 10 h, respectively, on a desktop PC for work; one participant used a laptop for work for a total of 35 h per week. Work activities completed on desktop and laptop computers included emails and general word processing tasks.

All participants, with the exception of one, completed the bike ride. Two pairs of participants decided jointly to continue the ride after completing a lap of the circuit, going for another lap. On average the participants biked a distance of 2692 ± 731.4 m, which is 750 m longer than a single lap of the circuit. Participants biked for 10.6±2.6 min with an average speed of 14.6±5.3 km/h. Overall, the system performed well in terms of frame rate, with 71.5±26 fps for the GTX1050 and 87.9±2.5 fps for the GTX2070. The large standard deviation (SD) for the GTX1050 is due to an inconsistency in one session, which led to an average frame rate of 22 fps in that session.

4.2 Motivation

Fig. 4 Motivation ratings for individual participants on IMI-subscales (satisfaction, effort and relatedness), based on seated position (front or back). Note that the rating scale is from 1 to 7, but only observed intervals between 5 and 7 are shown here. The red line indicates median value, and the black ‘x’ is independent samples. The inner box shows the interquartile range (IQR), and the density plot indicates distribution of the data

Motivation scores were high in all categories: interest/enjoyment (6.5 ± 0.7), effort/importance (6.1 ± 0.7) and relatedness (6.4 ± 0.6). Specifically for interest/enjoyment, when asked if they enjoyed doing this activity, one participant answered 5, three participants 6 and seven participants answered 7 (“very true”). For effort/importance, when asked if it was important for them to do well at the task, two participants answered 4 or 5, three participants answered 6 and six participants answered 7 (“very true”). Noticeable differences in motivation scores were observed based on seating position. Participants in the back of the virtual tandem bike generally rated the experience higher than those sitting in the front (see Fig. 4). The results by seat position were \(\hbox {enjoyment}^\mathrm{front}\) (6.2 ± 0.7) / \(\hbox {enjoyment}^\mathrm{back}\) (6.8 ± 0.6), \(\hbox {effort}^\mathrm{front}\) (5.8 ± 0.7) / \(\hbox {effort}^\mathrm{back}\) (6.6 ± 0.4) and \(\hbox {relatedness}^\mathrm{front}\) (6 ± 0.5) / \(\hbox {relatedness}^\mathrm{back}\) (6.9 ± 0.3).

When asked if performing this activity with another person made it easier, four participants answered 5 or 6 and seven participants answered 7 (“very true”).

4.3 Interpersonal interaction

The IPI observation tool measured the amount of conversation, balance, valence and game-relatedness and the results are reported in Table 2. Although the IPI varied between participants in terms of quantity, it was seldom completely absent. The conversation was mostly balanced between players and with positive valence. There was a slight difference in the balance based on seat position. Generally participants in the back talked more (balance: 3.8 ± 1.4) than those in the front (balance: 3.3 ± 1.3). Nearly all participants only discussed things related to the game, and the overall mood was very positive. The exception was participant P06 who felt sick after a few minutes and had to quit biking. Before stopping, the IPI between the players was very unbalanced. While participant P07 conversed positively (positive valence), about things game-related, the responses from P06 were more averse (negative valence).

Table 2 Different subscales of the IPIQ-O along with the associated scale continuums. The calculated mean±SD for all participants are shown in the rightmost column

In the beginning of one session, the conversation focused on copresence as one participant asked the other: “are you also in here with me?” to which the other participant responded with a “here!” and raised a hand and waved. This gesture, however, was not reproduced in the VE. A difference was also observed between buddy types, i.e., between participants biking with a researcher and those biking with a family member or a friend, both in amount of conversation (\(M=2\) vs. \(M=4.1\)) and balance of conversation (\(M=2.5\) vs. \(M=3.8\)).

4.4 Virtual embodiment

As for VEQ scores related to acceptance of the virtual body (OW), the majority either strongly agreed (34%), agreed (32%) or somewhat agreed (9%) that, e.g., it “felt like the virtual body was my body”. As for agency (AG), a majority either strongly agreed (34%), agreed (29%) or somewhat agreed (14%) that, e.g., it “felt like I was controlling the movements of the virtual body”. Finally, regarding change (CH), a majority responded that they strongly disagreed (52%) or disagreed (11%) that, e.g., it “felt like the form or appearance of my own body had changed”. The mean VEQ scores for all participants were OW (5.6±1.3), AG (5.6±1. and CH (2.5±1.6). Figure 5 shows the central tendencies of responses in a box plot with kernel density plots to indicate distribution of the data.

Fig. 5 Boxplot with kernel density plot for VEQ scores (Roth and Latoschik 2020): Ownership (OW), Agency (AG) and Change (CH). The red line indicates median value, and the black ’x’ is independent samples. The inner box shows the IQR, and the density plot indicates distribution of the data

Fig. 6 Distribution of responses for the subscores: nausea (N), Oculomotor (O), disorientation (D) and total score (TS). SSQ questionnaire was distributed both before (pre) and after (post) the test. The red line indicates median values, and the black ’x’ is independent samples. The inner box shows the interquartile range (IQR), and the density plot indicates distribution of the data

4.5 Usability

Overall, participants strongly agreed (45%) or agreed (36%) that they would like to use the system frequently, whereas one person disagreed and another one was neutral. None of the participants found the system unnecessarily complex, and 9% agreed that it was easy to use and 91% strongly agreed. Participants did not agree on the technical skills required to operate the system. When asked if they thought they needed the support of a technical person to be able to use this system, 27% strongly disagreed, 18% disagreed, while 18% agreed and 36% strongly agreed. Regarding confidence, 27% agreed that they felt confident about using the system, and 73% strongly agreed. The lowest calculated SUS-score was given by P06 (67.5), and the highest scores were given by P08 and B08 (100). The average score was 85±5.
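For readers unfamiliar with how a SUS-score is derived from the ten item ratings, the standard scoring rule can be sketched as follows. The function name and the example ratings are hypothetical and do not correspond to any participant's actual responses; the rule itself (odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is multiplied by 2.5) is the conventional SUS procedure.

```python
def sus_score(ratings):
    """Compute a SUS score (0..100) from ten ratings on a 1..5 scale, item 1 first."""
    if len(ratings) != 10:
        raise ValueError("SUS requires exactly ten item ratings")
    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        if item_number % 2 == 1:
            total += rating - 1   # positively worded (odd) items
        else:
            total += 5 - rating   # negatively worded (even) items
    return total * 2.5


# Hypothetical example: strongest agreement with positive items,
# strongest disagreement with negative items.
best_case = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])
neutral_case = sus_score([3] * 10)
```

Because each item contributes between 0 and 4 points, scores move in steps of 2.5, which is why values such as 67.5 appear in the results.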

4.6 Cybersickness

SSQ-scores were calculated based on official guidelines (Kennedy et al. 1993). Sub-scores for symptoms of nausea (N), oculomotor (O) and disorientation (D) are reported as pre/post-test mean scores (± standard deviation), as well as change scores: \(\hbox {N}^\mathrm{pre}\) (6.1±9.8) and \(\hbox {N}^\mathrm{post}\) (25.4±16.1), \(\hbox {O}^\mathrm{pre}\) (4.1±5.2) and \(\hbox {O}^\mathrm{post}\) (11.8±11.3), \(\hbox {D}^\mathrm{pre}\) (5.1±9.4) and \(\hbox {D}^\mathrm{post}\) (9.3±13.6), and \(\hbox {TS}^\mathrm{pre}\) (4.8±3.8) and \(\hbox {TS}^\mathrm{post}\) (14.5±9.2) (see Table 3).

Table 3 SSQ scores for all subscales: nausea (N), oculomotor (O), disorientation (D) and total score (TS) reported as mean±SD in pre-, post- and change scores

The SSQ measures symptomatic effects of motion sickness, which include sweating as an early sign of onset. For applications such as this, however, which include an element of exercise, sweating is a desired outcome, and sweating (in the nausea subscale) is therefore considered a confounder. The SSQ results are thus shown as pre, post and post (adjusted). The adjustment was made by retaining item (7) at the pre (baseline) level (see Fig. 6). Furthermore, noticeable differences in symptoms were found between participants seated in the front of the bike, \(\hbox {TS}^\mathrm{post}\) (17.5 ± 9.1) and change scores \(\hbox {N}^\mathrm{front}\) (27 ± 12.7), \(\hbox {O}^\mathrm{front}\) (6.3 ± 14.7), \(\hbox {D}^\mathrm{front}\) (13.9 ± 12.5) and \(\hbox {TS}^\mathrm{front}\) (12.5 ± 6.5), and those seated in the back, \(\hbox {TS}^\mathrm{post}\) (7.5 ± 6.5) and change scores \(\hbox {N}^\mathrm{back}\) (5.7 ± 5.2), \(\hbox {O}^\mathrm{back}\) (0 ± 5.4), \(\hbox {D}^\mathrm{back}\) (2.8 ± 6.2) and \(\hbox {TS}^\mathrm{back}\) (2.2 ± 5).
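The scoring and the sweating adjustment can be illustrated with a short sketch. It is not the scoring script used in the study. The item-to-subscale mapping and the weights (9.54, 7.58, 13.92 and 3.74) follow the scheme commonly reported for Kennedy et al. (1993) and should be checked against the original before reuse; the function names and the example ratings are hypothetical.

```python
# Item numbers (1..16) per subscale, as commonly reported for the SSQ.
N_ITEMS = (1, 6, 7, 8, 9, 15, 16)       # nausea
O_ITEMS = (1, 2, 3, 4, 5, 9, 11)        # oculomotor
D_ITEMS = (5, 8, 10, 11, 12, 13, 14)    # disorientation
SWEATING_ITEM = 7


def ssq_scores(ratings):
    """ratings: dict mapping item number (1..16) to a severity rating 0..3."""
    n = sum(ratings[i] for i in N_ITEMS)
    o = sum(ratings[i] for i in O_ITEMS)
    d = sum(ratings[i] for i in D_ITEMS)
    return {"N": n * 9.54, "O": o * 7.58, "D": d * 13.92, "TS": (n + o + d) * 3.74}


def adjust_for_sweating(pre, post):
    """Return a copy of the post ratings with the sweating item held at its pre level."""
    adjusted = dict(post)
    adjusted[SWEATING_ITEM] = pre[SWEATING_ITEM]
    return adjusted


# Hypothetical participant: no symptoms before, sweating and mild dizziness after.
pre = {i: 0 for i in range(1, 17)}
post = dict(pre)
post[7] = 2    # sweating
post[12] = 1   # dizziness with eyes open

post_scores = ssq_scores(post)
adjusted_scores = ssq_scores(adjust_for_sweating(pre, post))
```

In this example the adjustment removes the entire nausea score, which shows how strongly a single exertion-related item can drive the unadjusted subscale.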

4.7 Cadence and slope

The OLS model estimates how the hill slope (independent variable) affects cadence (dependent variable); the results are presented in Fig. 7 and Table 4 for both individual participants and a merged dataset of all observed data points (\(N = 5180\)). Data for participants P01 and P02 were missing due to a flaw in the code, which was not discovered during the first two trials. As shown in Fig. 7, there are large interpersonal differences in achieved cadence, based on the intercept alone (baseline coefficient when the slope is 0 degrees). Similarly, there were differences in how much each person was affected by the slope, but there was a significant increase (\(\alpha \le 0.01\)) in cadence for all participants individually and combined (0.43 increase in RPM per degree increase in slope).
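The structure of the combined regression (cadence explained by slope, age and a gender dummy) can be sketched as below. The analysis in the paper was run in MATLAB; this standard-library Python version is only an illustration. The data are synthetic and noise-free, and the generating coefficients are hypothetical values chosen for the example, not the estimates from Table 4.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical participants as (age, female dummy) and hypothetical slopes in degrees.
participants = [(50, 0), (60, 1), (70, 0), (55, 1)]
slopes = [0.0, 5.0, 10.0]
TRUE_BETA = [90.0, 0.5, -0.4, -5.0]  # intercept, slope, age, female (illustrative only)

X, y = [], []
for age, female in participants:
    for slope in slopes:
        row = [1.0, slope, float(age), float(female)]  # leading 1.0 is the intercept term
        X.append(row)
        y.append(sum(b * v for b, v in zip(TRUE_BETA, row)))

intercept, b_slope, b_age, b_female = ols(X, y)
```

Because the synthetic data contain no noise, the fit recovers the generating coefficients; with real sensor logs the same design matrix would be used, and standard errors and significance tests would have to be computed in addition.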

Fig. 7 Exercise effort of the participants measured in approximated cadence (RPM). Both sensor data and terrain slope (degrees) were logged every second in the system. The intercept (in Table 4) is the y-intercept of cadence, which approximately indicates the speed at a 0-degree surface slope. The fitted model indicates a positive trend, i.e., that all participants accelerated their cadence as the slope increased. The figure also shows the great heterogeneity in the participants’ cadence

Table 4 Ordinary least squares (OLS) regression model of the effect of hill slope (independent variable) on cadence (dependent variable), measured in RPM, for individual participants ( = female) and combined regression score for all participants. Intercept is the baseline coefficient value, i.e., (\(x=0\)), slope is the slope coefficient, i.e., increase in RPM per degree (\(^{\circ }\)) of incline. Age has been added as a predictor, and gender as a dummy variable (0/1) to explain interpersonal differences

4.8 Post-test interview

When interviewed about their experience, participants overwhelmingly agreed that the experience was enjoyable and engaging:

I thought it was great ... It’s lovely and the scenery is beautiful ... I thoroughly enjoyed it (P5)

Participants discussed challenges that they had with their usual therapy sessions, including difficulty finding motivation to attend and to complete exercise sessions, feeling that the exercises were boring and tedious, and feeling as though time was moving slowly as they were watching the clock and counting repetitions or counting down minutes.

No. It’s like ... you know you have to do it, and you’re not motivated, and you don’t talk to anyone, really. You just get on there, do what you need to do. And it is tedious. And sometimes there’s days where you don’t want to come in. And I think that’s the biggest problem, is, yeah, motivation to keep going. (P01)

When asked what they liked about the experience, participants stated that they enjoyed it and it was fun: “All of it. It was really good. I can’t wait to tell my husband!” (P5) and “Well, I was looking forward to riding today because the idea interested me a lot and it fulfilled all of my expectations. In fact, it was better than I thought it would be” (P8). When asked, the majority of participants reported that there was nothing they disliked about the experience: “I can’t think of anything I didn’t like!” (P5).

All but one participant stated that they would like to use the system again as part of their therapy. The participant (P6) who stated they would not like to use the system again experienced dizziness, stating that the hills and corners within the virtual environment gave a sensation of being on a roller coaster. The reasons provided by participants who would like to use the system again included the concept of teamwork in therapy, having someone to share the experience with, and reducing the tediousness of cycling activities. Many participants compared the experience with their current biking therapy activity, stating that the system offered additional engagement and motivation compared to existing therapy:

[with existing therapy] you’re more or less, you know, other people are around, but you don’t interact with them. Whereas this here [the buddy biking system], interacting, you’re working together as a team. (P1)

Participants agreed that the Buddy biking experience encouraged collaboration and teamwork: “I enjoyed the collaboration” (P3) and “I liked having a co-player” (P5). The presence of a co-player encouraged interaction compared to cycling independently. One patient participant reported that they thought it would be more fun to complete the task as a competitive activity against another player. The rest of the participants, however, agreed that they liked the idea of collaboration over competition. Two patient participants agreed that the addition of a competitive element did not appeal to them “Well, I’m not competitive at all, so, I wouldn’t do it.” (P7) and “I don’t have a competitive bone in my body. No. It wouldn’t interest me at all.” (P6). Another patient / family pair were not interested in the addition of a competitive element “.. you’re talking to the wrong people. I think it’s very nice to have the cooperative, um, you know, the tandems are a great idea” (P8). When prompted for additional information, the pair stated “.... we’re in the wrong age bracket.” (P8), indicating that one participant was never very competitive and another stated that they were more competitive when they were younger, “I was pretty competitive but I’m retired from competition, haha.” (P8). For the buddy biking task the pair agreed “We’re trying to encourage each other to get to wherever it is” (P8).

The participants also mentioned that the VR-system enabled distraction and reduced the sense of time spent cycling. Most participants liked the scenery and three participants specifically talked about experiencing an increased sense of presence (the sense of being there) within the virtual environment: “... and it felt like part of, you’re out in the wilderness, just riding along. So yeah, I pretty much ignored that. So, to me it felt real.” (P01).

The most commonly reported areas for improvement in the experience were the perceived mismatch between the physical bike and the visual feedback in the VE, the inclination of the hills, and the view of the buddy rider. Participants reported a sense of mismatch between the physical bike and the VE in a number of ways. One participant could not specifically describe what led to the mismatch but stated “... but I felt disconnected from the bike for some reason” (P1). Other participants could verbalize the cause of mismatch as being the discrepancy between the resistance of the bike on an incline and said person’s body representation and posture in the VE. Three participants stated that they would like to have seen their virtual hands on the handlebars. Two participants reported wanting to be able to turn the handlebars of the physical bike in order to steer.

4.9 Therapist questionnaire

During data collection, the Buddy Biking system was presented and demonstrated to physiotherapists at the clinic. A total of eight physiotherapists (57% women; avg. age 31±5; with 10±5 years of practice) participated. 50% were familiar with clinical use of VR and 66% had tried VR before. The therapists answered an open-ended questionnaire about their attitude toward using technology in therapy. Moreover, the questionnaire contained open-ended questions related to the therapists’ view on the perceived usefulness and relevance of the Buddy Biking system, as well as whether or not they believed it could be used to motivate patients. The overall feedback was positive, and the therapists expressed enjoyment with the experience, the scenery and the collaborative element. A few suggested that it might get repetitive with repeated use, and that shared goals, tasks, feedback and rewards could be beneficial to retain motivation. When asked what patient populations they would suggest using it for, most answered patients who are “difficult to engage” that dislike cardiorespiratory training, pulmonary rehabilitation and endurance training. On the other hand, the therapists would not suggest using it for patients with dizziness, vestibular or cerebellar conditions.

5 Discussion

In this study we aimed to gather user feedback on a novel VR experience which allowed patients to collaborate with family members, friends or other patients. A central requirement was that interpersonal differences would affect the overall goal of completing the track and that the environment would facilitate only collaboration. Therefore, to remove competitive elements within the experience, we incorporated a tandem bike, rather than separate bikes. Two separate bikes would presumably instill a higher degree of competition between the players. Furthermore, the environment was deliberately designed to reinforce collaboration by having them overcome the rising slopes together. It was also interesting to observe to what degree the appearance of virtual hills would encourage players to increase their cadence, while simultaneously decreasing the speed. Yet the key aim was to evaluate how the participants reacted to, and experienced, the social elements.

The results of the study indicate that the participants overall understood the game mechanics and the social aspects while playing, mostly because they shared the virtual environment with each other. The players reported that they were both challenged and engaged and enjoyed the social aspect as well as the virtual environment. All participants expressed that they found the social aspect enjoyable and motivating and that it was easier to complete the task when collaborating with a biking buddy, and most of them would use the program again given the opportunity.

Motivation, subjectively rated by participants post-test, was generally high in IMI-subscales: satisfaction, effort and relatedness. However, results may have been confounded by the novelty effect, and the retention of motivation could very well decline with repeated use. The IMI questions asked were selected and modified to fit the purpose. Although this is in compliance with official guidelines, the number of items should be extended to increase reliability. Some of the items should be removed in future studies as they were invalid in the context, e.g., one of the questions in relatedness was “I think it’s likely that this person and I could become friends”, which was irrelevant for participants who had an invited buddy (either friend or family), or for those biking with a member of the research team. Alternative validated tools could be used to measure different types of motivation, e.g., the Self-Regulation Questionnaire (SRQ) (Peters et al. 2018) and specifically for measuring basic psychological needs, e.g., Basic Psychological Need Satisfaction Scale (BPNSS) (Ryan and Deci 2000), as well as need frustrations (Chen et al. 2015).

Although the sense of sharing the task was apparent to most participants, the sense of sharing the virtual space was less obvious. Most frequently, the participants with lower IMI-scores (for relatedness), as well as less balanced conversation (IPI), were positioned on the front seat. This is most likely because they were constrained by the physical steering prop (the exercise bike), and could not turn around and see the participant behind them. The representation of the avatars using a floating bicycle helmet was not sufficient to gain a sense of sharing the space with a co-player. Future studies should explore how to improve the location and appearance of the coactors, for example by placing them next to each other on a side-by-side bicycle, as well as implementing more explicit conjunctive tasks that participants can complete together.

VEQ scores were generally high in OW and AG and low in CH (see Fig. 5), which implies that participants accepted the virtual body, felt that the control of their own movements was appropriately represented in the VE, and that their body schema had not changed. Since the participants experienced abstract types of embodiment in the form of bicycle helmets and the bike itself, we argue that it was likely that the participants were experiencing insufficient sensorimotor contingencies (SMC) (Slater 2009). However, for most of the participants this was their first experience with VR, and without knowledge of full-body immersive VR, which is capable of delivering a holistic and coherently embodied and perceptual experience, abstract avatars may be sufficient. In the post-test interview, a few participants did point out that it felt “weird” not having their hands visualized. Some participants also pointed out that they experienced what can be interpreted as an inconsistency in SMC, as they tried to lean on the bike to move their center of gravity as the road was turning. The virtual bike did not respond to this behavior, but of course neither does the exercise bike. The comment is interesting, as it suggests that visual perception alone evokes a sensorimotor response, as previously demonstrated in the famous rubber hand experiment (Botvinick and Cohen 1998). It can also be argued that embodiment is elicited through the mirroring of the other participant, i.e., he/she wears a bicycle helmet, so I am likely to wear one as well. Overall, in line with the tendency seen in SSQ and IMI scores, OW and AG were higher for participants situated in the back, and CH was lower.

Some participants were asked how long they thought they had biked for, and everyone underestimated the time passed (at most by 50%). The loss of sense of time has previously been described by Sanders and Cairns (2010); however, the effect is poorly studied in VR. One participant was aware that he had lost his sense of time in VR, and compared it to the therapy session earlier the same day:

When I was doing it [therapy] this morning, it was like, ’Oh, I got three more minutes!?’. I didn’t wanna finish, but with this I had no, there was no ’time’. It was ... just enjoying it. And so, when you sit, I did six kilometers. Um, yeah, that was a surprise to me, because I can only do up to four! - P01

A recent study investigated how time misperception can be manipulated by subtle environmental cues within VEs, attributing the effect to increased cognitive load and system immersion (Schatzschneider et al. 2016). Another recent study suggests that time compression effects may be caused by altered affective states rather than the addition of immersive technology (van der Ham et al. 2019). For exercising and rehabilitation purposes, this effect is highly interesting, as it could potentially increase adherence and training output due to a prolonged exercise duration. Furthermore, as remarked by P01 above, the realization that one’s actual exercise capacity is greater than one’s perceived capability could increase the sense of competence.

Although the tandem bike hampers the option of competing within the VE, competition cannot be said to be completely eliminated. The bicycle task, albeit on a tandem bike, may still arguably be a conjunctive task, i.e., the outcome is determined by the least capable participant, which in turn could lead to a performance increase by the inferior member to avoid appearing weak (Weber and Hertel 2007; Feltz et al. 2011). Although sight might be obstructed by the HMD, auditory feedback from the exercise bike (increasing cadence) or verbal encouragement (which was observed) may encourage the other person to pick up the pace as well. This is not necessarily detrimental, as intrinsic motivation is only undermined by competition when it is involuntary (Deci et al. 1981; Reeve and Deci 1996).

For social psychology the idea of a VE centered around tasks on a tandem bike may also be relevant. Early studies into social facilitation found that competition is extremely difficult to contain in coaction studies, and many of the early studies are confounded by this effect (Guerin and Innes 1993). It can be very difficult to get two people working on the same task without them trying to compare performance levels (Triplett 1898). We believe that VR has the potential to reduce some of these confounders, especially when constrained to tandem bike collaboration.

It has previously been demonstrated that exposure duration, frame rate (Murata 2004) and visual optic flow (Chen et al. 2016) affect the amount of cybersickness experienced. The use of HMDs frequently results in higher SSQ scores, and even more so when vection is included (e.g., during locomotion). The reported levels of discomfort measured with the SSQ were generally lower than anticipated. This may be partly due to design choices made, i.e., having a static reference in the form of the bicycle and helmet and having stable changes in velocity. Compared to previous studies using VR and exercise bikes, the SSQ-scores reported in this study were lower (Katsigiannis et al. 2018; Mittelstaedt et al. 2018). In one session the frame rate was subsequently determined to have decreased to an average of 22 fps. However, this was not detectable in SSQ scores, nor was any discomfort verbally reported by the participant. Surprisingly, SSQ change scores were more than 10 points higher for participants situated in the front. The sample size and general lack of an experimental design mean that we do not and cannot infer causation. However, we speculate that the difference observed may be attributed to the fact that the rear-seated participant can see more of the bicycle and the other participant’s avatar, which in turn serves as a stable reference. It is an interesting result and may be indicative of a general phenomenon in favor of the stable reference theory (LaViola 2000; Whittinghill et al. 2015), and should be explored in future studies. Our decision to adjust SSQ scores may have undermined the validity of the score itself. Nevertheless, we reasoned that it was necessary to eliminate the confounding effect, i.e., that sweating was a direct symptomatic response to exertion rather than an early onset of nausea (Kennedy et al. 1993).
The SSQ is a standardized tool for measuring the amount of cybersickness in VR-applications, we argue that, in future studies it should be customized to validly measure physiological symptoms when utilized in the context of immersive exergames. SUS-scores were also acceptable. According to the acceptability rating for interpreting SUS-scores, anything lower than 50 is unacceptable, between 50 and 70 is marginal, and from 70 to 100 is acceptable, and ratings between 80 and 90 receiving an adjective rating of excellent (Bangor et al. 2009). However, measuring usability in this context is not without shortcomings, as none of the participants had to interact with the interface itself. If SUS-scores were measured in a situation where the participants had to operate the system themselves, it would not be surprising to achieve a marginal score instead.
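
As an illustration of the acceptability ranges cited above, the following minimal sketch maps a SUS score to its range. The function name is hypothetical, and the handling of the exact boundaries (50 counted as marginal, 70 as acceptable) is our reading of the wording "lower than 50" and "from 70 to 100", not something specified beyond that.

```python
def sus_acceptability(score: float) -> str:
    """Map a SUS score (0-100) to the acceptability ranges described in the text.

    Boundary handling is an assumption: 50 is treated as marginal
    and 70 as acceptable.
    """
    if not 0 <= score <= 100:
        raise ValueError("SUS scores range from 0 to 100")
    if score < 50:
        return "unacceptable"
    if score < 70:
        return "marginal"
    return "acceptable"


if __name__ == "__main__":
    # Example scores chosen for illustration only; they are not study data.
    for example_score in (45.0, 62.5, 85.0):
        print(example_score, sus_acceptability(example_score))
```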

For the detected cadence, it was observed that all participants increased their cadence with slope (see Fig. 7 and Table 4). The OLS regression model indicates that the appearance of a virtual slope, along with an explicit decrease in biking speed, affected the cadence (measured in rpm). On average, participants increased their cadence by 0.43 rpm per degree of incline. This means that the average cadence of 70 rpm increased significantly by 4.3 rpm at a 10\(^{\circ }\) slope. Interpersonal differences were significantly explained by age, which corresponded to a 1.46 rpm decrease in cadence per year, and by sex, with a 29.6 rpm lower cadence for female participants (see Table 4). The results are interesting as they suggest that audiovisual stimuli alone are sufficient to increase performance in the absence of kinesthetic stimuli (due to a lack of rolling resistance). This effect could potentially also be utilized to increase performance in interval training, e.g., for pulmonary or cardiac rehabilitation.
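
The slope effect reported above can be read as a simple linear relationship. The sketch below reproduces the arithmetic using only the two figures given in the text (an average cadence of 70 rpm and 0.43 rpm per degree of incline). The names are hypothetical, and treating 70 rpm as the cadence on level ground is a simplifying assumption. The age and sex terms of the model are deliberately left out, since the full model specification is given in Table 4 rather than here.

```python
# Values taken from the text; names are illustrative only.
BASELINE_CADENCE_RPM = 70.0    # average cadence, assumed here to apply at 0 degrees
SLOPE_COEF_RPM_PER_DEG = 0.43  # reported cadence increase per degree of incline


def expected_cadence(slope_deg: float) -> float:
    """Expected average cadence (rpm) at a given virtual slope in degrees."""
    return BASELINE_CADENCE_RPM + SLOPE_COEF_RPM_PER_DEG * slope_deg


if __name__ == "__main__":
    for slope in (0, 5, 10):
        print(f"{slope:>2} deg: {expected_cadence(slope):.1f} rpm")
```

Under these assumptions, a 10-degree slope gives 70 + 0.43 × 10 = 74.3 rpm, which matches the 4.3 rpm increase stated above.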

6 Conclusion

Our research provides initial support for the concept of tandem bike-based collaboration in immersive VEs, and the findings form the groundwork for future studies on the concept. All participants indicated that they found the social aspect enjoyable, and that it was easier to complete the task when collaborating with a biking buddy. Nearly all participants pointed out that they would like to use the system again if they were given the opportunity. Participants generally lost track of the exercise duration and biked with a higher cadence when presented with visually congruent feedback on inclines. Real-world interaction between co-players varied in quantity and valence. However, the sense of copresence was limited by physical constraints and the abstract avatar representation. Future studies should explore how to improve the location and appearance of avatars, as well as implement joint tasks to emphasize collaboration.