Improving VIP viewer gaze estimation and engagement using adaptive dynamic anamorphosis
Introduction
Gaze has been shown to matter for many aspects of social interaction, such as deictic referencing and viewer engagement (Ens et al., 2019; Roberts et al., 2013). When presenting virtual agents or characters, modelling and delivering gaze targets accurately is crucial. From entertainment and art to advertisement and information visualization, 2D displays such as flat monitors are ubiquitous. However, they impose severe limitations on the accurate preservation of gaze direction. Notably, 2D displays are associated with several powerful effects and illusions, most importantly the Mona Lisa effect, where the gaze of the projected head appears to follow the viewer regardless of viewpoint (Al Moubayed et al., 2012; Hancock et al., 2009).
As reviewed in the next section, a variety of light field displays, 3D display hardware, and virtual reality systems can eliminate the Mona Lisa effect and provide accurate gaze direction, but at greater hardware cost (Jones et al., 2009) or limited to a single user (Lee, 2007). In this paper, we adapt and extend the earlier dynamic anamorphosis approach (Arthur et al., 1993; Ravnik et al., 2014), which adapts the image to the changing position of the observer so that, wherever the observer moves, they see the same undeformed image. We propose an adaptive dynamic anamorphosis method in which selected objects are rendered from the perspective of a selected viewer (the VIP viewer), while the rest of the scene is rendered with a fixed, normal perspective. The VIP viewer can be selected for the anamorphic experience using face recognition or some other distinguishable tracked feature used for identification (e.g., Fathi et al., 2012). The method better preserves nonverbal cues, including gaze, and we show how it improves the viewer experience. Other viewers in the room can still engage with the displayed scene; only the selected objects will be slightly deformed from their viewpoint. Given that studies of human perception have shown that the adjustment to oblique viewing is achieved before the contents of the image are interpreted (Man and Vision, 1982), we propose that subtle variations in perspective do not significantly interfere with the engagement of non-selected viewers while improving the VIP viewer's experience.
We evaluated the effectiveness of our method by measuring viewers' ability to accurately judge which target each character is gazing at, their ability to discriminate whether each character is looking directly into their eyes, and their level of engagement while watching characters perform movements. Results revealed that the accuracy of object-focused gaze estimation varies across audience members (VIP viewer and non-VIP viewer), viewer position, and character location. The clear trends are that estimates for the VIP viewer are always superior, and that there is little difference between the VIP and non-VIP experience at the central position. We also examined characters placed at off-center locations on the screen, a case not explicitly addressed by previous work: the farther a character is from the viewer, the greater the discrepancy in estimation. For mutual gaze, we found the VIP viewer could always distinguish between being looked at and gaze directed to one side of them. The non-VIP viewer performed poorly in this test, again only when off the central position, but importantly no worse than with a traditional 2D display. For viewer engagement, we found the VIP viewer rated the character rendered in anamorphic perspective higher than the other characters, without interfering with the other viewers' engagement. In a deployed entertainment scenario, the 'VIP' role may switch to another viewer according to viewer tracking and the content of the experience. These results thus motivate further study in the context of an art installation or video conferencing.
The rest of this paper is organized as follows. In the next section, we review related work on fish tank virtual reality, anamorphosis and nonlinear projection, and eye gaze and engagement evaluation. Section 3 describes the system implementation. The experiment is covered in Section 4. Finally, we present a discussion of the results, conclusions and future work.
Section snippets
Fish tank virtual reality
Fish tank virtual reality (FTVR), in which a stereo image of a three-dimensional scene is viewed on a monitor using a perspective projection coupled to the head position of the observer, was originally proposed with a single 2D display (Ware and Franck, 1996). The important finding of the original FTVR studies was a comparison of different visual cues: for a variety of 3D interactions, they found that while head-tracking and stereo cues together were best, head-tracking alone resulted in better
Method
We adapt the dynamic anamorphosis technique (Ravnik et al., 2014) with an offset perspective projection, and propose a nonlinear projection method. The idea is to construct a nonlinear projection of objects in a scene using multiple linear perspectives, particularly a VIP viewer perspective (see Fig. 1(a)) and a normal or orthogonal perspective (see Fig. 1(b)).
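The per-object combination of the two linear perspectives can be sketched as follows. This is a minimal illustration, not the paper's implementation: the data layout and the function name `project_scene` are assumptions, and an actual renderer would select the projection matrix per draw call on the GPU.

```python
import numpy as np

def project_scene(objects, vip_proj, fixed_proj, vip_objects):
    """Project each object's homogeneous vertices with the projection
    selected for it: the VIP viewer's off-axis matrix for objects in
    ``vip_objects``, the fixed normal-perspective matrix otherwise.
    Returns normalized device coordinates per object."""
    ndc = {}
    for name, verts in objects.items():   # verts: (N, 4) homogeneous
        proj = vip_proj if name in vip_objects else fixed_proj
        clip = verts @ proj.T             # to clip space
        ndc[name] = clip[:, :3] / clip[:, 3:4]  # perspective divide
    return ndc
```

The resulting image is a nonlinear projection of the scene as a whole, since different objects pass through different linear projections before being composited into one frame.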
In more detail, the off-axis perspective projection applies a skewed transform from the offset eye view point to the display’s plane (
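One concrete way to build such a skewed transform is the generalized perspective projection, which takes the tracked eye position and the display's corner points. The formulation below is an assumption based on that standard construction (Kooima-style), not the paper's exact code:

```python
import numpy as np

def off_axis_projection(pa, pb, pc, pe, near, far):
    """Off-axis perspective matrix for eye position ``pe`` and a screen
    given by its lower-left ``pa``, lower-right ``pb`` and upper-left
    ``pc`` corners (generalized perspective projection)."""
    pa, pb, pc, pe = (np.asarray(p, float) for p in (pa, pb, pc, pe))
    vr = (pb - pa) / np.linalg.norm(pb - pa)          # screen right axis
    vu = (pc - pa) / np.linalg.norm(pc - pa)          # screen up axis
    vn = np.cross(vr, vu); vn /= np.linalg.norm(vn)   # screen normal
    va, vb, vc = pa - pe, pb - pe, pc - pe            # eye-to-corner vectors
    d = -va @ vn                                      # eye-to-screen distance
    l = (vr @ va) * near / d                          # frustum extents at near
    r = (vr @ vb) * near / d
    b = (vu @ va) * near / d
    t = (vu @ vc) * near / d
    P = np.array([                                    # standard frustum matrix
        [2*near/(r-l), 0, (r+l)/(r-l), 0],
        [0, 2*near/(t-b), (t+b)/(t-b), 0],
        [0, 0, -(far+near)/(far-near), -2*far*near/(far-near)],
        [0, 0, -1, 0]])
    M = np.eye(4); M[:3, :3] = np.stack([vr, vu, vn]) # rotate into screen frame
    T = np.eye(4); T[:3, 3] = -pe                     # move eye to origin
    return P @ M @ T
```

For an eye centered in front of the screen this reduces to a symmetric frustum; as the tracked eye moves off-axis, the frustum skews so that the anamorphic image remains undistorted from the VIP viewer's viewpoint.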
Experiment
The purpose of the study was to show that our implementation can provide one VIP viewer greater eye contact and higher engagement, without interfering with the other viewers’ experience. We concentrated our investigation on three major aspects in the communication of attention: object-focused gaze, mutual gaze and user engagement. For each trial, while characters fixed their gaze, the participant was asked: Which object is being looked at? Are you being looked at? Or asked to rate engagement
Lesson learned & design recommendations
The most important lesson from this experiment is that the VIP guest obtains improved gaze and engagement estimation for one selected character, without sacrificing the other guests' viewing experience. Our method could be used for creating interactive experiences on a conventional 2D display, for example Turtle Talk with Crush, an interactive show at Disney California Adventure® Park where guests chat live with Crush the sea turtle. We could render the turtle (the
Conclusion
We have presented the design, implementation and evaluation of our adaptive dynamic anamorphosis technique. The contributions of this paper are twofold. First, we extend the idea of dynamic anamorphosis to multiple viewers simultaneously, rendered using nonlinear projection as a selective combination of offset-perspective anamorphic projection and normal perspective projection (Section 3). To drive the anamorphic projection we need to know the position of the VIP viewer's eyes, so that if the
CRediT authorship contribution statement
Ye Pan: Conceptualization, Methodology, Software, Formal analysis, Validation, Writing - original draft. Kenny Mitchell: Supervision, Conceptualization, Methodology, Software, Resources, Funding acquisition, Project administration, Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no conflict of interest.
References (41)
- Ens et al., 2019. Revisiting collaboration through mixed reality: the evolution of groupware. Int. J. Hum. Comput. Stud.
- A practical approach to measuring user engagement with the refined user engagement scale (UES) and new UES short form. Int. J. Hum. Comput. Stud., 2018.
- Effects of 3D perspective on head gaze estimation with a multiview autostereoscopic display. Int. J. Hum. Comput. Stud., 2016.
- Geometrical basis of perception of gaze direction. Vis. Res., 2006.
- Al Moubayed et al., 2012. Taming Mona Lisa: communicating gaze faithfully in 2D and 3D facial projections. TiiS.
- The perception of where a face or television 'portrait' is looking. Am. J. Psychol., 1969.
- Arthur et al., 1993. Evaluating 3D task performance for fish tank virtual worlds. ACM Trans. Inf. Syst. (TOIS).
- Feature-based image metamorphosis. ACM SIGGRAPH Computer Graphics, 1992.
- Ryan: rendering your animation nonlinearly projected. Proceedings of the 3rd International Symposium on Non-Photorealistic Animation and Rendering, 2004.
- Surround-screen projection-based virtual reality: the design and implementation of the CAVE. Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, 1993.
- High resolution virtual reality. SIGGRAPH Comput. Graph.
- Fathi et al., 2012. Social interactions: a first-person perspective. 2012 IEEE Conference on Computer Vision and Pattern Recognition.
- Using the user's point of view for interaction on mobile devices. Proceedings of the 23rd Conference on Interaction Homme-Machine.
- Perception of another person's looking behavior. Am. J. Psychol.
- Definitions of engagement in human-agent interaction. International Workshop on Engagement in Human Computer Interaction (ENHANCE).
- Telehuman2: a cylindrical light field teleconferencing system for life-size 3D human telepresence. SIGCHI.
- Hancock et al., 2009. The effects of changing projection geometry on the interpretation of 3D orientation on tabletops. Proceedings of the ACM International Conference on Interactive Tabletops and Surfaces.
- Jones et al., 2009. Achieving eye contact in a one-to-many 3D video teleconferencing system. TOG.
- Telehuman: effects of 3D perspective on gaze and pose estimation with a life-size cylindrical telepresence pod. SIGCHI.
- Generalized perspective projection. J. Sch. Electron. Eng. Comput. Sci.