Improving VIP viewer gaze estimation and engagement using adaptive dynamic anamorphosis

https://doi.org/10.1016/j.ijhcs.2020.102563

Highlights

  • We propose a dynamic adaptive anamorphosis method based on non-linear projections.

  • Anamorphic rendering of a selected object is combined with normal view rendering of the rest.

  • The VIP guest benefits from improved gaze and engagement estimation.

  • This is performed without sacrificing the other guests’ viewing experience.

  • We discuss different viewpoints and the spatial relationship between objects.

Abstract

Anamorphosis for 2D displays can provide viewer-centric perspective viewing, enabling 3D appearance, eye contact and engagement, by adapting dynamically in real time to a single moving viewer’s viewpoint, but at the cost of distorted viewing for other viewers. We present a method for constructing non-linear projections that combines anamorphic rendering of selected objects with normal perspective rendering of the rest of the scene. Our study defines a scene consisting of five characters, with one of these characters selectively rendered in anamorphic perspective. We conducted an evaluation experiment and demonstrate that the tracked viewer-centric imagery for the selected character results in improved gaze and engagement estimation. Critically, this is achieved without sacrificing the other viewers’ viewing experience. In addition, we present findings on the perception of gaze direction for regularly viewed characters located off-center to the origin, where perceived gaze shifts increasingly from aligned to misaligned as the distance between viewer and character increases. Finally, we discuss different viewpoints and the spatial relationship between objects.

Introduction

Gaze has been shown to matter for a number of aspects of social interaction, such as deictic referencing and viewer engagement (Ens, Lanir, Tang, Bateman, Lee, Piumsomboon, Billinghurst, 2019; Roberts, Rae, Duckworth, Moore, Aspin, 2013). When presenting virtual agents or characters, modelling and delivering gaze targets accurately is crucial. From entertainment and art to advertisement and information visualization, 2D displays, such as flat monitors, are ubiquitous. However, they impose severe limitations on the accurate preservation of gaze direction. Notably, 2D displays are associated with several powerful effects and illusions, the most important of which is the Mona Lisa effect, where the gaze of the projected head appears to follow the viewer regardless of viewpoint (Al Moubayed, Edlund, Beskow, 2012; Hancock, Nacenta, Gutwin, Carpendale, 2009).

As reviewed in the next section, a variety of light field, 3D display hardware, or virtual reality systems can eliminate the Mona Lisa effect and provide accurate gaze direction, but either at greater hardware cost (Jones et al., 2009) or with support limited to a single user (Lee, 2007). In this paper, we adapt and extend the previous dynamic anamorphosis approach (Arthur, Booth, Ware, 1993; Ravnik, Batagelj, Kverh, Solina, 2014), which adapts itself to the changing position of the observer so that wherever the observer moves, they see the same undeformed image. We propose a dynamic adaptive anamorphosis method in which selected objects are rendered from the perspective of a selected viewer (the VIP viewer), while the rest of the scene is rendered with a fixed-direction normal perspective. The VIP viewer can be selected for the anamorphic experience using either face recognition or some other distinguishable tracked feature used for identification (e.g., Fathi et al., 2012). The method can better preserve nonverbal cues, including gaze, and we show how it improves the viewer experience. Other viewers in the room can also engage with the displayed scene, since only the selected objects appear slightly deformed from their viewpoint. Given evidence from human perception that the adjustment to oblique viewing is achieved before the contents of the image are interpreted (Man and Vision, 1982), we propose that subtle variations in perspective do not interfere significantly with the engagement of non-selected viewers, while improving the VIP viewer’s experience.

We evaluated the effectiveness of our method by measuring the ability of viewers to accurately judge which target each character is gazing at, the ability to discriminate whether each character is looking directly into their eyes or not, and the level of engagement whilst watching characters performing movements. Results revealed that the accuracy of the estimation of object-focused gaze varies across audience members (VIP viewer and non-VIP viewer) and viewer positions. It further varies across character locations. The clear trend is that estimates for the VIP viewer are always superior. There is little obvious difference between the experience for VIP and non-VIP viewers at the central position. We also examined characters placed at off-center locations on the screen, a case not explicitly addressed by previous work. The farther a character was from the viewer, the larger the discrepancy in estimation. For mutual gaze, we found that the VIP viewer could always distinguish between being looked at and gaze directed to one side of them. Non-VIP viewers performed poorly in this test, again only when they were away from the central position, but importantly no worse than with a traditional 2D display. For viewer engagement, we found that the VIP viewer rated the character rendered in anamorphic perspective higher than the rest of the characters, and that this did not interfere with the other viewers’ engagement. In a deployed entertainment scenario, the ‘VIP’ role may switch to another viewer according to the tracking of viewers and the content of the experience. These results motivate further study in the context of an art installation or video conferencing.

The rest of this paper is organized as follows. In the next section, we review related work in the areas of fish tank virtual reality, anamorphosis and non-linear projection, and eye gaze and engagement evaluation. Section 3 describes the system implementation. The experiment is covered in Section 4. Finally, we present a discussion of the results, conclusions and future work.

Section snippets

Fish tank virtual reality

Fish tank virtual reality (FTVR), where a stereo image of a three-dimensional scene is viewed on a monitor using a perspective projection coupled to the head position of the observer, was originally proposed with a single 2D display (Ware and Franck, 1996). The important finding of the original FTVR studies was a comparison of different visual cues. For a variety of 3D interactions, they found that while head-tracking and stereo cues together were best, head-tracking alone resulted in better

Method

We adapt the dynamic anamorphosis technique (Ravnik et al., 2014) with offset perspective projection, and propose a nonlinear projection method. The idea is to construct a nonlinear projection of objects in a scene from multiple linear perspectives: specifically, a VIP viewer perspective (see Fig. 1(a)) and a normal or orthogonal perspective (see Fig. 1(b)).

In more detail, the off-axis perspective projection applies a skewed transform from the offset eye view point to the display’s plane (
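The following is a minimal sketch of how an off-axis frustum might be computed per object, not the authors' implementation, which the text above does not show. It assumes the generalized perspective projection formulation of Kooima (2009), listed in the references, and a planar screen described by three corners. All function and parameter names (`off_axis_frustum`, `frustum_for_object`, `vip_eye`, `fixed_eye`) are hypothetical and chosen for illustration only.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    length = math.sqrt(dot(a, a))
    return tuple(x / length for x in a)

def off_axis_frustum(pa, pb, pc, eye, near):
    """Return (left, right, bottom, top) extents at the near plane.

    pa, pb, pc are the lower-left, lower-right and upper-left screen
    corners; eye is the tracked eye position, all in the same units.
    """
    vr = normalize(sub(pb, pa))      # screen right axis
    vu = normalize(sub(pc, pa))      # screen up axis
    vn = normalize(cross(vr, vu))    # screen normal, pointing at the viewer
    va, vb, vc = sub(pa, eye), sub(pb, eye), sub(pc, eye)
    dist = -dot(va, vn)              # eye-to-screen distance
    if dist <= 0:
        raise ValueError("eye must be in front of the screen")
    scale = near / dist
    return (dot(vr, va) * scale, dot(vr, vb) * scale,
            dot(vu, va) * scale, dot(vu, vc) * scale)

def frustum_for_object(is_anamorphic, vip_eye, fixed_eye, screen, near):
    """Assumed selection rule: anamorphic objects use the VIP viewer's
    eye position, all other objects use a fixed, centered viewpoint."""
    eye = vip_eye if is_anamorphic else fixed_eye
    return off_axis_frustum(*screen, eye, near)

# Hypothetical 2 x 2 screen centered at the origin in the z = 0 plane.
screen = ((-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (-1.0, 1.0, 0.0))
centered = frustum_for_object(False, (1.0, 0.0, 2.0), (0.0, 0.0, 2.0), screen, 1.0)
skewed = frustum_for_object(True, (1.0, 0.0, 2.0), (0.0, 0.0, 2.0), screen, 1.0)
print(centered)  # symmetric frustum for normally rendered objects
print(skewed)    # asymmetric frustum for the anamorphic object
```

In this sketch the fixed viewpoint yields a symmetric frustum, while a VIP eye shifted to the right yields an asymmetric one, which corresponds to the skew described above. A full renderer would additionally build the projection matrix from these extents and translate the scene by the eye position.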

Experiment

The purpose of the study was to show that our implementation can provide one VIP viewer greater eye contact and higher engagement, without interfering with the other viewers’ experience. We concentrated our investigation on three major aspects in the communication of attention: object-focused gaze, mutual gaze and user engagement. For each trial, while characters fixed their gaze, the participant was asked: Which object is being looked at? Are you being looked at? Or asked to rate engagement

Lesson learned & design recommendations

The most important lesson from this experiment is that the VIP guest benefits from improved gaze and engagement estimation for one selected character. This is achieved without sacrificing the other guests’ viewing experience. Our method could be used for creating interactive experiences on a conventional 2D display. One example is Turtle Talk with Crush, an interactive show at Disney California Adventure® Park where guests can chat live with Crush the sea turtle. We could render the turtle (the

Conclusion

We have presented the design, implementation and evaluation of our adaptive dynamic anamorphosis technique. The contributions of this paper are twofold. First, we extend the idea of dynamic anamorphosis for multiple viewers simultaneously, rendered using non-linear projection as a selective combination of offset perspective anamorphic projection and normal perspective projection (Section 3). To drive anamorphic projection we need to know the position of the VIP viewer’s eyes, so that if the

CRediT authorship contribution statement

Ye Pan: Conceptualization, Methodology, Software, Formal analysis, Validation, Writing - original draft. Kenny Mitchell: Supervision, Conceptualization, Methodology, Software, Resources, Funding acquisition, Project administration, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no conflict of interest.

References (41)

  • M. Deering

    High resolution virtual reality

    SIGGRAPH Comput. Graph.

    (1992)
  • A. Fathi et al.

    Social interactions: a first-person perspective

    2012 IEEE Conference on Computer Vision and Pattern Recognition

    (2012)
  • J. Francone et al.

    Using the users point of view for interaction on mobile devices

    Proceedings of the 23rd Conference on l’Interaction Homme-Machine

    (2011)
  • J.J. Gibson et al.

    Perception of another person’s looking behavior

    Am. J. Psychol.

    (1963)
  • N. Glas et al.

    Definitions of engagement in human-agent interaction

    International Workshop on Engagement in Human Computer Interaction (ENHANCE)

    (2015)
  • D. Gotsch et al.

    Telehuman2: a cylindrical light field teleconferencing system for life-size 3D human telepresence

    SIGCHI

    (2018)
  • M. Hancock et al.

    The effects of changing projection geometry on the interpretation of 3D orientation on tabletops

    Proceedings of the ACM International Conference on Interactive Tabletops and Surfaces

    (2009)
  • A. Jones et al.

    Achieving eye contact in a one-to-many 3D video teleconferencing system

    TOG

    (2009)
  • K. Kim et al.

    Telehuman: effects of 3D perspective on gaze and pose estimation with a life-size cylindrical telepresence pod

    SIGCHI

    (2012)
  • R. Kooima

    Generalized perspective projection

    J. Sch. Electron. Eng. Comput. Sci

    (2009)