Abstract
We present a system that converts annotated broadcast video of tennis matches into interactively controllable video sprites that behave and appear like professional tennis players. Our approach is based on controllable video textures and utilizes domain knowledge of the cyclic structure of tennis rallies to place clip transitions and accept control inputs at key decision-making moments of point play. Most importantly, we use points from the video collection to model a player’s court positioning and shot selection decisions during points. We use these behavioral models to select video clips that reflect actions the real-life player is likely to take in a given match-play situation, yielding sprites that behave realistically at the macro level of full points, not just individual tennis motions. Our system can generate novel points between professional tennis players that resemble Wimbledon broadcasts, enabling new experiences, such as the creation of matchups between players that have not competed in real life or interactive control of players in the Wimbledon final. According to expert tennis players, the rallies generated using our approach are significantly more realistic in terms of player behavior than video sprite methods that only consider the quality of motion transitions during video synthesis.
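The behavioral clip selection described above can be sketched in miniature: estimate a density over the player's historically observed court positions, then pick the candidate video clip whose resulting action is both likely under that model and cheap to transition to visually. This is a minimal illustrative sketch, not the paper's implementation; the clip schema (`id`, `end_pos`, `transition_cost`), the sample coordinates, and the fixed-bandwidth Gaussian KDE are all assumptions made for the example.

```python
import math

# Hypothetical data: (x, y) recovery positions (in meters, court coordinates)
# observed for one player across annotated broadcast points.
observed_positions = [(0.2, -11.5), (0.5, -11.8), (-1.0, -11.2),
                      (2.5, -11.9), (0.0, -12.1)]

def kde_score(pos, samples, bandwidth=1.0):
    """Gaussian kernel density estimate of how likely the real player
    is to end up at `pos`, given positions seen in the video collection."""
    x, y = pos
    total = 0.0
    for sx, sy in samples:
        d2 = (x - sx) ** 2 + (y - sy) ** 2
        total += math.exp(-d2 / (2 * bandwidth ** 2))
    return total / (len(samples) * 2 * math.pi * bandwidth ** 2)

def select_clip(candidate_clips, samples, w_behavior=1.0, w_transition=1.0):
    """Score each candidate clip by (a) how consistent its end position is
    with the behavioral model and (b) its visual transition cost, and
    return the highest-scoring clip. Clips use a hypothetical schema:
    {'id', 'end_pos', 'transition_cost'}."""
    def score(clip):
        behavior = math.log(kde_score(clip['end_pos'], samples) + 1e-9)
        return w_behavior * behavior - w_transition * clip['transition_cost']
    return max(candidate_clips, key=score)

clips = [
    {'id': 'backhand_cross', 'end_pos': (0.3, -11.6), 'transition_cost': 0.2},
    {'id': 'forehand_line',  'end_pos': (6.0, -9.0),  'transition_cost': 0.1},
]
best = select_clip(clips, observed_positions)
print(best['id'])  # the clip consistent with the player's typical positioning
```

In the full system, the behavioral term would come from learned models of positioning and shot selection rather than a fixed-bandwidth KDE, and the transition term from motion-continuity costs as in video-texture methods; the point here is only the structure of the trade-off.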
Supplementary material and video are available on our project website: https://cs.stanford.edu/~haotianz/research/vid2player/.
Supplemental Material
Supplemental movie, appendix, image, and software files for "Vid2Player: Controllable Video Sprites That Behave and Appear Like Professional Tennis Players" are available for download.