
Deep soccer analytics: learning an action-value function for evaluating soccer players

Published in: Data Mining and Knowledge Discovery

Abstract

Given the large pitch, numerous players, limited player turnovers, and sparse scoring, soccer is arguably the most challenging of all the major team sports to analyze. In this work, we develop a new approach to evaluating all types of soccer actions from play-by-play event data. Our approach utilizes a Deep Reinforcement Learning (DRL) model to learn an action-value Q-function. To our knowledge, this is the first action-value function based on DRL methods for a comprehensive set of soccer actions. Our neural architecture fits continuous game-context signals and sequential features within a play using two stacked LSTM towers, one for the home team and one for the away team. To validate the model, we illustrate both temporal and spatial projections of the learned Q-function and conduct a calibration experiment to study the data fit under different game contexts. Our novel soccer Goal Impact Metric (GIM) applies values from the learned Q-function to measure a player’s overall performance as the aggregate impact value of his actions over all the games in a season. To interpret the impact values, a mimic regression tree is built to find the game features that influence the values most. As an application of GIM, we conduct a case study ranking players in the English Football League Championship. Empirical evaluation indicates that GIM is a temporally stable metric and that its correlations with standard measures of soccer success are higher than those of other state-of-the-art soccer metrics.



Notes

  1. https://www.optasports.com/

  2. https://www.skysports.com/football/news/11688/11361634/

  3. https://www.bbc.com/sport/football/43641225

  4. The classifier is implemented with a neural network rather than the CatBoost model of Decroos et al. (2019) due to the size of the dataset. We discuss our VAEP implementation further in the limitations (Sect. 10.2).

  5. In Figs. 8 and 9, we omit players from teams that played fewer than 40 games in the 2017–2018 season.

References

  • Albert J, Glickman ME, Swartz TB, Koning RH (2017) Handbook of Statistical Methods and Analyses in Sports. CRC Press, Boca Raton

  • Ali A (2011) Measuring soccer skill performance: a review. Scand J Med Sci Sports 21(2):170–183

  • Ba J, Caruana R (2014) Do deep nets really need to be deep? In: Advances in Neural Information Processing Systems, pp 2654–2662

  • Bornn L, Cervone D, Fernandez J (2018) Soccer analytics: unravelling the complexity of “the beautiful game”. Significance 15(3):26–29

  • Bransen L, Van Haaren J (2018) Measuring football players’ on-the-ball contributions from passes during games. In: Machine Learning and Data Mining for Sports Analytics, Proceedings of the 5th International Workshop. Springer, pp 3–15

  • Brooks J, Kerr M, Guttag J (2016) Developing a data-driven player ranking in soccer using predictive model weights. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, pp 49–55

  • Cervone D, D’Amour A, Bornn L, Goldsberry K (2014) Pointwise: predicting points and valuing decisions in real time with NBA optical tracking data. In: Proceedings of the 8th Annual MIT Sloan Sports Analytics Conference, vol 28

  • Cervone D, D’Amour A, Bornn L, Goldsberry K (2016) A multiresolution stochastic process model for predicting basketball possession outcomes. J Am Stat Assoc 111(514):585–599

  • Decroos T, Bransen L, Haaren JV, Davis J (2019) Actions speak louder than goals: valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, Anchorage, AK, USA, August 4–8, 2019, pp 1851–1861

  • Dick U, Brefeld U (2019) Learning to rate player positioning in soccer. Big Data 7(1):71–82

  • Fernández J, Bornn L, Cervone D (2019) Decomposing the immeasurable sport: a deep learning expected possession value framework for soccer. In: Proceedings MIT Sloan Sports Analytics Conference

  • Gudmundsson J, Horton M (2017) Spatio-Temporal Analysis of Team Sports. ACM Comput Surv 50(2):22:1–22:34. https://doi.org/10.1145/3054132

  • Hausknecht MJ, Stone P (2015) Deep recurrent Q-learning for partially observable MDPs. In: Proceedings of the 2015 AAAI Fall Symposia, Arlington, Virginia, USA, November 12–14, 2015, pp 29–37. CoRR. arXiv:1507.06527

  • Kharrat T, McHale IG, Peña JL (2019) Plus-minus player ratings for soccer. Eur J Oper Res 283:726–736

  • Liu G, Schulte O (2018) Deep reinforcement learning in ice hockey for context-aware player evaluation. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI-18, ijcai.org, pp 3442–3448

  • Liu G, Zhu W, Schulte O (2018) Interpreting deep sports analytics: Valuing actions and players in the NHL. In: International workshop on machine learning and data mining for sports analytics. Springer, pp 69–81

  • Macdonald B (2011) A regression-based adjusted plus-minus statistic for NHL players. J Quant Anal Sports 7(3):29

  • McCallum A (1996) Learning to use selective attention and short-term memory in sequential tasks. In: From animals to animats 4: proceedings of the fourth international conference on simulation of adaptive behavior, vol 4. MIT Press, p 315

  • McHale IG, Scarf PA, Folker DE (2012) On the development of a soccer player performance rating system for the English Premier League. Interfaces 42(4):339–351

  • Mnih V, Kavukcuoglu K, Silver D et al (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529–533

  • Ng AY, Harada D, Russell S (1999) Policy invariance under reward transformations: theory and application to reward shaping. In: Proceedings of the 16th International Conference on Machine Learning (ICML 1999), Bled, Slovenia, pp 278–287

  • Puterman ML, Patrick J (2017) Dynamic programming. In: Encyclopedia of machine learning and data mining, pp 377–388

  • Routley K (2015) A Markov game model for valuing player actions in ice hockey. Master’s thesis, Simon Fraser University

  • Routley K, Schulte O (2015) A Markov game model for valuing player actions in ice hockey. In: Proceedings of the International Conference on Uncertainty in Artificial Intelligence (UAI), pp 782–791

  • Schulte O, Khademi M, Gholami S, Zhao Z, Javan M, Desaulniers P (2017a) A Markov game model for valuing actions, locations, and team performance in ice hockey. Data Mining and Knowledge Discovery, pp 1–23

  • Schulte O, Zhao Z, Javan M, Desaulniers P (2017b) Apples-to-apples: clustering and ranking NHL players using location information and scoring impact. In: Proceedings MIT Sloan Sports Analytics Conference

  • Schultze SR, Wellbrock CM (2018) A weighted plus/minus metric for individual soccer player performance. J Sports Anal 4(2):121–131

  • Schumaker RP, Solieman OK, Chen H (2010) Research in sports statistics. Sports Data Mining, Integrated Series in Information Systems, vol 26. Springer, US, pp 29–44

  • Song Y, Xu M, Zhang S, Huo L (2017) Generalization tower network: a novel deep neural network architecture for multi-task learning. arXiv preprint arXiv:1710.10036

  • Sutton RS, Barto AG (2018) Reinforcement learning: an introduction. MIT Press, Cambridge

  • Swartz TB, Arce A (2014) New insights involving the home team advantage. Int J Sports Sci Coach 9(4):681–692

  • Tsitsiklis JN, Van Roy B (1997) Analysis of temporal-difference learning with function approximation. In: Advances in Neural Information Processing Systems, pp 1075–1081

  • Van Haaren J, Van den Broeck G, Meert W, Davis J (2016) Lifted generative learning of Markov logic networks. Mach Learn 103(1):27–55

  • Van Roy M, Robberechts P, Decroos T, Davis J (2020) Valuing on-the-ball actions in soccer: a critical comparison of xT and VAEP. In: AAAI-20 Workshop on Team Sports

Acknowledgements

This work was supported by a Strategic Project Grant from the Natural Sciences and Engineering Research Council of Canada and a GPU donation from NVIDIA Corporation. We are indebted to Norm Ferns, Evin Keane, and Bahar Pourbabee from Sportlogiq for helpful discussions and comments.

Author information

Corresponding author

Correspondence to Yudong Luo.

Additional information

Responsible editor: Ira Assent, Carlotta Domeniconi, Aristides Gionis, Eyke Hüllermeier.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 622 KB)

Proof of Proposition 1

The data records transitions from one state-action-player triple to another, each possibly resulting in a non-zero reward (a score or point in the context of sports). We denote the number of times such a transition occurs as

$$\begin{aligned} n_{D}[s,a,{ pl},s',a',{ pl}'] \end{aligned}$$

where the \('\) indicates the successor triple. We freely use this notation for marginal counts as well, for instance

$$\begin{aligned} n_{D}[s',a',{ pl}'] = \sum _{s,a,{ pl}}n_{D}[s,a,{ pl},s',a',{ pl}'] \end{aligned}$$
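These counts are straightforward to compute from event data. A minimal sketch (hypothetical identifiers and values, not the paper's code) that stores transition counts in a `Counter` and marginalizes over predecessor triples:

```python
from collections import Counter

# n_D[s, a, pl, s', a', pl']: transition counts over an event dataset D,
# keyed by (state, action, player, successor state, successor action, successor player).
transitions = Counter()
transitions[("s0", "pass",  "p1", "s1", "shot", "p2")] += 3
transitions[("s0", "carry", "p3", "s1", "shot", "p2")] += 2

def marginal_successor_count(transitions, succ):
    """n_D[s', a', pl'] = sum over (s, a, pl) of n_D[s, a, pl, s', a', pl']."""
    return sum(n for (s, a, pl, s2, a2, pl2), n in transitions.items()
               if (s2, a2, pl2) == succ)

print(marginal_successor_count(transitions, ("s1", "shot", "p2")))  # 5
```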

From the paper, we have the following equations for the Q-value-above-replacement (QAAR) and GIM metrics:

$$\begin{aligned} { QAAR}^{i}(D)&= \sum _{s,a} n_{D}[s,a,{ pl}' = i] \big ( \mathbb {E}_{s',a'}[Q_{{ team}}(s',a')|s,a,{ pl}'=i] \nonumber \\&\quad - \mathbb {E}_{s',a'}[Q_{{ team}}(s',a')|s,a] \big ) \end{aligned}$$
(7)
$$\begin{aligned} { GIM}^{i}(D)&= \sum _{s,a,s',a'}n_{D}[s,a,s',a',{ pl}'=i] \cdot \Big [Q_{{ team}}(s',a') \nonumber \\&\quad - \mathbb {E}_{s'_{E},a'_{E}}[Q_{{ team}}(s'_{E},a'_{E})|s,a]\Big ] \end{aligned}$$
(8)

Now we have

$$\begin{aligned} { GIM}^{i}(D){\mathop {=}\limits ^{Eq.\,8}}&\sum _{s,a} \sum _{s',a'} n_{D}[s,a,s',a',{ pl}' = i] \Big (Q_{{ team}}(s',a')- \mathbb {E}_{s'_{E},a'_{E}}[Q_{{ team}}(s'_{E},a'_{E})|s,a]\Big ) \nonumber \\ =&\sum _{s,a} n_{D}[s,a,{ pl}' = i] \sum _{s',a'} \frac{n_{D}[s,a,s',a',{ pl}' = i]}{n_{D}[s,a,{ pl}' = i]} Q_{{ team}}(s',a') \nonumber \\&- \sum _{s,a} n_{D}[s,a,{ pl}' = i]\, \mathbb {E}_{s'_{E},a'_{E}}[Q_{{ team}}(s'_{E},a'_{E})|s,a] \end{aligned}$$
(9)
$$\begin{aligned} =&\sum _{s,a} n_{D}[s,a,{ pl}' = i]\, \mathbb {E}_{s',a'}[Q_{{ team}}(s',a')|s,a,{ pl}'=i] \end{aligned}$$
(10)
$$\begin{aligned}&- \sum _{s,a} n_{D}[s,a,{ pl}' = i]\, \mathbb {E}_{s'_{E},a'_{E}}[Q_{{ team}}(s'_{E},a'_{E})|s,a] \nonumber \\ =&\sum _{s,a} n_{D}[s,a,{ pl}' = i] \big ( \mathbb {E}_{s',a'}[Q_{{ team}}(s',a')|s,a,{ pl}'=i] \nonumber \\&\quad - \mathbb {E}_{s',a'}[Q_{{ team}}(s',a')|s,a] \big ) \nonumber \\ {\mathop {=}\limits ^{Eq.\,7}}&{ QAAR}^{i}(D) \end{aligned}$$
(11)

Step (9) holds because the expectation \(\mathbb {E}_{s'_{E},a'_{E}}[Q_{{ team}}(s'_{E},a'_{E})|s,a]\) depends only on \(s,a\), not on \(s',a'\), so it can be pulled out of the inner sum. Line (10) uses the empirical estimate of the expected Q-value \(\mathbb {E}_{s',a'}[Q_{{ team}}(s',a')|s,a,{ pl}'=i]\) given that player i acts next, computed from the maximum likelihood estimates of the transition probabilities:

$$\begin{aligned} {\hat{\sigma }}(s',a'|s,a,{ pl}' = i) = \frac{n_{D}[s,a,s',a',{ pl}' = i]}{n_{D}[s,a,{ pl}' = i]} \end{aligned}$$

The final conclusion (11) applies Equation (7).
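The identity can also be checked numerically. Below is a toy sketch (synthetic counts and Q-values, all hypothetical; not the paper's code) that computes both sides from the same transition counts, using the maximum likelihood transition estimates above:

```python
from collections import defaultdict

# Synthetic counts n_D[s, a, s', a', pl'] and hypothetical Q-values Q_team(s', a').
counts = {
    ("s0", "pass",  "s1", "shot",  "p1"): 3,
    ("s0", "pass",  "s2", "clear", "p1"): 1,
    ("s0", "pass",  "s1", "shot",  "p2"): 2,
    ("s3", "carry", "s1", "shot",  "p1"): 4,
}
q = {("s1", "shot"): 0.30, ("s2", "clear"): 0.05}

def expected_q(s, a, player=None):
    """MLE estimate of E[Q_team(s', a') | s, a] (optionally also given pl' = player)."""
    total, acc = 0, 0.0
    for (s0, a0, s1, a1, pl), n in counts.items():
        if (s0, a0) == (s, a) and (player is None or pl == player):
            total += n
            acc += n * q[(s1, a1)]
    return acc / total

def gim(player):
    """Eq. 8: sum of n * (Q(s', a') - E[Q | s, a]) over the player's transitions."""
    return sum(n * (q[(s1, a1)] - expected_q(s0, a0))
               for (s0, a0, s1, a1, pl), n in counts.items() if pl == player)

def qaar(player):
    """Eq. 7: sum of n_D[s, a, pl'=i] * (E[Q | s, a, pl'=i] - E[Q | s, a])."""
    n_sa = defaultdict(int)
    for (s0, a0, s1, a1, pl), n in counts.items():
        if pl == player:
            n_sa[(s0, a0)] += n
    return sum(n * (expected_q(s, a, player) - expected_q(s, a))
               for (s, a), n in n_sa.items())

for p in ("p1", "p2"):
    assert abs(gim(p) - qaar(p)) < 1e-9  # GIM == QAAR, as Proposition 1 states
```

Equality holds exactly (up to floating point) because both sides are built from the same empirical transition frequencies, mirroring steps (9)–(11).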

About this article

Cite this article

Liu, G., Luo, Y., Schulte, O. et al. Deep soccer analytics: learning an action-value function for evaluating soccer players. Data Min Knowl Disc 34, 1531–1559 (2020). https://doi.org/10.1007/s10618-020-00705-9
