Abstract
Spatial point processes have been used successfully to model the relative efficiency of shot locations for individual players in professional basketball. Those analyses were possible because each player makes enough baskets to fit a point process model reliably. Goals in hockey are too rare for a point process to be fit to each player’s goal locations, so novel techniques are needed to obtain measures of shot efficiency for each player. A Log-Gaussian Cox Process (LGCP) is used to model all shot locations, including goals, of each NHL player who took at least 500 shots during the 2011–2018 seasons. Each player’s LGCP surface is treated as an image, and these images are passed to an unsupervised statistical learning algorithm that decomposes them into linear combinations of spatial basis functions. The coefficients of these basis functions are shown to be useful for comparing players. To incorporate goals, the locations of all shots that resulted in a goal are treated as the shots of a “perfect player” and passed through the same algorithm (goals are further split into perfect forwards, perfect centres and perfect defence). These perfect players are compared with other players as a measure of shot efficiency. The analysis provides a map of common shooting locations, identifies regions with the most goals relative to the number of shots, and demonstrates how each player’s shot locations differ from scoring locations.
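The decomposition step can be illustrated with a minimal sketch. The abstract does not name the algorithm, so the sketch assumes a non-negative matrix factorization fitted with the multiplicative updates of Lee and Seung. The grid, the two Gaussian "bumps", the 30 players and the helper names `nmf` and `project` are all hypothetical, and the surfaces are synthetic stand-ins for fitted LGCP intensities, not NHL data.

```python
import numpy as np


def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (players x grid cells) by W @ H.

    Rows of H act as spatial basis functions; rows of W hold each
    player's coefficients. Uses multiplicative updates for squared error.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def project(v, H, n_iter=500, eps=1e-9):
    """Non-negative coefficients of one new surface v on fixed bases H."""
    w = np.ones(H.shape[0])
    for _ in range(n_iter):
        w *= (H @ v) / (H @ H.T @ w + eps)
    return w


# Synthetic 10 x 20 grid with two smooth bumps standing in for shooting regions.
xs, ys = np.meshgrid(np.linspace(0, 1, 20), np.linspace(0, 1, 10))


def bump(cx, cy, s=0.15):
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * s**2)).ravel()


bases_true = np.vstack([bump(0.25, 0.5), bump(0.75, 0.5)])

# 30 hypothetical players, each a random non-negative mix of the two bumps.
mix = np.random.default_rng(1).random((30, 2))
V = mix @ bases_true

W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# A hypothetical "perfect player" surface, expressed on the same learned bases.
perfect = np.array([0.2, 0.8]) @ bases_true
w_perfect = project(perfect, H)

# Distance in coefficient space between each player and the perfect player.
dist = np.linalg.norm(W - w_perfect, axis=1)
closest = int(np.argmin(dist))
```

Comparing rows of `W` with `w_perfect` is one simple way to read "how far is this player's shot pattern from where goals are scored". The Euclidean distance used here is an illustrative choice, not necessarily the comparison used in the paper.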
Acknowledgment
We acknowledge the support of the Natural Sciences and Engineering Research Council of Canada (NSERC) [funding reference numbers RGPIN-2015-04221 and RGPIN-2014-06187]. Additional support was provided by a CANSSI Collaborative Research Team grant. We would also like to thank Michael Schuckers and Nathan Sandholtz for helpful conversations regarding this work.
Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Research funding: This study was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) [funding reference numbers RGPIN-2015-04221 and RGPIN-2014-06187] and a CANSSI Collaborative Research Team grant.
Conflict of interest statement: The authors declare no conflicts of interest regarding this article.
© 2020 Walter de Gruyter GmbH, Berlin/Boston