Published by De Gruyter May 29, 2020

Unsupervised methods for identifying pass coverage among defensive backs with NFL player tracking data

  • Rishav Dutta, Ronald Yurko and Samuel L. Ventura

Abstract

Statistical analysis of defensive players in football has lagged behind that of offensive players, special teams, and coaching decisions, largely because data on individual defensive players has historically been lacking. With the introduction of player tracking data from the NFL, researchers can now tackle these problems. However, event and strategy annotations in the NFL’s tracking data are limited, especially when it comes to describing what defensive players do on each play. Moreover, methods for creating these annotations typically require extensive human labeling, which is difficult and expensive. Because of the importance of the passing game and the limited prior research on the defensive side of passing, we provide annotations for the pass coverage types of cornerbacks using unsupervised learning techniques, which require no training data. We define a set of features from the tracking data that distinguish between “man” and “zone” coverage. We use mixture models to create clusters corresponding to each group, allowing us to provide probabilistic assignments to each coverage type (or cluster). Additionally, we quantify each feature’s influence in distinguishing defensive pass coverage types. Our work makes possible several potential avenues of future NFL research into defensive backs and pass coverage strategies.
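The mixture-model idea described in the abstract can be illustrated with a minimal sketch. The code below fits a two-component, one-dimensional Gaussian mixture by expectation-maximization and returns a probability of membership in each cluster for every observation. It is not the authors' implementation: the single feature, the synthetic data, the initialization, and all function names are assumptions made purely for illustration, and the paper's actual feature set is multivariate.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def e_step(xs, weights, means, variances):
    """Posterior probability of each component for every observation."""
    resp = []
    for x in xs:
        dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(dens)
        resp.append([d / total for d in dens])
    return resp


def fit_two_component_gmm(xs, n_iter=200):
    """Fit a 1-D, two-component Gaussian mixture with EM.

    Returns (weights, means, variances, responsibilities), where
    responsibilities[i][k] is the probability that xs[i] belongs to cluster k.
    """
    n = len(xs)
    grand_mean = sum(xs) / n
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / n
    # Assumed initialization: means at the data extremes, shared variance.
    weights = [0.5, 0.5]
    means = [min(xs), max(xs)]
    variances = [grand_var, grand_var]
    for _ in range(n_iter):
        resp = e_step(xs, weights, means, variances)
        # M-step: re-estimate each component from its weighted observations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var_k, 1e-6)  # guard against a collapsing component
    return weights, means, variances, e_step(xs, weights, means, variances)


# Synthetic, hypothetical feature values (not NFL data): the first 200
# observations imitate one coverage type, the last 200 the other.
random.seed(0)
xs = [random.gauss(1.0, 0.5) for _ in range(200)]
xs += [random.gauss(5.0, 1.0) for _ in range(200)]

weights, means, variances, resp = fit_two_component_gmm(xs)
# resp[i] holds the soft (probabilistic) cluster assignment of observation i.
```

Because each row of `resp` sums to one, an observation near the boundary between the two clusters receives an intermediate probability rather than a hard label, which is the property the abstract refers to as probabilistic assignment.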

Appendix

Figure 12 shows the proportion of cornerback coverages in man (blue) and zone (orange) by quarter. Small differences exist across the four quarters of regulation, but they are mostly negligible, so we see no apparent trend in coverage type over the course of a game. Interestingly, man coverage increases in overtime. However, this result is not statistically significant, since our dataset from the first 6 weeks of the 2017 season contains only 40 overtime plays. This brief analysis does not control for factors such as score differential or opposing offensive formation, which may influence coverage type.
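To see why 40 plays is too few to support a firm conclusion, consider the uncertainty around a proportion estimated from a sample of that size. The sketch below computes a Wilson score interval. The count of 28 man-coverage plays is a hypothetical value chosen for illustration and is not taken from the paper; only the sample size of 40 comes from the text.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical count: 28 man-coverage plays out of the 40 overtime plays.
low, high = wilson_interval(28, 40)
width = high - low
```

With these assumed counts the interval spans more than 25 percentage points, so an overtime proportion that looks different from the regulation quarters could easily be sampling noise.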

Figures 13 and 14 show the top-10 teams by proportion of zone and man coverage, respectively. Tampa Bay, Chicago, and Green Bay have the highest rates of man coverage during this short span of 6 weeks, while Washington, Buffalo, and the New York Giants have the highest rates of zone coverage. Overall, all 32 NFL teams use man coverage more often than zone coverage in this sample of data.

Figure 13: Top 10 teams with the highest Zone Coverage percentage.

Figure 14: Top 10 teams with the highest Man Coverage percentage.

Finally, Figures 15 and 16 show the top-10 cornerbacks by their proportion of man and zone coverage, respectively (minimum 50 coverages). Bryce Callahan of the Chicago Bears led the NFL in percentage of man coverages during the first 6 weeks of the 2017 season with almost 80%, followed by Damarious Randall, Kevin King, Phillip Gaines, and others. Joshua Shaw, who played for the Bengals in 2017, led the league in percentage of zone coverages with about 60%, followed by Janoris Jenkins, Bobby McCain, Marcus Peters, and others.
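A ranking with a minimum-sample cutoff like the one above can be sketched as follows. The player names, counts, and the use of hard "man"/"zone" labels are all assumptions for illustration; the paper's method produces probabilistic assignments, which would first need to be converted to labels (or averaged) before a step like this.

```python
from collections import defaultdict


def top_by_man_share(records, min_coverages=50, k=10):
    """Rank players by share of man coverage.

    records: iterable of (player, label) pairs, label being "man" or "zone".
    Players with fewer than min_coverages records are excluded.
    """
    totals = defaultdict(int)
    man = defaultdict(int)
    for player, label in records:
        totals[player] += 1
        if label == "man":
            man[player] += 1
    shares = [(p, man[p] / n) for p, n in totals.items() if n >= min_coverages]
    return sorted(shares, key=lambda item: item[1], reverse=True)[:k]


# Hypothetical players and counts, not taken from the paper.
records = (
    [("CB A", "man")] * 40 + [("CB A", "zone")] * 20
    + [("CB B", "man")] * 30 + [("CB B", "zone")] * 30
    + [("CB C", "man")] * 9 + [("CB C", "zone")] * 1
)
ranking = top_by_man_share(records)
```

In this toy input "CB C" has the highest raw share of man coverage but only 10 records, so the cutoff removes that player from the ranking. This is the purpose of a minimum such as 50 coverages: it keeps small-sample extremes out of a top-10 list.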

Figure 15: Top 10 players with the highest Man Coverage percentage.

Figure 16: Top 10 players with the highest Zone Coverage percentage.

Figure 17 shows no apparent relationship between the down and the coverage type of cornerbacks.

Figure 17: Man:Zone percentage by down.


Published Online: 2020-05-29
Published in Print: 2020-06-25

©2020 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 26.4.2024 from https://www.degruyter.com/document/doi/10.1515/jqas-2020-0017/html