Abstract
Automated approaches to sports video content analysis have been heavily explored over the last few decades to develop more informative and effective solutions for replay detection, shot classification, key-event detection, and summarization. Shot transition detection and classification are commonly applied to segment video temporally for content analysis. Accurate shot classification is indispensable for precisely detecting key events and generating more informative summaries of sports videos. Current state-of-the-art methods have several limitations, including inflexible game-specific rule-based approaches, high computational cost, and dependency on editing effects, game structure, and camera variations. In this paper, we propose an effective decision tree architecture for shot classification of field sports videos to address these issues. We combine low-, mid-, and high-level features to develop an interpretable and computationally efficient decision tree framework for shot classification. Rule-based induction is applied to derive rules from the decision tree that classify video shots into long, medium, close-up, and out-of-field shots. A significant contribution of the proposed work is identifying the most reliable (i.e., least unpredictable) rules for shot classification. The proposed shot classification method is robust to variations in camera, illumination conditions, game structure, video length, sports genre, and broadcaster. We evaluate our method on a YouTube dataset covering three sports genres that is diverse in terms of length, quantity, broadcasters, camera variations, editing effects, and illumination conditions. The proposed method provides superior shot classification performance, achieving an average improvement of 6.9% in precision and 9.1% in recall over contemporary methods under the above-mentioned limitations.
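To make the idea of rule-based shot classification concrete, the following is a minimal sketch, not the authors' implementation. It assumes two hypothetical per-shot features (`field_ratio` and `player_height`) and illustrative thresholds, none of which are taken from the abstract; only the four output classes and the use of precision and recall as evaluation measures come from the text. Each root-to-leaf path of the hand-written tree corresponds to one classification rule.

```python
from dataclasses import dataclass

# The four shot classes named in the abstract.
SHOT_CLASSES = ("long", "medium", "close-up", "out-of-field")


@dataclass
class ShotFeatures:
    """Hypothetical per-shot features (assumed for illustration only)."""
    field_ratio: float    # fraction of frame pixels matching the playfield color, in [0, 1]
    player_height: float  # height of the largest detected player relative to frame height, in [0, 1]


def classify_shot(f: ShotFeatures) -> str:
    """Walk a small hand-written decision tree; each path to a leaf is one rule.

    All thresholds below are placeholders, not values reported by the paper.
    """
    if f.field_ratio < 0.10:
        # Almost no playfield visible: a tight close-up or a crowd/bench view.
        return "close-up" if f.player_height >= 0.50 else "out-of-field"
    if f.player_height >= 0.50:
        # A player fills at least half of the frame height.
        return "close-up"
    if f.field_ratio >= 0.60 and f.player_height < 0.15:
        # Mostly playfield with small players: a wide view of the game.
        return "long"
    return "medium"


def precision_recall(y_true, y_pred, label):
    """Per-class precision and recall from paired ground-truth and predicted labels."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In practice the tree would be induced from labeled shots rather than written by hand, and the per-class precision and recall would be averaged over the four classes and over the test videos.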
Acknowledgements
This work was supported and funded by the Directorate ASRTD of University of Engineering and Technology-Taxila (UET/ASRTD/RG-1002-3).
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Javed, A., Malik, K.M., Irtaza, A. et al. A decision tree framework for shot classification of field sports videos. J Supercomput 76, 7242–7267 (2020). https://doi.org/10.1007/s11227-020-03155-8
DOI: https://doi.org/10.1007/s11227-020-03155-8