Topic-based Video Analysis: A Survey

Published: 13 July 2021

Abstract

Manually processing the large volume of video data captured through closed-circuit television is challenging for several reasons. First, manual analysis is highly time-consuming. Moreover, because surveillance videos are recorded under dynamic conditions, such as camera motion, varying illumination, or occlusion, conventional supervised learning may not always work. Thus, computer vision-based automatic analysis of surveillance scenes is carried out in an unsupervised manner. Topic modelling is one of the emerging fields used in unsupervised information processing. It is used in text analysis, computer vision applications, and other areas involving spatio-temporal data. In this article, we discuss the scope, variations, and applications of topic modelling, focusing particularly on surveillance video analysis. We provide a methodological survey of existing topic models, their features, underlying representations, characterization, and applications from the perspective of visual surveillance. Important research papers related to topic modelling in visual surveillance are summarized and critically analyzed.

  135. Keith Stevens, Philip Kegelmeyer, David Andrzejewski, and David Buttler. 2012. Exploring topic coherence over many models and many topics. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, 952–961.Google ScholarGoogle Scholar
  136. Xing Sun, Nelson H. C. Yung, Edmund Y. Lam, and Hayden K.-H. So. 2016. Unsupervised tracking with a low computational cost using the doubly stochastic Dirichlet process mixture model. Electron. Imag. 2016, 14 (2016), 1–8.Google ScholarGoogle Scholar
  137. Yee Whye Teh and Michael I. Jordan. 2010. Hierarchical Bayesian nonparametric models with applications. Bayesian Nonparametr. 1 (2010), 158–207.Google ScholarGoogle ScholarCross RefCross Ref
  138. Yee W. Teh, David Newman, and Max Welling. 2007. A collapsed variational Bayesian inference algorithm for latent Dirichlet allocation. In Advances in Neural Information Processing Systems. MIT Press, 1353–1360.Google ScholarGoogle Scholar
  139. Nguyen Anh Tu, Thien Huynh-The, Kifayat Ullah Khan, and Young-Koo Lee. 2018. ML-HDP: A hierarchical bayesian nonparametric model for recognizing human actions in video. IEEE Trans. Circ. Syst. Video Technol. 29, 3 (2018), 800–814.Google ScholarGoogle ScholarDigital LibraryDigital Library
  140. Jagannadan Varadarajan, Rémi Emonet, and Jean-Marc Odobez. 2013. A sequential topic model for mining recurrent activities from long term video logs. Int. J. Comput. Vision 103, 1 (2013), 100–126.Google ScholarGoogle ScholarCross RefCross Ref
  141. Jagannadan Varadarajan and Jean-Marc Odobez. 2009. Topic models for scene analysis and abnormality detection. In Proceedings of the IEEE International Conference on Computer Vision Workshops. IEEE, 1338–1345.Google ScholarGoogle ScholarCross RefCross Ref
  142. Chong Wang and David M. Blei. 2011. Collaborative topic modeling for recommending scientific articles. In Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining. ACM, 448–456.Google ScholarGoogle Scholar
  143. Chong Wang, John Paisley, and David Blei. 2011. Online variational inference for the hierarchical Dirichlet process. In Proceedings of the International Conference on Artificial Intelligence and Statistics. 752–760.Google ScholarGoogle Scholar
  144. He Wang and Carol O’Sullivan. 2016. Globally continuous and non-Markovian crowd activity analysis from videos. In Proceedings of the European Conference on Computer Vision. Springer, 527–544.Google ScholarGoogle ScholarCross RefCross Ref
  145. Heng Wang and Cordelia Schmid. 2013. Action recognition with improved trajectories. In Proceedings of the IEEE International Conference on Computer Vision. 3551–3558.Google ScholarGoogle ScholarDigital LibraryDigital Library
  146. Hongxing Wang, Gangqiang Zhao, and Junsong Yuan. 2014. Visual pattern discovery in image and video data: A brief survey. Wiley Interdisc. Rev.: Data Min. Knowl. Discov. 4, 1 (2014), 24–37.Google ScholarGoogle ScholarDigital LibraryDigital Library
  147. Jinqiao Wang, Wei Fu, Hanqing Lu, and Songde Ma. 2014. Bilayer sparse topic model for scene analysis in imbalanced surveillance videos. IEEE Trans. Image Process. 23, 12 (2014), 5198–5208.Google ScholarGoogle ScholarCross RefCross Ref
  148. Jun Wang, Limin Xia, Xiangjie Hu, and Yongliang Xiao. 2019. Abnormal event detection with semi-supervised sparse topic model. Neural Comput. Appl. 31, 5 (2019), 1607–1617.Google ScholarGoogle ScholarDigital LibraryDigital Library
  149. Le Wang, Gang Hua, Rahul Sukthankar, Jianru Xue, and Nanning Zheng. 2014. Video object discovery and co-segmentation with extremely weak supervision. In Proceedings of the European Conference on Computer Vision. Springer, 640–655.Google ScholarGoogle ScholarCross RefCross Ref
  150. Tingwei Wang and Chuancai Liu. 2013. Human action recognition using supervised pLSA. Int. J. Signal Process. Image Process. Pattern Recogn. 6, 4 (2013), 403–414.Google ScholarGoogle Scholar
  151. Wei Wang, Payam Mamaani Barnaghi, and Andrzej Bargiela. 2010. Probabilistic topic models for learning terminological ontologies. IEEE Trans. Knowl. Data Eng. 22, 7 (2010), 1028–1040.Google ScholarGoogle ScholarDigital LibraryDigital Library
  152. Xiaogang Wang, Keng Teck Ma, Gee-Wah Ng, and W. Eric L. Grimson. 2011. Trajectory analysis and semantic region modeling using nonparametric hierarchical bayesian models. Int. J. Comput. Vision 95, 3 (2011), 287–312.Google ScholarGoogle ScholarDigital LibraryDigital Library
  153. Xiaogang Wang, Xiaoxu Ma, and W. Eric L. Grimson. 2009. Unsupervised activity perception in crowded and complicated scenes using hierarchical bayesian models. IEEE Trans. Pattern Anal. Mach. Intell. 31, 3 (2009), 539–555.Google ScholarGoogle ScholarDigital LibraryDigital Library
  154. Xuerui Wang and Andrew McCallum. 2006. Topics over time: A non-Markov continuous-time model of topical trends. In Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining. ACM, 424–433.Google ScholarGoogle ScholarDigital LibraryDigital Library
  155. Yinying Wang, Alex J. Bowers, and David J. Fikis. 2017. Automated text data mining analysis of five decades of educational leadership research literature: Probabilistic topic modeling of EAQ articles from 1965 to 2014. Edu. Admin. Quart. 53, 2 (2017), 289–323.Google ScholarGoogle ScholarCross RefCross Ref
  156. Zheng Wang, Jie Zhou, Jing Ma, Jingjing Li, Jiangbo Ai, and Yang Yang. 2020. Discovering attractive segments in the user-generated video streams. Info. Process. Manage. 57, 1 (2020), 102130.Google ScholarGoogle ScholarCross RefCross Ref
  157. Daniel Weinland, Remi Ronfard, and Edmond Boyer. 2006. Free viewpoint action recognition using motion history volumes. Comput. Vision Image Understand. 104, 2–3 (2006), 249–257.Google ScholarGoogle ScholarDigital LibraryDigital Library
  158. F. P. Wheeler. 1998. Bayesian forecasting and dynamic models (2nd edn). J. Operat. Res. Soc. 49, 2 (1998), 179–180.Google ScholarGoogle ScholarCross RefCross Ref
  159. Sinead Williamson, Chong Wang, Katherine Heller, and David Blei. 2010. The IBP compound Dirichlet process and its application to focused topic modeling. In Proceedings of the ICML. 1151–1158. https://icml.cc/Conferences/2010/papers/397.pdf.Google ScholarGoogle Scholar
  160. Chenxia Wu, Jiemi Zhang, Ozan Sener, Bart Selman, Silvio Savarese, and Ashutosh Saxena. 2018. Watch-n-patch: Unsupervised learning of actions and relations. IEEE Trans. Pattern Anal. Mach. Intell. 40, 2 (2018), 467–481.Google ScholarGoogle ScholarDigital LibraryDigital Library
  161. Jun Xu, Tao Mei, Ting Yao, and Yong Rui. 2016. Msr-vtt: A large video description dataset for bridging video and language. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 5288–5296.Google ScholarGoogle ScholarCross RefCross Ref
  162. Xun Xu, Timothy M. Hospedales, and Shaogang Gong. 2017. Discovery of shared semantic spaces for multiscene video query and summarization. IEEE Trans. Circ. Syst. Video Technol. 27, 6 (2017), 1353–1367.Google ScholarGoogle ScholarDigital LibraryDigital Library
  163. Junyu Xuan, Jie Lu, Guangquan Zhang, and Xiangfeng Luo. 2015. Topic model for graph mining. IEEE Trans. Cybernet. 45, 12 (2015), 2792–2803.Google ScholarGoogle ScholarCross RefCross Ref
  164. Jianfei Xue and Koji Eguchi. 2018. Sequential Bayesian nonparametric multimodal topic models for video data analysis. IEICE Trans. Info. Syst. 101, 4 (2018), 1079–1087.Google ScholarGoogle ScholarCross RefCross Ref
  165. Jianfei Xue and Koji Eguchi. 2019. Supervised nonparametric multimodal topic models for multi-class video classification. ITE Trans. Media Technol. Appl. 7, 2 (2019), 80–91.Google ScholarGoogle ScholarCross RefCross Ref
  166. Michael Ying Yang, Wentong Liao, Yanpeng Cao, and Bodo Rosenhahn. 2018. Video event recognition and anomaly detection by combining gaussian process and hierarchical dirichlet process models. Photogram. Eng. Remote Sens. 84, 4 (2018), 203–214.Google ScholarGoogle ScholarCross RefCross Ref
  167. Shuang Yang, Chunfeng Yuan, Weiming Hu, and Xinmiao Ding. 2014. A hierarchical model based on latent dirichlet allocation for action recognition. In Proceedings of the International Conference on Pattern Recognition. IEEE, 2613–2618.Google ScholarGoogle ScholarDigital LibraryDigital Library
  168. Shuang Yang, Chunfeng Yuan, Baoxin Wu, Weiming Hu, and Fangshi Wang. 2015. Multi-feature max-margin hierarchical Bayesian model for action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1610–1618.Google ScholarGoogle ScholarCross RefCross Ref
  169. Yang Yang, Jingen Liu, and Mubarak Shah. 2009. Video scene understanding using multi-scale analysis. In Proceedings of the IEEE International Conference on Computer Vision. IEEE, 1669–1676.Google ScholarGoogle ScholarCross RefCross Ref
  170. Litao Yu, Zi Huang, Jiewei Cao, and Heng Tao Shen. 2016. Scalable video event retrieval by visual state binary embedding. IEEE Trans. Multimedia 18, 8 (2016), 1590–1603.Google ScholarGoogle ScholarDigital LibraryDigital Library
  171. Niange Yu, Xiaolin Hu, Binheng Song, Jian Yang, and Jianwei Zhang. 2018. Topic-oriented image captioning based on order-embedding. IEEE Trans. Image Process. 28, 6 (2018), 2743–2754.Google ScholarGoogle ScholarCross RefCross Ref
  172. Yin Yuan, Haomian Zheng, Zhu Li, and David Zhang. 2010. Video action recognition with spatio-temporal graph embedding and spline modeling. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2422–2425.Google ScholarGoogle ScholarCross RefCross Ref
  173. Yun Zhai and Mubarak Shah. 2006. Video scene segmentation using Markov chain Monte Carlo. IEEE Trans. Multimedia 8, 4 (2006), 686–697.Google ScholarGoogle ScholarDigital LibraryDigital Library
  174. Jianguo Zhang and Shaogang Gong. 2010. Action categorization by structural probabilistic latent semantic analysis. Comput. Vision Image Understand. 114, 8 (2010), 857–864.Google ScholarGoogle ScholarDigital LibraryDigital Library
  175. Bin Zhao, Wei Xu, Genlin Ji, and Chao Tan. 2015. Discovering topic evolution topology in a microblog corpus. In Proceedings of the International Conference on Advanced Cloud and Big Data. IEEE, 7–14.Google ScholarGoogle ScholarCross RefCross Ref
  176. Fang Zhao, Yongzhen Huang, Liang Wang, and Tieniu Tan. 2013. Relevance topic model for unstructured social group activity recognition. In Advances in Neural Information Processing Systems. MIT Press, 2580–2588.Google ScholarGoogle Scholar
  177. Liang Zhao, Lin Shang, Yang Gao, Yubin Yang, and Xiuyi Jia. 2013. Video behavior analysis using topic models and rough sets [applications notes]. IEEE Comput. Intell. Mag. 8, 1 (2013), 56–67.Google ScholarGoogle ScholarDigital LibraryDigital Library
  178. Zhicheng Zhao, Yifan Song, and Fei Su. 2016. Specific video identification via joint learning of latent semantic concept, scene and temporal structure. Neurocomputing 208 (2016), 378–386.Google ScholarGoogle ScholarDigital LibraryDigital Library
  179. Yin Zheng, Yu-Jin Zhang, and Hugo Larochelle. 2014. Topic modeling of multimodal data: An autoregressive approach. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1370–1377.Google ScholarGoogle ScholarDigital LibraryDigital Library
  180. Bolei Zhou, Xiaogang Wang, and Xiaoou Tang. 2011. Random field topic model for semantic region analysis in crowded scenes from tracklets. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 3441–3448.Google ScholarGoogle ScholarDigital LibraryDigital Library
  181. Bolei Zhou, Xiaogang Wang, and Xiaoou Tang. 2012. Understanding collective crowd behaviors: Learning a mixture model of dynamic pedestrian-agents. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2871–2878.Google ScholarGoogle Scholar
  182. Houkui Zhou, Huimin Yu, Roland Hu, Guangqun Zhang, Junguo Hu, and Tao He. 2019. Analyzing multiple types of behaviors from traffic videos via nonparametric topic model. J. Visual Commun. Image Represent. 64 (2019), 102649.Google ScholarGoogle ScholarCross RefCross Ref
  183. Qiqi Zhu, Yanfei Zhong, Liangpei Zhang, and Deren Li. 2017. Scene classification based on the fully sparse semantic topic model. IEEE Trans. Geosci. Remote Sens. 55, 10 (2017), 5525–5538.Google ScholarGoogle ScholarCross RefCross Ref
  184. Xudong Zhu and Hui Li. 2012. Unsupervised human action categorization using latent Dirichlet Markov clustering. In Proceedings of the International Conference on Intelligent Networking and Collaborative Systems. IEEE, 347–352.Google ScholarGoogle ScholarDigital LibraryDigital Library
  185. Xudong Zhu and Zhijing Liu. 2011. Human behavior clustering for anomaly detection. Front. Comput. Sci. China 5, 3 (2011), 279.Google ScholarGoogle ScholarDigital LibraryDigital Library
  186. Jialing Zou, Qixiang Ye, Yanting Cui, David Doermann, and Jianbin Jiao. 2014. A belief-based correlated topic model for trajectory clustering in crowded video scenes. In Proceedings of the International Conference on Pattern Recognition. IEEE, 2543–2548.Google ScholarGoogle ScholarDigital LibraryDigital Library
  187. Jialing Zou, Qixiang Ye, Yanting Cui, Fang Wan, Kun Fu, and Jianbin Jiao. 2016. Collective motion pattern inference via locally consistent latent dirichlet allocation. Neurocomputing 184 (2016), 221–231.Google ScholarGoogle ScholarDigital LibraryDigital Library


    • Published in

      ACM Computing Surveys  Volume 54, Issue 6
      Invited Tutorial
      July 2022
      799 pages
      ISSN:0360-0300
      EISSN:1557-7341
      DOI:10.1145/3475936

      Copyright © 2021 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 13 July 2021
      • Accepted: 1 March 2021
      • Revised: 1 January 2021
      • Received: 1 December 2019


      Qualifiers

      • research-article
      • Research
      • Refereed
