Advance on large scale near-duplicate video retrieval

  • Review Article
  • Frontiers of Computer Science

Abstract

Emerging Internet services and applications attract a growing number of users to diverse video-related activities, such as video searching, downloading, and sharing. These routine operations drive an explosive growth in online video volume and inevitably give rise to massive amounts of near-duplicate content. Near-duplicate video retrieval (NDVR) has long been an active research topic. The primary purpose of this paper is to present a comprehensive survey and an updated review of advances in large-scale NDVR as guidance for researchers. Specifically, we summarize and compare the definitions of near-duplicate videos (NDVs) in the literature, analyze the relationship between NDVR and its related research topics theoretically, describe its generic framework in detail, and investigate existing state-of-the-art NDVR systems. Finally, we present the development trends and research directions of this topic.

Acknowledgements

The work was supported by the National Natural Science Foundation of China (Grant Nos. 61722204, 61732007 and 61632007).

Author information

Corresponding author

Correspondence to Richang Hong.

Additional information

Ling Shen received a graduate degree from Anhui University, China in 2010. She is currently a PhD student in the Computer and Information Institute, Hefei University of Technology, China. Her research interests mainly include pattern recognition, machine learning, and multimedia data analysis, such as large-scale multimedia indexing and retrieval and multimedia data embedding.

Richang Hong was an awardee of the NSFC Excellent Young Scholars Program in 2017. He received the PhD degree from the University of Science and Technology of China, China in 2008. He was a research fellow with the School of Computing, National University of Singapore, Singapore from 2008 to 2010. He is currently a professor with the Hefei University of Technology, China. He has co-authored over 70 publications in his areas of research interest, which include multimedia content analysis and social media. He is a member of the ACM and an Executive Committee Member of the ACM SIGMM China Chapter. He was a recipient of the Best Paper Award at ACM Multimedia 2010, the Best Paper Award at ACM ICMR 2015, and the Honorable Mention of the IEEE Transactions on Multimedia Best Paper Award. He served as an Associate Editor of Information Sciences and Signal Processing (Elsevier), and as the Technical Program Chair of MMM 2016.

Yanbin Hao received the PhD degree from Hefei University of Technology, China in 2017. He is currently a postdoctoral researcher in the Department of Computer Science, City University of Hong Kong, China. His research interests mainly include machine learning and multimedia data analysis, such as large-scale multimedia indexing and retrieval, multimedia data embedding, and video hyperlinking.

Cite this article

Shen, L., Hong, R. & Hao, Y. Advance on large scale near-duplicate video retrieval. Front. Comput. Sci. 14, 145702 (2020). https://doi.org/10.1007/s11704-019-8229-7

  • DOI: https://doi.org/10.1007/s11704-019-8229-7
