Video summarisation with visual and semantic cues
IET Image Processing (IF 2.3), Pub Date: 2020-11-30, DOI: 10.1049/iet-ipr.2019.1355
Binwei Xu 1 , Haoran Liang 2 , Ronghua Liang 2

Video summarisation makes browsing videos far more efficient and saves storage space. A good video summary should be visually interesting to human viewers and preserve the theme of the original video at the semantic level. Unlike many existing methods, which use only visual features to generate video summaries, this study proposes a method that combines visual and semantic cues to extract important information for dynamic video summarisation. The authors propose visual-verbal saliency consistency to add semantic information, and a novel attention motion feature that, together with other visual features, represents visual interestingness. They compute an importance score for each frame by combining these features, then select an optimal subset of segments to generate an important and interesting summary. They evaluate the method on the SumMe and TVSum datasets, and the experimental results show that it generates high-quality video summaries.
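
The last step of the pipeline (per-frame importance scores, then an optimal subset of segments) can be illustrated with a minimal sketch. The abstract does not say how the features are combined or how the subset is optimised, so everything below is an assumption for illustration only: a weighted sum stands in for the feature combination, a 0/1 knapsack over segment lengths stands in for the subset selection, and the names `frame_scores`, `select_segments`, `weights`, and `budget` are hypothetical.

```python
from typing import List, Sequence, Tuple


def frame_scores(features: Sequence[Sequence[float]],
                 weights: Sequence[float]) -> List[float]:
    """Combine per-frame feature values into one importance score per frame.

    Assumption: a plain weighted sum; the paper's actual rule is not
    given in the abstract.
    """
    return [sum(w * f for w, f in zip(weights, row)) for row in features]


def select_segments(scores: Sequence[float],
                    segments: Sequence[Tuple[int, int]],
                    budget: int) -> List[int]:
    """Pick segment indices maximising total score within a frame budget.

    Assumption: 0/1 knapsack, where each segment (start, end) with end
    exclusive costs its length in frames and is worth the sum of its
    frame scores.
    """
    values = [sum(scores[s:e]) for s, e in segments]
    lengths = [e - s for s, e in segments]
    n = len(segments)

    # best[i][cap] = best value using the first i segments within cap frames
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for cap in range(budget + 1):
            best[i][cap] = best[i - 1][cap]
            if lengths[i - 1] <= cap:
                candidate = best[i - 1][cap - lengths[i - 1]] + values[i - 1]
                if candidate > best[i][cap]:
                    best[i][cap] = candidate

    # Walk the table backwards to recover which segments were taken.
    chosen: List[int] = []
    cap = budget
    for i in range(n, 0, -1):
        if best[i][cap] != best[i - 1][cap]:
            chosen.append(i - 1)
            cap -= lengths[i - 1]
    return sorted(chosen)


if __name__ == "__main__":
    scores = [0.1, 0.1, 0.9, 0.9, 0.2, 0.2, 0.8, 0.8, 0.8, 0.1]
    segments = [(0, 2), (2, 4), (4, 6), (6, 9), (9, 10)]
    print(select_segments(scores, segments, budget=5))
```

With a budget of five frames, this toy input keeps the two highest-scoring segments, `(2, 4)` and `(6, 9)`. A length budget is a common convention in summarisation benchmarks, but whether the authors use one is not stated in the abstract.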

Updated: 2020-12-01