Self-attention binary neural tree for video summarization
Pattern Recognition Letters (IF 3.9), Pub Date: 2020-12-30, DOI: 10.1016/j.patrec.2020.12.016
Hao Fu, Hongxing Wang

In this paper, we address the problem of shot-level video summarization, which aims to select a subset of video shots as a summary that represents the original video content compactly and completely. Most existing methods rely on various network architectures to learn a single score predictor for shot ranking and selection. Unlike these methods, we plug network feature learning into a binary neural tree, so that each shot receives predictions along multiple paths and can be evaluated from different aspects. Owing to the hierarchical structure of the tree, video shots can be encoded from coarse to fine by applying self-attention to them along the branches, leading to favorable predictions. Extensive experiments were conducted on two real-world datasets, and the results show that the proposed method outperforms previous state-of-the-art methods.
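The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes a soft binary tree stored in heap order: each internal node re-encodes the shots with self-attention and routes every shot left or right with a sigmoid gate, each leaf predicts a score, and a shot's final score is the path-probability-weighted sum of the leaf predictions. The names `self_attention`, `tree_scores`, and `params` are hypothetical. The attention has no learned projections (Q = K = V), so both children of a node see the same encoding here, whereas a trained model would presumably learn branch-specific weights.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def self_attention(shots):
    """Scaled dot-product self-attention with Q = K = V = shots (assumption: no learned weights)."""
    d = len(shots[0])
    encoded = []
    for query in shots:
        logits = [dot(query, key) / math.sqrt(d) for key in shots]
        peak = max(logits)  # subtract the max for a numerically stable softmax
        weights = [math.exp(l - peak) for l in logits]
        total = sum(weights)
        encoded.append(
            [sum(w * v[j] for w, v in zip(weights, shots)) / total for j in range(d)]
        )
    return encoded


def tree_scores(shots, params, depth, node=0, level=0):
    """Return one score per shot from a soft binary tree of the given depth.

    params[i] is the weight vector of node i in heap order
    (children of node i are 2*i + 1 and 2*i + 2).
    """
    if level == depth:
        # Leaf: a simple logistic score predictor on the encoded shot.
        return [sigmoid(dot(params[node], x)) for x in shots]
    # Internal node: refine the shot encoding, then route softly.
    encoded = self_attention(shots)
    p_left = [sigmoid(dot(params[node], x)) for x in encoded]
    left = tree_scores(encoded, params, depth, 2 * node + 1, level + 1)
    right = tree_scores(encoded, params, depth, 2 * node + 2, level + 1)
    # Mix the two subtrees by the routing probability of each shot.
    return [p * l + (1.0 - p) * r for p, l, r in zip(p_left, left, right)]


# Toy usage with random features and random (untrained) parameters.
rng = random.Random(0)
dim, depth, num_shots = 4, 2, 5
shots = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_shots)]
params = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(2 ** (depth + 1) - 1)]
scores = tree_scores(shots, params, depth)
print(scores)  # one value in [0, 1] per shot
```

In this sketch a summary would be formed by ranking shots by `scores` and keeping the top ones; how the paper trains the tree and selects shots is not stated in the abstract.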




Updated: 2021-01-18