Boundary graph convolutional network for temporal action detection
Image and Vision Computing ( IF 4.7 ) Pub Date : 2021-02-20 , DOI: 10.1016/j.imavis.2021.104144
Yaosen Chen , Bing Guo , Yan Shen , Wei Wang , Weichen Lu , Xinhua Suo

Temporal action proposal generation is a fundamental yet challenging task for locating actions in untrimmed videos. Although current proposal generation methods can produce precise action boundaries, few consider the relations among proposals. In this paper, we propose a unified framework, the Boundary Graph Convolutional Network (BGCN), which generates temporal boundary proposals with a graph convolutional network built on the boundary proposals' features. BGCN draws inspiration from boundary-based methods and applies edge graph convolution to the boundary proposals' features. First, a base layer fuses the two-stream video features into two branches of base features. Each branch then enters an identically structured Proposal Features Graph Convolutional Network (PFGCN): an Action PFGCN that extracts action classification scores, and a Boundary PFGCN that extracts starting and ending scores. Within each PFGCN, we densely sample proposal features from the video features and construct a proposal feature graph, where each proposal feature is a node and the relations between proposal features are edges processed by edge convolution. The resulting relations are then mapped into a 2D score map. Experiments on the popular THUMOS14 benchmark demonstrate the superiority of BGCN over state-of-the-art proposal generators (e.g., G-TAD, TAL-Net, and BMN) at every tIoU threshold from 0.3 to 0.7 (44.8% versus 42.8% at tIoU 0.5). BGCN also achieves better results on ActivityNet-1.3. Moreover, BGCN is efficient for action detection, with a model size under 2 MB and fast inference.
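The core idea of the PFGCN — treating each proposal feature as a graph node and applying edge convolution over neighbor relations — can be illustrated with a minimal sketch. The function below is an assumption-laden toy (EdgeConv-style aggregation with a single linear edge function; the paper's actual layer shapes, neighbor construction, and nonlinearities are not specified here):

```python
import numpy as np

def edge_conv(x, neighbors, w):
    """One EdgeConv-style layer over proposal-feature nodes (illustrative only).

    x:         (N, D) array, one row per proposal feature (graph node).
    neighbors: list of neighbor-index lists, defining the graph edges.
    w:         (2*D, D_out) weight matrix of a hypothetical linear edge function.

    Each edge message concatenates the node feature with the neighbor's
    offset from it, applies ReLU(linear), and messages are max-aggregated.
    """
    out = []
    for i, nbrs in enumerate(neighbors):
        msgs = []
        for j in nbrs:
            e = np.concatenate([x[i], x[j] - x[i]])  # edge feature [x_i, x_j - x_i]
            msgs.append(np.maximum(e @ w, 0.0))      # ReLU(linear edge function)
        out.append(np.max(msgs, axis=0))             # max-aggregate over edges
    return np.array(out)

# Toy example: 3 proposals with 4-dim features on a small hand-built graph.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
neighbors = [[1, 2], [0], [0, 1]]   # assumed adjacency, for illustration
w = rng.normal(size=(8, 5))
y = edge_conv(x, neighbors, w)
print(y.shape)  # one updated feature per proposal node
```

The updated node features would then feed the score heads (classification scores in the Action PFGCN, starting/ending scores in the Boundary PFGCN).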

- A GCN based on boundary generation densely produces action proposals.
- The efficient and novel BGCN model has a strong capability to learn proposal features.
- Low model size for temporal action proposal generation.
- Fast inference time for temporal action proposal generation.




Updated: 2021-03-03