Vectors of temporally correlated snippets for temporal action detection
Computers & Electrical Engineering (IF 4.3) Pub Date: 2020-07-01, DOI: 10.1016/j.compeleceng.2020.106654
Fiza Murtaza, Muhammad Haroon Yousaf, Sergio A. Velastin, Yu Qian

Abstract: Detection of human actions in long untrimmed videos is an important but challenging task due to the unconstrained nature of the actions present in such videos. We argue that untrimmed videos contain multiple snippets from action and background classes that are significantly correlated with each other, which results in imprecise detection of the start and end times of action regions. In this work, we propose Vectors of Temporally Correlated Snippets (VTCS), which addresses this problem by finding the snippet-centroids of each class that are discriminant for their own class. For each untrimmed video, non-overlapping snippets are temporally correlated with the snippet-centroids using VTCS encoding to find action proposals. We evaluate the performance of VTCS on the Thumos14 and ActivityNet datasets. On Thumos14, VTCS achieves a significant gain in mean Average Precision (mAP) at a temporal Intersection over Union (tIoU) threshold of 0.5, improving from 41.5% to 44.3%. On the sports subset of the ActivityNet dataset, VTCS obtains 38.5% mAP at a tIoU threshold of 0.5.
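The abstract describes correlating each non-overlapping snippet of an untrimmed video against per-class snippet-centroids and turning the correlated snippets into action proposals. The sketch below illustrates that general idea under simple assumptions (cosine similarity as the correlation measure, centroids assumed precomputed per class, a single score threshold for grouping snippets); it is not the paper's actual VTCS encoding.

```python
# Illustrative sketch only: correlate snippet features with per-class
# centroids and group consecutive high-scoring snippets into proposals.
# Function names, the cosine-similarity measure and the threshold are
# assumptions, not the paper's implementation.
import numpy as np

def correlate_with_centroids(snippet_features: np.ndarray,
                             class_centroids: np.ndarray) -> np.ndarray:
    """snippet_features: (T, D), one feature per non-overlapping snippet.
    class_centroids:  (C, D), one discriminant centroid per class.
    Returns a (T, C) matrix of cosine similarities."""
    s = snippet_features / np.linalg.norm(snippet_features, axis=1, keepdims=True)
    c = class_centroids / np.linalg.norm(class_centroids, axis=1, keepdims=True)
    return s @ c.T

def propose_actions(correlations: np.ndarray, threshold: float = 0.5):
    """Group runs of consecutive snippets whose best class correlation
    exceeds `threshold` into (start_snippet, end_snippet) proposals."""
    best = correlations.max(axis=1)
    proposals, start = [], None
    for t, score in enumerate(best):
        if score >= threshold and start is None:
            start = t
        elif score < threshold and start is not None:
            proposals.append((start, t - 1))
            start = None
    if start is not None:
        proposals.append((start, len(best) - 1))
    return proposals
```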

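The reported results use mAP at a temporal IoU (tIoU) threshold of 0.5, i.e. a predicted segment counts as a true positive only if its temporal overlap with a ground-truth segment is at least 50% of their union. A minimal sketch of that overlap criterion:

```python
# Temporal IoU between two (start, end) segments, in seconds.
def tiou(pred, gt):
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0

# Example: a detection [12.0, 20.0] s vs. ground truth [10.0, 18.0] s
# overlaps for 6 s out of a 10 s union, so tIoU = 0.6 >= 0.5 (true positive).
print(tiou((12.0, 20.0), (10.0, 18.0)))  # 0.6
```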
Updated: 2020-07-01