Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos
International Journal of Computer Vision (IF 19.5), Pub Date: 2017-05-22, DOI: 10.1007/s11263-017-1013-y
Serena Yeung, Olga Russakovsky, Ning Jin, Mykhaylo Andriluka, Greg Mori, Li Fei-Fei

Every moment counts in action recognition. A comprehensive understanding of human activity in video requires labeling every frame according to the actions occurring, placing multiple labels densely over a video sequence. To study this problem we extend the existing THUMOS dataset and introduce MultiTHUMOS, a new dataset of dense labels over unconstrained internet videos. Modeling multiple, dense labels benefits from temporal relations within and across classes. We define a novel variant of long short-term memory deep networks for modeling these temporal relations via multiple input and output connections. We show that this model improves action labeling accuracy and further enables deeper understanding tasks ranging from structured retrieval to action prediction.
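The core task described above is dense, per-frame, multi-label prediction: every frame gets an independent sigmoid score for every action class, so several actions can be active at once. The following is a minimal NumPy sketch of that setup using a single plain LSTM; it is an illustration of the labeling task only, not the paper's MultiLSTM variant, and all weight shapes and dimensions are hypothetical placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One standard LSTM cell step. W: (4H, D), U: (4H, H), b: (4H,)."""
    z = W @ x + U @ h + b
    H = h.shape[0]
    i = sigmoid(z[0:H])        # input gate
    f = sigmoid(z[H:2*H])      # forget gate
    o = sigmoid(z[2*H:3*H])    # output gate
    g = np.tanh(z[3*H:4*H])    # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def dense_label(frames, W, U, b, W_out, b_out):
    """Score every frame for every action class.

    frames: (T, D) per-frame features; returns (T, K) probabilities,
    one independent sigmoid per class, so labels are not mutually
    exclusive -- several actions can overlap in the same frame.
    """
    H = U.shape[1]
    h = np.zeros(H)
    c = np.zeros(H)
    probs = []
    for x in frames:
        h, c = lstm_step(x, h, c, W, U, b)
        probs.append(sigmoid(W_out @ h + b_out))
    return np.array(probs)
```

Training such a model would minimize a per-class binary cross-entropy at every frame, rather than a softmax over classes, which is what makes the dense multi-label setting differ from standard single-label clip classification.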
