Fast Collective Activity Recognition Under Weak Supervision.
IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2019-05-30, DOI: 10.1109/tip.2019.2918725
Peizhen Zhang, Yongyi Tang, Jian-Fang Hu, Wei-Shi Zheng

Collective activity recognition, which identifies the activity a group of people is performing, is a cutting-edge research topic in computer vision. Unlike actions performed by individuals, collective activities require modelling the complex interactions among different people. However, most previous works require exhaustive annotations, such as accurate labels for individual actions, pairwise interactions, and poses, which are often not readily available in practice. Moreover, most of them treat human detection as a decoupled task preceding collective activity recognition and use all detected persons. This not only ignores the mutual relation between the two tasks, making it hard to filter out irrelevant people, but also increases the computational burden when reasoning about collective activities. In this paper, we propose a fast, weakly supervised deep learning architecture for collective activity recognition. For fast inference, we make actor detection and weakly supervised collective activity reasoning collaborate in an end-to-end framework by sharing convolutional layers between them. The joint learning unites the two tasks so that they reinforce each other, which makes it more effective to filter out outliers who are not involved in the activity. For weakly supervised learning, we propose a latent embedding scheme that mines person-group interactive relationships, removing the need for both pairwise relation annotations between people and individual action labels. Experimental results show that the proposed framework achieves comparable or even better performance than the state of the art on three datasets. Our joint model reasons about collective activities at 22.65 fps, the fastest speed reported to date, moving collective activity recognition substantially closer to real-time applications.
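
To make the described architecture more concrete, the following is a minimal PyTorch-style sketch of the general idea: a shared convolutional backbone feeds both person localisation and weakly supervised, group-level activity reasoning, with per-person features pooled into a group representation through a latent embedding. This is not the authors' implementation; all module names, layer sizes, and the attention-style pooling standing in for the latent embedding scheme are assumptions for illustration only.

```python
# Hypothetical sketch of a shared-backbone, weakly supervised collective
# activity model. Only the group-level activity label is needed for the
# activity head; the latent pooling can down-weight irrelevant people.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import roi_align


class SharedBackbone(nn.Module):
    """Convolutional layers shared by detection and activity reasoning."""
    def __init__(self, out_channels=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, out_channels, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, images):
        return self.features(images)  # stride-8 feature map


class LatentGroupEmbedding(nn.Module):
    """Pools per-person features into one group feature via latent attention,
    a stand-in for the person-group interaction mining described above."""
    def __init__(self, dim=256):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # latent relevance of each person

    def forward(self, person_feats):    # (num_people, dim)
        weights = torch.softmax(self.score(person_feats), dim=0)
        return (weights * person_feats).sum(dim=0)  # (dim,)


class CollectiveActivityNet(nn.Module):
    def __init__(self, num_activities=5, dim=256):
        super().__init__()
        self.backbone = SharedBackbone(out_channels=dim)
        self.person_fc = nn.Linear(dim * 7 * 7, dim)
        self.group_embed = LatentGroupEmbedding(dim)
        self.activity_head = nn.Linear(dim, num_activities)

    def forward(self, images, person_boxes):
        # For clarity this sketch assumes a single image per forward pass.
        # person_boxes: list with one (num_people, 4) tensor of (x1, y1, x2, y2)
        # boxes, e.g. proposals from a detection head on the same backbone.
        fmap = self.backbone(images)
        crops = roi_align(fmap, person_boxes, output_size=(7, 7),
                          spatial_scale=1.0 / 8)        # (num_people, dim, 7, 7)
        person_feats = F.relu(self.person_fc(crops.flatten(1)))
        group_feat = self.group_embed(person_feats)     # weakly supervised: only
        return self.activity_head(group_feat)           # the group label is used


if __name__ == "__main__":
    model = CollectiveActivityNet()
    images = torch.randn(1, 3, 224, 224)
    boxes = [torch.tensor([[10.0, 20.0, 80.0, 200.0],
                           [100.0, 30.0, 170.0, 210.0]])]
    logits = model(images, boxes)
    print(logits.shape)  # torch.Size([5]), one score per collective activity
```

In such a design, sharing the backbone between detection and activity reasoning is what enables fast joint inference, while the softmax weighting plays the role of suppressing detected people who are not involved in the activity.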
