Symbiotic Attention for Egocentric Action Recognition With Object-Centric Alignment.
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6), Pub Date: 2023-05-05, DOI: 10.1109/tpami.2020.3015894
Xiaohan Wang, Linchao Zhu, Yu Wu, Yi Yang

In this paper, we propose to tackle egocentric action recognition by suppressing background distractors and enhancing action-relevant interactions. Existing approaches usually use two independent branches to recognize egocentric actions, i.e., a verb branch and a noun branch. However, they lack a mechanism to suppress distracting objects and exploit local human-object correlations. To this end, we introduce two extra sources of information, i.e., the candidate objects' spatial locations and their discriminative features, to enable concentration on the occurring interactions. We design a Symbiotic Attention with Object-centric feature Alignment framework (SAOA) to provide meticulous reasoning between the actor and the environment. First, we introduce an object-centric feature alignment method to inject the local object features into the verb branch and the noun branch. Second, we propose a symbiotic attention mechanism to encourage mutual interaction between the two branches and select the most action-relevant candidates for classification. The framework benefits from the communication among the verb branch, the noun branch, and the local object information. Experiments based on different backbones and modalities demonstrate the effectiveness of our method. Notably, our framework achieves state-of-the-art performance on the largest egocentric video dataset.
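
As a rough illustration of the candidate-selection idea described above, the sketch below scores object candidates against a branch feature with generic scaled dot-product attention and pools them by the resulting weights. This is not the SAOA implementation: the abstract does not specify the attention form, so the scoring rule, the function names (`softmax`, `dot`, `attend`), and the toy feature vectors are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attend(query, candidates):
    """Weight candidate object features by similarity to a query feature.

    Assumed form: scaled dot-product attention (not taken from the paper).
    Returns (weights, pooled), where pooled is the attention-weighted sum.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, c) / scale for c in candidates])
    pooled = [sum(w * c[i] for w, c in zip(weights, candidates))
              for i in range(len(query))]
    return weights, pooled

# Hypothetical toy features: one global verb-branch feature, three candidates.
verb_feature = [1.0, 0.0, 1.0, 0.0]
object_features = [
    [0.9, 0.1, 0.8, 0.0],  # candidate resembling the verb feature
    [0.0, 1.0, 0.0, 1.0],  # background distractor
    [0.1, 0.9, 0.2, 0.8],  # another distractor
]

weights, pooled = attend(verb_feature, object_features)
best = max(range(len(weights)), key=weights.__getitem__)
print("attention weights:", [round(w, 3) for w in weights])
print("most relevant candidate index:", best)
```

In this toy setup the candidate most similar to the verb feature receives the largest weight, so distractor features contribute less to the pooled vector. A symmetric step with a noun-branch query over verb-side features would be one way to picture the mutual interaction between branches, though that pairing is likewise an assumption here.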

Updated: 2020-08-11