Distance-Aware Occlusion Detection With Focused Attention
IEEE Transactions on Image Processing (IF 10.8) Pub Date: 2022-08-22, DOI: 10.1109/tip.2022.3197984
Yang Li, Yucheng Tu, Xiaoxue Chen, Hao Zhao, Guyue Zhou

For humans, understanding the relationships between objects from visual signals is intuitive. For artificial intelligence, however, this task remains challenging. Researchers have made significant progress in semantic relationship detection, such as human-object interaction detection and visual relationship detection. We take the study of visual relationships a step further, from semantic to geometric: specifically, we predict relative occlusion and relative distance relationships. However, detecting these relationships from a single image is challenging, and enforcing focused attention on task-specific regions plays a critical role in detecting them successfully. In this work, (1) we propose a novel three-decoder architecture as the infrastructure for focused attention; (2) we use the generalized intersection box prediction task to effectively guide our model to focus on occlusion-specific regions; and (3) our model achieves new state-of-the-art performance on distance-aware relationship detection. Specifically, our model raises the distance F1-score from 33.8% to 38.6% and boosts the occlusion F1-score from 34.4% to 41.2%. Our code is publicly available.
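The second contribution, guiding attention through a generalized intersection box prediction task, implies a concrete geometric target derived from the subject and object boxes. The sketch below is a minimal illustration of one plausible definition of that target: the intersection box when the two boxes overlap, falling back to the smallest enclosing box otherwise (analogous to the enclosing box used in GIoU). The function name and this exact definition are assumptions for illustration, not the paper's stated formulation.

```python
import torch

def generalized_intersection_box(subj_box, obj_box):
    """Hypothetical auxiliary-task target for occlusion-specific attention.

    Boxes are (x1, y1, x2, y2). If the subject and object boxes overlap,
    return their intersection box; otherwise return the smallest enclosing
    box, so a valid target box always exists. Illustrative assumption only.
    """
    # Candidate intersection box.
    x1 = torch.maximum(subj_box[..., 0], obj_box[..., 0])
    y1 = torch.maximum(subj_box[..., 1], obj_box[..., 1])
    x2 = torch.minimum(subj_box[..., 2], obj_box[..., 2])
    y2 = torch.minimum(subj_box[..., 3], obj_box[..., 3])
    overlaps = (x2 > x1) & (y2 > y1)

    # Smallest enclosing box, used as the fallback when there is no overlap.
    ex1 = torch.minimum(subj_box[..., 0], obj_box[..., 0])
    ey1 = torch.minimum(subj_box[..., 1], obj_box[..., 1])
    ex2 = torch.maximum(subj_box[..., 2], obj_box[..., 2])
    ey2 = torch.maximum(subj_box[..., 3], obj_box[..., 3])

    inter = torch.stack([x1, y1, x2, y2], dim=-1)
    enclose = torch.stack([ex1, ey1, ex2, ey2], dim=-1)
    return torch.where(overlaps.unsqueeze(-1), inter, enclose)

if __name__ == "__main__":
    subj = torch.tensor([10.0, 10.0, 50.0, 50.0])
    obj = torch.tensor([30.0, 30.0, 80.0, 90.0])
    # Overlapping boxes -> intersection box [30., 30., 50., 50.]
    print(generalized_intersection_box(subj, obj))
```

Such a target gives the occlusion branch a region that always covers the zone where the two objects interact, which is one way a box-prediction task could steer decoder attention toward occlusion-relevant pixels.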

Updated: 2024-08-26