CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation,arXiv - CS - Multimedia

CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation
arXiv - CS - Multimedia. Pub Date: 2020-09-16, DOI: arxiv-2009.07526
Jing Yu, Yuan Chai, Yue Hu, Qi Wu

Scene graphs are semantic abstractions of images that facilitate visual understanding and reasoning. However, the performance of Scene Graph Generation (SGG) is unsatisfactory when faced with biased data in real-world scenarios. Conventional debiasing research mainly approaches the problem from the view of data representation, e.g. balancing the data distribution or learning unbiased models and representations, and ignores how humans accomplish this task. Inspired by the role of the prefrontal cortex (PFC) in hierarchical reasoning, we analyze this problem from a novel cognition perspective: learning a hierarchical cognitive structure of the highly biased relationships and navigating that hierarchy to locate the classes, so that the tail classes receive more attention in a coarse-to-fine mode. To this end, we propose a novel Cognition Tree (CogTree) loss for unbiased SGG. We first build a cognitive structure, CogTree, that organizes the relationships based on the predictions of a biased SGG model. The CogTree first distinguishes clearly different relationships and then focuses on a small set of easily confused ones. We then propose a hierarchical loss designed specifically for this cognitive structure, which supports coarse-to-fine distinction of the correct relationships while progressively eliminating the interference of irrelevant ones. The loss is model-independent and can be applied to various SGG models without extra supervision. The proposed CogTree loss consistently boosts the performance of several state-of-the-art models on the Visual Genome benchmark.
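To make the coarse-to-fine idea concrete, here is a minimal two-level sketch in Python. It is not the authors' implementation: the abstract does not specify how the tree is built or how the loss is weighted, so the grouping rule (attach each class to the class a biased model most often predicts for it), the use of the mean member logit as a group score, and the names `build_groups` and `hierarchical_loss` are all assumptions chosen for illustration. The confusion counts are made-up toy numbers.

```python
import math
from collections import defaultdict


def build_groups(confusion):
    """Group classes by the label a biased model most often predicts for them.

    confusion[i][j] counts samples of true class i predicted as class j.
    This grouping rule is an illustrative assumption.
    """
    groups = defaultdict(list)
    for true_cls, row in enumerate(confusion):
        root = max(range(len(row)), key=row.__getitem__)
        groups[root].append(true_cls)
    return [sorted(members) for _, members in sorted(groups.items())]


def log_softmax(xs):
    """Numerically stable log-softmax over a list of floats."""
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]


def hierarchical_loss(logits, target, groups):
    """Coarse cross-entropy over groups plus fine cross-entropy inside the true group."""
    # Coarse level: score each group by the mean logit of its members (assumption).
    group_scores = [sum(logits[c] for c in g) / len(g) for g in groups]
    g_idx = next(i for i, g in enumerate(groups) if target in g)
    coarse = -log_softmax(group_scores)[g_idx]
    # Fine level: compete only against the easily confused classes in the same group.
    members = groups[g_idx]
    fine = -log_softmax([logits[c] for c in members])[members.index(target)]
    return coarse + fine


# Toy confusion counts: classes 1 and 3 are tail classes absorbed by heads 0 and 2.
confusion = [
    [90, 5, 5, 0],
    [70, 20, 10, 0],
    [5, 0, 85, 10],
    [10, 0, 60, 30],
]
groups = build_groups(confusion)
print(groups)  # [[0, 1], [2, 3]]
loss = hierarchical_loss([0.0, 0.0, 0.0, 0.0], target=1, groups=groups)
print(round(loss, 3))  # 1.386
```

In this sketch a tail class competes at the fine level only with the classes it is confused with, rather than with every class at once, which is one simple way to read the coarse-to-fine behaviour the abstract describes.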

Updated: 2020-09-17
