Mining Interpretable AOG Representations From Convolutional Networks via Active Question Answering, IEEE Transactions on Pattern Analysis and Machine Intelligence


IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 5-7-2020, DOI: 10.1109/tpami.2020.2993147
Quanshi Zhang, Jie Ren, Ge Huang, Ruiming Cao, Ying Nian Wu, Song-Chun Zhu

In this paper, we present a method to mine object-part patterns from conv-layers of a pre-trained convolutional neural network (CNN). The mined object-part patterns are organized by an And-Or graph (AOG). This interpretable AOG representation consists of a four-layer semantic hierarchy, i.e., semantic parts, part templates, latent patterns, and neural units. The AOG associates each object part with certain neural units in feature maps of conv-layers. The AOG is constructed with very few annotations (e.g., 3 to 20) of object parts. We develop a question-answering (QA) method that uses active human-computer communication to mine patterns from a pre-trained CNN, in order to explain features in conv-layers incrementally. During the learning process, our QA method uses the current AOG for part localization. The QA method actively identifies objects whose feature maps cannot be explained by the AOG. Then, our method asks people to annotate parts on the unexplained objects, and uses the answers to discover CNN patterns corresponding to the newly labeled parts. In this way, our method gradually grows new branches and refines existing branches on the AOG to semanticize CNN representations. In experiments, our method exhibited a high learning efficiency. Our method used about 1/6 to 1/3 of the part annotations for training, but achieved similar or better part-localization performance than fast-RCNN methods.
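
The active QA loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the real method operates on conv-layer feature maps and a four-layer AOG, whereas the sketch below assumes each object is a short feature tuple and swaps in a plain nearest-template distance as the "explanation" score. All names here (`ToyAOG`, `explain_score`, `active_qa`, `ask_human`, `threshold`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Feature = Tuple[float, ...]
ToyAOG = Dict[str, List[Feature]]  # hypothetical: part name -> part templates


def explain_score(aog: ToyAOG, feat: Feature) -> float:
    """Toy stand-in for how well the current graph explains one object.

    Returns the negative squared distance to the closest stored template,
    or -inf when the graph has no templates yet.
    """
    best = float("-inf")
    for templates in aog.values():
        for template in templates:
            dist = sum((a - b) ** 2 for a, b in zip(feat, template))
            best = max(best, -dist)
    return best


def active_qa(
    features: Sequence[Feature],
    ask_human: Callable[[int], str],
    budget: int,
    threshold: float = -0.5,
) -> Tuple[ToyAOG, List[int]]:
    """Grow a toy graph by querying the least-explained object each round."""
    aog: ToyAOG = {}
    asked: List[int] = []
    for _ in range(budget):
        scores = [explain_score(aog, f) for f in features]
        # Pick the object that the current graph explains worst.
        worst = min(range(len(features)), key=lambda i: scores[i])
        if scores[worst] >= threshold:
            break  # every object is explained well enough; stop asking
        part = ask_human(worst)  # the "question" posed to the annotator
        # Add the answer as a new template (a new branch under that part).
        aog.setdefault(part, []).append(features[worst])
        asked.append(worst)
    return aog, asked


if __name__ == "__main__":
    feats = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    labels = ["head", "head", "tail", "tail"]  # simulated human answers
    graph, queried = active_qa(feats, lambda i: labels[i], budget=4)
    print(queried, sorted(graph))
```

In this toy setting the loop stops once every object lies close to some stored template, so only the objects that the current graph fails to explain are sent for annotation. That mirrors, in a very reduced form, the idea of spending annotations only where the current representation falls short.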


Updated: 2024-08-22
