Knowledge base graph embedding module design for Visual question answering model
Pattern Recognition (IF 8) Pub Date: 2021-07-13, DOI: 10.1016/j.patcog.2021.108153
Wenfeng Zheng 1, Lirong Yin 2, Xiaobing Chen 1, Zhiyang Ma 1, Shan Liu 1, Bo Yang 1

This paper constructs a knowledge base graph embedding module to extend the versatility of knowledge-based VQA (Visual Question Answering) models. The module extracts core entities from images and text, maps them to knowledge base entities, extracts the sub-graphs closely related to those core entities, and converts the sub-graphs into low-dimensional vectors to realize sub-graph embedding. To achieve good sub-graph embedding, we first extract two semantically rich experimental knowledge bases from DBpedia: DBV and DBA. On these two knowledge bases, we evaluate several established knowledge base embedding models on link prediction: SE (structured embedding), SME (semantic matching energy function), and TransE. The results show that there is a clear correspondence between the entities of DBV, which allows excellent node embedding, and that the TransE model achieves a good knowledge base embedding, so we build the knowledge base graph embedding module on TransE. We then construct a VQA model (KBSN) based on the knowledge base graph embedding. Experimental results on the VQA2.0 and KB-VQA data sets show that the knowledge base graph embedding module improves accuracy.
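For readers unfamiliar with TransE, the link prediction task mentioned above can be illustrated with a minimal sketch. It relies only on the standard TransE idea that a plausible triple (head, relation, tail) satisfies head + relation ≈ tail, so candidate tails are ranked by the distance ||h + r - t||. The function names, the toy 2-dimensional embeddings, and the choice of the L1 norm are assumptions made for illustration; they are not taken from the paper, and the embeddings here are hand-picked rather than trained.

```python
import numpy as np

def transe_score(h, r, t, norm=1):
    """TransE dissimilarity ||h + r - t||; lower means a more plausible triple."""
    return np.linalg.norm(h + r - t, ord=norm, axis=-1)

def rank_tails(h, r, entity_emb, norm=1):
    """Rank every entity as a candidate tail for (h, r, ?), most plausible first."""
    scores = transe_score(h, r, entity_emb, norm)  # broadcasts over all entities
    return np.argsort(scores)

# Hypothetical toy embeddings (not trained, not from the paper).
entity_emb = np.array([
    [0.0, 0.0],   # entity 0
    [1.0, 0.0],   # entity 1
    [0.0, 1.0],   # entity 2
    [3.0, 3.0],   # entity 3
])
relation_emb = np.array([
    [1.0, 0.0],   # relation 0: translates entity 0 onto entity 1
])

ranking = rank_tails(entity_emb[0], relation_emb[0], entity_emb)
print(ranking)
```

In a real setting the embeddings would be learned with a margin-based ranking loss over observed and corrupted triples; the sketch covers only the scoring and ranking step used at link prediction time.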




Updated: 2021-07-18