Deep ranking based cost-sensitive multi-label learning for distant supervision relation extraction
Information Processing & Management (IF 7.4), Pub Date: 2019-08-26, DOI: 10.1016/j.ipm.2019.102096
Hai Ye, Zhunchen Luo

Knowledge bases offer a potential way to improve the intelligence of information retrieval (IR) systems, because a knowledge base contains numerous relations between entities that can help an IR system infer from one entity to another. Relation extraction is one of the fundamental techniques for constructing a knowledge base. Distant supervision is a semi-supervised learning method for relation extraction that learns from both labeled and unlabeled data. However, this approach suffers from the problem of relation overlapping, in which one entity tuple may have multiple relation facts. We believe that relation types can have latent connections, which we call class ties, and that these can be exploited to enhance relation extraction. However, this property of relation classes has not been fully explored. In this paper, to exploit class ties between relations, we propose a general ranking-based multi-label learning framework combined with convolutional neural networks, in which ranking-based loss functions with a regularization technique are introduced to learn the latent connections between relations. Furthermore, to deal with class imbalance in distant supervision relation extraction, we adopt cost-sensitive learning to rescale the costs from the positive and negative labels. Extensive experiments on a widely used dataset show that our model is effective at exploiting class ties and relieving the class imbalance problem.
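The abstract does not give the form of the loss, so the following is only a minimal sketch of what a cost-weighted pairwise ranking loss for multi-label classification can look like. It is not the authors' formulation: the function name `cost_weighted_ranking_loss`, the `margin` parameter, and the per-class `label_costs` weighting are all assumptions made for illustration.

```python
from typing import Sequence


def cost_weighted_ranking_loss(
    scores: Sequence[float],
    positives: Sequence[int],
    label_costs: Sequence[float],
    margin: float = 1.0,
) -> float:
    """Hypothetical margin-based ranking loss for one entity pair.

    scores      -- one model score per relation class
    positives   -- indices of the relation classes that hold for this pair
    label_costs -- assumed per-class cost; a larger value makes violations
                   involving that positive class more expensive
    margin      -- required gap between a positive and a negative score
    """
    positive_set = set(positives)
    negatives = [c for c in range(len(scores)) if c not in positive_set]

    loss = 0.0
    for p in positive_set:
        for n in negatives:
            # Hinge on the score gap: zero once the positive class
            # outranks the negative class by at least `margin`.
            violation = max(0.0, margin - (scores[p] - scores[n]))
            # Cost-sensitive rescaling (illustrative choice): weight the
            # violation by the cost assigned to the positive class.
            loss += label_costs[p] * violation
    return loss
```

In this sketch, raising the cost of a rare relation class increases the penalty whenever that class is ranked too close to, or below, a class that does not hold, which is one simple way to counteract class imbalance.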




Updated: 2020-04-21