Multilingual Verbalization and Summarization for Explainable Link Discovery
Data & Knowledge Engineering (IF 2.5), Pub Date: 2021-02-26, DOI: 10.1016/j.datak.2021.101874
Abdullah Fathi Ahmed, Mohamed Ahmed Sherif, Diego Moussallem, Axel-Cyrille Ngonga Ngomo

The number and size of datasets abiding by the Linked Data paradigm increase every day. Discovering links between these datasets is thus central to achieving the vision behind the Data Web. Declarative Link Discovery (LD) frameworks rely on complex Link Specifications (LS) to express the conditions under which two resources should be linked. Understanding such LS is not a trivial task for non-expert users, particularly when such users are interested in generating LS to match their needs. Even if the user applies a machine learning algorithm to generate the required LS automatically, the challenge of explaining the resulting LS persists. Hence, providing explainable LS is the key challenge in enabling users who are unfamiliar with the underlying LS technologies to use them effectively and efficiently. In this paper, we extend our previous work (Ahmed et al., 2019) by proposing a generic multilingual approach that allows the verbalization of LS in many languages, i.e., converts LS into understandable natural language text. In this work, we ported our LS verbalization framework to German and Spanish, in addition to English. Our adequacy and fluency evaluations show that our approach can generate complete natural language descriptions that are easily understandable even by lay users. Moreover, we devised an experimental neural approach for improving the quality of our generated texts. Our neural approach achieves promising results in terms of BLEU, METEOR and chrF++.




Updated: 2021-03-18