A dictionary for translation from natural to formal data model language
Computational Intelligence (IF 2.8), Pub Date: 2020-09-07, DOI: 10.1111/coin.12393
Sabrina Šuman, Alen Jakupović, Mladen Marinac

This paper describes our current research and results on developing knowledge‐based systems that support the creation of entity‐relationship (ER) models. We obtain an ER model in textual form by treating the task as translation from one language into another, that is, from an English controlled natural language into the formalized language of an ER data model. Our translation method consists of creating rules that translate parts of sentential forms into ER model constructs, based on the textual and character patterns detected in business descriptions. To enable the computer analyses needed to create the translation mechanisms, we built a linguistic corpus containing lists of business descriptions and the texts of other business materials. We first enriched the corpus by annotating the words related to ER data model constructs, and then derived from it a specific dictionary and linguistic rules to automate the translation of business descriptions into the ER data model language. We also present the main issues uncovered during the translation process and offer a possible solution, whose utility we evaluate by applying information‐extraction performance measures to a set of sentences from the corpus.
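
The abstract does not list the actual translation rules or the measures used, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes a single hypothetical sentence pattern ("Each X verb one or more Ys."), a hypothetical cardinality notation, and the standard information-extraction measures of precision, recall, and F1. The names `RULE`, `CARDINALITY`, `translate`, and `score` are illustrative and do not come from the paper.

```python
import re

# Hypothetical rule: one controlled-English sentence pattern mapped to ER constructs.
RULE = re.compile(
    r"^(?:Each|Every)\s+(?P<left>\w+)\s+(?P<verb>\w+)\s+"
    r"(?P<card>one or more|exactly one|many)\s+(?P<right>\w+?)s?\.$",
    re.IGNORECASE,
)

# Hypothetical mapping from cardinality phrases to a textual ER notation.
CARDINALITY = {"one or more": "1..*", "exactly one": "1..1", "many": "0..*"}


def translate(sentence):
    """Return the set of ER constructs matched in one sentence (empty if no rule fires)."""
    match = RULE.match(sentence.strip())
    if match is None:
        return set()
    left = match["left"].capitalize()
    right = match["right"].capitalize()  # naive singular form: a trailing "s" is dropped
    card = CARDINALITY[match["card"].lower()]
    return {
        ("entity", left),
        ("entity", right),
        ("relationship", left, match["verb"].lower(), right, card),
    }


def score(extracted, gold):
    """Precision, recall, and F1 of extracted constructs against a hand-annotated set."""
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


constructs = translate("Each customer places one or more orders.")
# Hypothetical gold annotation: the three constructs above plus one attribute the rule misses.
gold = constructs | {("attribute", "Order", "date")}
precision, recall, f1 = score(constructs, gold)
```

In this sketch each construct is a plain tuple, so comparing the system output with the annotated sentences reduces to set intersection. A real dictionary would hold many such patterns and would need proper handling of plurals and multi-word entity names.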

Updated: 2020-09-07