An unsupervised approach to disjointness learning based on terminological cluster trees
Semantic Web (IF 3.0), Pub Date: 2020-09-22, DOI: 10.3233/sw-200391
Giuseppe Rizzo, Claudia d’Amato, Nicola Fanizzi

In the context of the Semantic Web regarded as a Web of Data, research efforts have been devoted to improving the quality of the ontologies that are used as vocabularies to enable complex services based on automated reasoning. Various surveys indicate that many domains require better ontologies that include non-negligible constraints for properly conveying the intended semantics. In this respect, disjointness axioms are representative of the general problem: these axioms are essential for making the negative knowledge about the domain of interest explicit, yet they are often overlooked during the modeling process (thus affecting the efficacy of the reasoning services). To tackle this problem, automated methods for discovering these axioms can be used as a tool for supporting knowledge engineers in modeling new ontologies or evolving existing ones. The current solutions, either based on statistical correlations or relying on external corpora, often do not fully exploit the terminology. Stemming from this consideration, we have been investigating alternative methods to elicit disjointness axioms from existing ontologies based on the induction of terminological cluster trees, which are logic trees in which each node stands for a cluster of individuals that emerges as a sub-concept. The growth of such trees relies on a divide-and-conquer procedure that assigns, to the cluster representing the root node, one of the concept descriptions generated via a refinement operator and selected according to a heuristic based on the minimization of the risk of overlap between the candidate sub-clusters (quantified in terms of the distance between two prototypical individuals). Preliminary work has shown some shortcomings that are tackled in this paper.
To tackle the task of discovering disjointness axioms, we have extended the terminological cluster tree induction framework with various contributions: 1) the adoption of different distance measures for clustering the individuals of a knowledge base; 2) the adoption of different heuristics for selecting the most promising concept descriptions; 3) a modified version of the refinement operator to prevent the introduction of inconsistency during the elicitation of the new axioms. A wide empirical evaluation showed the feasibility of the proposed extensions and an improvement with respect to alternative approaches.
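
The divide-and-conquer growth described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation, and it rests on several simplifying assumptions: individuals are represented as plain sets of atomic concept names (no reasoner is involved), the refinement operator is replaced by a fixed pool of atomic candidate concepts, the distance is a Jaccard distance, and the prototypical individual of a cluster is taken to be its medoid. All names (`kb`, `grow`, `medoid`, `Node`) are hypothetical. At each node the sketch picks the candidate whose two sub-clusters have the most distant medoids, which stands in for minimizing the risk of overlap.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional

# Toy knowledge base (an assumption of this sketch): each individual is
# mapped to the set of atomic concepts it is known to belong to.
KB = Dict[str, FrozenSet[str]]


def jaccard_distance(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Distance in [0, 1] between two individuals' concept sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def medoid(names: List[str], kb: KB) -> str:
    """Individual with the smallest total distance to the others.

    Names are sorted first so that ties are broken deterministically.
    """
    return min(
        sorted(names),
        key=lambda n: sum(jaccard_distance(kb[n], kb[m]) for m in names),
    )


@dataclass
class Node:
    concept: Optional[str]          # concept installed here; None for a leaf
    members: List[str]              # individuals in this node's cluster
    left: Optional["Node"] = None   # individuals that belong to `concept`
    right: Optional["Node"] = None  # individuals that do not


def grow(names: List[str], kb: KB, candidates: FrozenSet[str],
         min_size: int = 2) -> Node:
    """Recursively split a cluster on the best-separating candidate."""
    best = None
    if len(names) >= min_size:
        for c in sorted(candidates):
            pos = [n for n in names if c in kb[n]]
            neg = [n for n in names if c not in kb[n]]
            if not pos or not neg:
                continue  # the candidate does not split this cluster
            # Larger medoid distance is read as lower risk of overlap.
            score = jaccard_distance(kb[medoid(pos, kb)], kb[medoid(neg, kb)])
            if best is None or score > best[0]:
                best = (score, c, pos, neg)
    if best is None:
        return Node(None, names)
    _, concept, pos, neg = best
    rest = candidates - {concept}
    return Node(concept, names,
                grow(pos, kb, rest, min_size),
                grow(neg, kb, rest, min_size))


kb: KB = {
    "rex": frozenset({"Animal", "Dog"}),
    "fido": frozenset({"Animal", "Dog"}),
    "tom": frozenset({"Animal", "Cat"}),
    "felix": frozenset({"Animal", "Cat"}),
    "oak": frozenset({"Plant", "Tree"}),
    "rose": frozenset({"Plant", "Flower"}),
}
candidates = frozenset({"Animal", "Plant", "Dog", "Cat", "Tree", "Flower"})
root = grow(list(kb), kb, candidates)
```

On this toy data the root splits the animals from the plants, and each branch is then split again. The sketch stops at tree growth: deriving disjointness axioms from the resulting tree, and keeping the new axioms consistent with the ontology, are the parts addressed by the paper and are not modeled here.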

Updated: 2020-09-23