Multi-instance learning of pretopological spaces to model complex propagation phenomena: Application to lexical taxonomy learning
Artificial Intelligence (IF 14.4), Pub Date: 2021-07-16, DOI: 10.1016/j.artint.2021.103556
G. Caillaut 1, 2 , G. Cleuziou 1, 3

This paper addresses the problem of learning the concept of propagation within the theoretical formalism of pretopology, and then applies this methodology to the well-known problem of Lexical Taxonomy learning. Pretopology aims, among other things, at modeling complex relations between sets of entities. Such fine-grained modeling limits scalability. However, it captures real-world relationships such as hypernymy more accurately, because relation extraction is modeled as a propagation process under certain structuring constraints. Traditional approaches, by contrast, are limited to detecting relations between pairs of elements and do not use knowledge about the expected structure.

We propose to define the pseudo-closure operator, which models the concept of propagation, as a logical combination of heterogeneous neighborhoods, or sources. This makes it possible to learn models that exploit, for example, the knowledge acquired by both statistical and numerical approaches. We show that learning such an operator falls within the Multiple Instance (MI) framework, in which learning is performed on bags of instances rather than on individual instances. Although this framework is well suited to the task, using it to learn a pretopological space produces a set of bags of exponential size. To overcome this problem, we propose a learning method (LPSMI) based on a low estimate of the bags covered by a concept under construction.
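
To make the idea of a pseudo-closure built from several neighborhoods concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the operator or the learning procedure from the paper: the logical combination is assumed to be a disjunction of conjunctions (DNF), a source is assumed to "fire" for an element x when its neighborhood of x intersects the current set, and the names `pseudo_closure`, `closure`, the toy sources and the toy vocabulary are all hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set

Element = Hashable
# One neighborhood family per source: maps x to V_i(x), a set of elements.
Neighborhood = Dict[Element, Set[Element]]


def pseudo_closure(
    subset: Iterable[Element],
    universe: Iterable[Element],
    neighborhoods: List[Neighborhood],
    dnf: List[List[int]],
) -> Set[Element]:
    """One propagation step (assumed form).

    x joins the set when at least one clause of `dnf` is satisfied, a clause
    being satisfied when every source index i in it has V_i(x) meeting `subset`.
    """
    current = set(subset)
    expanded = set(current)  # the step is extensive: the input set is kept
    for x in universe:
        if x in current:
            continue
        for clause in dnf:
            if all(neighborhoods[i].get(x, set()) & current for i in clause):
                expanded.add(x)
                break
    return expanded


def closure(subset, universe, neighborhoods, dnf) -> Set[Element]:
    """Iterate the pseudo-closure until a fixed point is reached."""
    current = set(subset)
    while True:
        nxt = pseudo_closure(current, universe, neighborhoods, dnf)
        if nxt == current:
            return current
        current = nxt


# Hypothetical toy data: two sources over four terms.
universe = ["animal", "dog", "poodle", "cat"]
source_a = {"dog": {"animal"}, "poodle": {"dog"}, "cat": {"animal"}}
source_b = {"dog": {"animal"}, "poodle": {"dog"}, "cat": set()}
sources = [source_a, source_b]

both_agree = [[0, 1]]     # source_a AND source_b
either_one = [[0], [1]]   # source_a OR source_b

strict = closure({"animal"}, universe, sources, both_agree)
loose = closure({"animal"}, universe, sources, either_one)
```

In this toy setting, requiring both sources to agree leaves "cat" outside the propagation started from "animal", while accepting either source reaches every term. Choosing which clauses to keep is the part that a learning method would have to decide.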

We first validate our method experimentally, through the simulation of percolation processes (typically forest fires) learned with pretopological propagation models. The results show that the proposed MI approach is particularly efficient on the propagation model recognition task. We then make a real-world contribution to the Lexical Taxonomy learning task by modeling it as a complex (semantic) propagation problem. We propose a very generic framework for training models that combine various existing methods for learning Lexical Taxonomies (statistical, pattern-based and embedding-based).
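
For readers unfamiliar with percolation-style simulations, the following Python sketch shows one simple way a forest fire can be simulated on a grid. It is not the experimental protocol of the paper: the 4-neighbor grid, the single spreading probability `p`, and the function name `simulate_fire` are assumptions chosen for illustration.

```python
import random
from typing import Set, Tuple

Cell = Tuple[int, int]


def simulate_fire(rows: int, cols: int, ignition: Cell, p: float, seed: int = 0) -> Set[Cell]:
    """Spread a fire from `ignition` over a rows x cols grid.

    Each burning cell tries once to ignite each unburned 4-neighbor,
    succeeding with probability p. Returns the set of burned cells.
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    burned = {ignition}
    frontier = [ignition]
    while frontier:
        next_frontier = []
        for r, c in frontier:
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                cell = (r + dr, c + dc)
                inside = 0 <= cell[0] < rows and 0 <= cell[1] < cols
                if inside and cell not in burned and rng.random() < p:
                    burned.add(cell)
                    next_frontier.append(cell)
        frontier = next_frontier
    return burned


everything = simulate_fire(5, 5, (2, 2), p=1.0)  # fire always spreads
nothing = simulate_fire(5, 5, (2, 2), p=0.0)     # fire never spreads
```

The set of burned cells plays the role of the closure of the ignition point under the simulated propagation; a propagation model learned from such runs would be judged on how well it reproduces these sets.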




Updated: 2021-08-20