VoCSK: Verb-oriented commonsense knowledge mining with taxonomy-guided induction
Artificial Intelligence (IF 14.4), Pub Date: 2022-06-08, DOI: 10.1016/j.artint.2022.103744
Jingping Liu, Tao Chen, Chao Wang, Jiaqing Liang, Lihan Chen, Yanghua Xiao, Yunwen Chen, Ke Jin

Commonsense knowledge acquisition is one of the fundamental issues in realizing human-level AI. However, commonsense knowledge is difficult to obtain because it is a matter of human consensus and rarely appears explicitly in texts or other data. In this paper, we focus on the automatic acquisition of a typical kind of implicit verb-oriented commonsense knowledge (e.g., “person eats food”), which is the concept-level knowledge of verb phrases. For this purpose, we propose a taxonomy-guided induction method that mines verb-oriented commonsense knowledge from verb phrases with the help of a probabilistic taxonomy. First, we design an entropy-based triplet filter to cope with noisy verb phrases. Then, we propose a joint model, based on the minimum description length principle and a neural language model, to generate verb-oriented commonsense knowledge. In addition, we introduce two strategies to accelerate the computation: a simulated annealing-based approximate solution and a verb phrase clustering method. Finally, we conduct extensive experiments showing that our solution is more effective than competitors in mining verb-oriented commonsense knowledge. We construct a commonsense knowledge base called VoCSK, containing 259 verbs and 18,406 items of verb-oriented commonsense knowledge. To verify the usefulness of VoCSK, we use the knowledge in this KB to improve model performance on two downstream applications.
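
To make the idea of taxonomy-guided induction under the minimum description length principle more concrete, the following is a minimal toy sketch, not the formulation used in the paper. It assumes a tiny hand-written probabilistic taxonomy (`TAXONOMY`, with invented typicality values), a hypothetical cost function that charges a fixed number of bits per chosen concept plus the bits needed to encode each observed object under its best-fitting concept, and an exhaustive search in place of the simulated annealing mentioned in the abstract. All names and numbers are illustrative assumptions.

```python
import math
from itertools import combinations

# Hypothetical probabilistic taxonomy: P(entity | concept).
# The values are invented for illustration only.
TAXONOMY = {
    "fruit": {"apple": 0.4, "banana": 0.4, "orange": 0.2},
    "food": {"apple": 0.25, "banana": 0.25, "bread": 0.25, "pizza": 0.25},
    "thing": {name: 0.1 for name in [
        "apple", "banana", "bread", "pizza", "rock",
        "chair", "car", "tree", "river", "cloud",
    ]},
}
EPS = 1e-6  # assumed probability for an entity a concept does not cover


def description_length(concepts, entities):
    """Toy two-part code length in bits: model cost plus data cost."""
    # Model cost: bits to name each selected concept.
    model_cost = len(concepts) * math.log2(len(TAXONOMY))
    # Data cost: bits to encode each entity under its best concept.
    data_cost = 0.0
    for entity in entities:
        p = max(TAXONOMY[c].get(entity, EPS) for c in concepts)
        data_cost += -math.log2(p)
    return model_cost + data_cost


def best_concepts(entities):
    """Exhaustively pick the concept set with the shortest description."""
    candidates = [
        frozenset(subset)
        for size in range(1, len(TAXONOMY) + 1)
        for subset in combinations(TAXONOMY, size)
    ]
    return min(candidates, key=lambda cs: description_length(cs, entities))


# Objects observed in verb phrases such as "eat apple", "eat pizza".
objects_of_eat = ["apple", "banana", "bread", "pizza"]
best = best_concepts(objects_of_eat)
print(sorted(best), round(description_length(best, objects_of_eat), 3))
```

In this toy setting the single concept "food" gives the shortest description of the four objects, which mirrors the abstract's example "person eats food": a concept that is too narrow ("fruit") leaves objects poorly explained, while one that is too broad ("thing") spends more bits on every object. Exhaustive search is only feasible here because the taxonomy has three concepts; for a realistic taxonomy an approximate search would be needed.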




Updated: 2022-06-08