Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases,arXiv - CS - Databases

arXiv - CS - Databases, Pub Date: 2020-09-24, DOI: arxiv-2009.11564
Gerhard Weikum, Luna Dong, Simon Razniewski, Fabian Suchanek

Equipping machines with comprehensive knowledge of the world's entities and their relationships has been a long-standing goal of AI. Over the last decade, large-scale knowledge bases, also known as knowledge graphs, have been automatically constructed from web contents and text sources, and have become a key asset for search engines. This machine knowledge can be harnessed to semantically interpret textual phrases in news, social media and web tables, and contributes to question answering, natural language processing and data analytics. This article surveys fundamental concepts and practical methods for creating and curating large knowledge bases. It covers models and methods for discovering and canonicalizing entities and their semantic types and organizing them into clean taxonomies. On top of this, the article discusses the automatic extraction of entity-centric properties. To support the long-term life-cycle and the quality assurance of machine knowledge, the article presents methods for constructing open schemas and for knowledge curation. Case studies on academic projects and industrial knowledge graphs complement the survey of concepts and methods.

Updated: 2020-09-25
