Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases
arXiv - CS - General Literature. Pub Date: 2020-09-24. DOI: arxiv-2009.11564. Authors: Gerhard Weikum, Luna Dong, Simon Razniewski, Fabian Suchanek
Equipping machines with comprehensive knowledge of the world's entities and
their relationships has been a long-standing goal of AI. Over the last decade,
large-scale knowledge bases, also known as knowledge graphs, have been
automatically constructed from web contents and text sources, and have become a
key asset for search engines. This machine knowledge can be harnessed to
semantically interpret textual phrases in news, social media and web tables,
and contributes to question answering, natural language processing and data
analytics. This article surveys fundamental concepts and practical methods for
creating and curating large knowledge bases. It covers models and methods for
discovering and canonicalizing entities and their semantic types and organizing
them into clean taxonomies. On top of this, the article discusses the automatic
extraction of entity-centric properties. To support the long-term life-cycle
and the quality assurance of machine knowledge, the article presents methods
for constructing open schemas and for knowledge curation. Case studies on
academic projects and industrial knowledge graphs complement the survey of
concepts and methods.
Chinese translation:
机器知识:综合知识库的创建和管理
为机器配备对世界实体及其关系的全面知识是人工智能的长期目标。在过去的十年中,大规模知识库,也称为知识图谱,已经从网络内容和文本源自动构建,并已成为搜索引擎的关键资产。这种机器知识可用于从语义上解释新闻、社交媒体和网络表格中的文本短语,并有助于问答、自然语言处理和数据分析。本文调查了创建和管理大型知识库的基本概念和实用方法。它涵盖了用于发现和规范实体及其语义类型并将它们组织成干净的分类法的模型和方法。在此之上,文章讨论了以实体为中心的属性的自动提取。为了支持机器知识的长期生命周期和质量保证,本文提出了构建开放模式和知识管理的方法。学术项目和工业知识图的案例研究补充了概念和方法的调查。
Updated: 2020-09-25