A protocol for adding knowledge to Wikidata, a case report
bioRxiv - Bioinformatics. Pub Date: 2020-06-04, DOI: 10.1101/2020.04.05.026336
Andra Waagmeester, Egon L. Willighagen, Andrew I. Su, Martina Kutmon, Jose Emilio Labra Gayo, Daniel Fernández-Álvarez, Quentin Groom, Peter J. Schaap, Lisa M. Verhagen, Jasper J. Koehorst

Pandemics, even more than other medical problems, require swift integration of knowledge. When a pandemic is caused by a new virus, understanding the underlying biology may help in finding solutions. In a setting with a large number of loosely related projects and initiatives, we need common ground, also known as a "commons". Wikidata, a public knowledge graph aligned with Wikipedia, is such a commons and uses unique identifiers to link knowledge in other knowledge bases. However, Wikidata may not always have the right schema for the urgent questions. In this paper, we address this problem by showing how a data schema required for the integration can be modelled with entity schemas represented by Shape Expressions. As a telling example, we describe the process of aligning resources on the genomes and proteomes of the SARS-CoV-2 virus and related viruses, as well as how Shape Expressions can be defined for Wikidata to model the knowledge, helping others studying the SARS-CoV-2 pandemic. We demonstrate how this model can be used to make data from various resources interoperable by integrating data from NCBI Taxonomy, NCBI Genes, UniProt, and WikiPathways. Based on that model, a set of automated applications, or bots, was written to regularly update these sources in Wikidata and added to a platform that runs these updates automatically. Although this workflow was developed and applied in the context of the COVID-19 pandemic, to demonstrate its broader applicability it was also applied to other human coronaviruses (MERS, SARS, Human coronavirus NL63, Human coronavirus 229E, Human coronavirus HKU1, Human coronavirus OC43).
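
To make the idea of checking data against an entity schema more concrete, the following is a minimal sketch in plain Python. It is not Shape Expressions syntax and not the authors' schema or bot code; it only mimics the core idea that a shape lists the properties an item is expected to carry, with cardinalities. The names `Constraint`, `VIRUS_TAXON_SHAPE`, and `conformance_errors` are hypothetical, the choice of required properties is an assumption, and the example item values are illustrative rather than retrieved from Wikidata.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Constraint:
    """One expected property with a cardinality range (hypothetical helper)."""
    prop: str                      # Wikidata property ID, e.g. "P685"
    min_count: int = 1             # minimum number of values required
    max_count: Optional[int] = 1   # None means no upper bound


# Assumed shape for a virus taxon item; the paper's actual schemas may differ.
VIRUS_TAXON_SHAPE: List[Constraint] = [
    Constraint("P31"),             # instance of
    Constraint("P225"),            # taxon name
    Constraint("P685"),            # NCBI taxonomy ID
    Constraint("P171", 0, None),   # parent taxon: optional, repeatable
]


def conformance_errors(statements: Dict[str, List[str]],
                       shape: List[Constraint]) -> List[str]:
    """Return one message per violated constraint; an empty list means the item conforms."""
    errors = []
    for c in shape:
        n = len(statements.get(c.prop, []))
        if n < c.min_count:
            errors.append(f"{c.prop}: expected at least {c.min_count}, found {n}")
        elif c.max_count is not None and n > c.max_count:
            errors.append(f"{c.prop}: expected at most {c.max_count}, found {n}")
    return errors


# Illustrative statements keyed by property ID (values are examples only).
complete_item = {
    "P31": ["Q16521"],
    "P225": ["Severe acute respiratory syndrome coronavirus 2"],
    "P685": ["2697049"],
}
incomplete_item = {"P31": ["Q16521"]}

for label, item in [("complete", complete_item), ("incomplete", incomplete_item)]:
    print(label, conformance_errors(item, VIRUS_TAXON_SHAPE))
```

In a setting like the one described, a bot could run a check of this kind before writing to the knowledge graph, so that items missing an expected identifier are flagged rather than silently added. A real deployment would validate against a ShEx schema with a ShEx validator rather than this simplified stand-in.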

Updated: 2020-06-04