Pathway information extracted from 25 years of pathway figures
Genome Biology (IF 10.1), Pub Date: 2020-11-09, DOI: 10.1186/s13059-020-02181-2
Kristina Hanspers 1 , Anders Riutta 1 , Martina Summer-Kutmon 2, 3 , Alexander R Pico 1

Thousands of pathway diagrams are published each year as static figures inaccessible to computational queries and analyses. Using a combination of machine learning, optical character recognition, and manual curation, we identified 64,643 pathway figures published between 1995 and 2019 and extracted 1,112,551 instances of human genes, comprising 13,464 unique NCBI genes, participating in a wide variety of biological processes. This collection represents an order of magnitude more genes than found in the text of the same papers, and thousands of genes missing from other pathway databases, thus presenting new opportunities for discovery and research.

Last updated: 2020-11-09