Enhancing the interoperability of glycan data flow between ChEBI, PubChem and GlyGen
Glycobiology (IF 4.3), Pub Date: 2021-07-26, DOI: 10.1093/glycob/cwab078
Rahi Navelkar 1 , Gareth Owen 2 , Venkatesh Mutherkrishnan 3 , Paul Thiessen 3 , Tiejun Cheng 3 , Evan Bolton 3 , Nathan Edwards 4 , Michael Tiemeyer 5 , Matthew P Campbell 6 , Maria Martin 7 , Jeet Vora 1 , Robel Kahsay 1 , Raja Mazumder 1

Abstract
Glycans play a vital role in health, disease, bioenergy, biomaterials and bio-therapeutics. As a result, there is keen interest in identifying and increasing glycan data in bioinformatics databases like ChEBI and PubChem, and in connecting them to resources at the EMBL-EBI and NCBI to facilitate access to important annotations at a global level. GlyTouCan is a comprehensive archival database that contains glycans obtained primarily through batch upload from glycan repositories, glycoprotein databases and individual laboratories. In many instances, the glycan structures deposited in GlyTouCan may not be fully defined or may lack supporting experimental evidence and citations. Databases like ChEBI and PubChem were designed to accommodate complete atomistic structures with well-defined chemical linkages. As a result, they cannot easily accommodate the structural ambiguity inherent in glycan databases. Consequently, there is a need to organize glycan data more coherently to enhance connectivity across the major NCBI, EMBL-EBI and glycoscience databases. This paper outlines a workflow developed in collaboration between GlyGen, ChEBI and PubChem to improve the visibility and connectivity of glycan data across these resources. GlyGen hosts a subset of glycans (~29,000) from the GlyTouCan database, has submitted valuable glycan annotations to the PubChem database, and has integrated over 10,500 glycans (including ambiguously defined ones) into the ChEBI database. The integrated glycans were prioritized based on links to PubChem and connectivity to glycoprotein data. The pipeline provides a blueprint for how glycan data can be harmonized between different resources. The current PubChem, ChEBI and GlyTouCan mappings can be downloaded from GlyGen (https://data.glygen.org).


Updated: 2021-07-26