Applying graph database technology for analyzing perturbed co-expression networks in cancer
Database: The Journal of Biological Databases and Curation (IF 5.8), Pub Date: 2020-12-11, DOI: 10.1093/database/baaa110
Claire M Simpson, Florian Gnad

Graph representations provide an elegant way to capture and analyze complex molecular mechanisms in the cell. Co-expression networks are undirected graph representations of transcriptional co-behavior, indicating (co-)regulation, functional modules or even physical interactions between the corresponding gene products. The rapidly growing volume of available RNA sequencing (RNAseq) data fuels the construction of such networks, which, like most other biological data, are usually stored in relational databases. Inferring linkage through recursive multi-join statements, however, is computationally expensive and complex to design in relational databases. In contrast, graph databases store and represent complex interconnected data as nodes, edges and properties, making it fast and intuitive to query and analyze relationships. Although graph database technologies are moving from a fringe domain into the mainstream, only a few studies have reported their application to biological data. We used the graph database management system Neo4j to store and analyze co-expression networks derived from RNAseq data from The Cancer Genome Atlas. Comparing co-expression in tumors versus healthy tissues in six cancer types revealed significant perturbation tracing back to erroneous or rewired gene regulation. Applying centrality, community detection and pathfinding graph algorithms uncovered the destruction or creation of central nodes, modules and relationships in the co-expression networks of tumors. Given the speed, accuracy and straightforwardness of managing these densely connected networks, we conclude that graph databases are ready to enter the arena of biological data.
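
To make the ideas in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' Neo4j workflow. It assumes a Pearson correlation cutoff of 0.8 for drawing edges, uses made-up expression values for four hypothetical genes (A to D), and substitutes simple in-memory dictionaries for a graph database. It illustrates three things: building an undirected co-expression graph, following relationships over several hops by traversal rather than by repeated joins, and comparing degree centrality between a "normal" and a "tumor" network.

```python
from collections import deque
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpression_graph(expr, threshold=0.8):
    """Undirected graph linking genes whose |r| meets the (assumed) cutoff."""
    graph = {gene: set() for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            graph[g1].add(g2)
            graph[g2].add(g1)
    return graph


def within_hops(graph, start, max_hops):
    """Breadth-first traversal: genes reachable in at most max_hops edges,
    mapped to their hop distance from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return {g: d for g, d in dist.items() if g != start}


def degree_centrality(graph):
    """Fraction of the other genes each gene is directly connected to."""
    n = len(graph)
    return {g: (len(nbs) / (n - 1) if n > 1 else 0.0) for g, nbs in graph.items()}


def degree_shift(normal, tumor):
    """Tumor minus normal degree centrality for every gene in either graph."""
    dn, dt = degree_centrality(normal), degree_centrality(tumor)
    return {g: dt.get(g, 0.0) - dn.get(g, 0.0) for g in set(dn) | set(dt)}


# Hypothetical expression values (5 samples per gene), for illustration only.
NORMAL = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10],
    "C": [5, 4, 3, 2, 1],
    "D": [1, 3, 2, 3, 1],
}
TUMOR = {
    "A": [1, 2, 3, 4, 5],
    "B": [1, 3, 2, 3, 1],
    "C": [5, 4, 3, 2, 1],
    "D": [2, 4, 6, 8, 10],
}

if __name__ == "__main__":
    normal_graph = coexpression_graph(NORMAL)
    tumor_graph = coexpression_graph(TUMOR)
    for gene, shift in sorted(degree_shift(normal_graph, tumor_graph).items()):
        print(f"{gene}: change in degree centrality = {shift:+.2f}")
```

In this toy data, gene B loses all of its co-expression partners in the tumor network while gene D gains two, which is the kind of destroyed or newly created central node the abstract refers to. A graph database would express the same multi-hop traversal declaratively and run it against stored relationships instead of recomputing it in memory.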

Updated: 2020-12-11