A graph convolutional neural network for gene expression data analysis with multiple gene networks
Statistics in Medicine (IF 1.8), Pub Date: 2021-07-14, DOI: 10.1002/sim.9140
Hu Yang, Zhong Zhuang, Wei Pan

Spectral graph convolutional neural networks (GCNs) have been proposed to incorporate important information contained in graphs such as gene networks. In a standard spectral GCN, there is only one gene network to describe the relationships among genes. However, for genomic applications, due to condition- or tissue-specific gene function and regulation, multiple gene networks may be available; it is unclear how to apply GCNs to disease classification with multiple networks. Moreover, which gene networks provide more effective prior information for a given learning task is unknown a priori and is not straightforward to discover in many cases. A deep multiple graph convolutional neural network is therefore developed here to meet the challenge. The new approach not only computes a feature of a gene as the weighted average of those of itself and its neighbors through spectral GCNs, but also extracts features from gene-specific expression (or other feature) profiles via a feed-forward neural network (FNN). We also provide two measures, the importance of a given gene and the relative importance score of each gene network, for the genes' and gene networks' contributions, respectively, to the learning task. To evaluate the new method, we conduct real data analyses using several breast cancer and diffuse large B-cell lymphoma datasets, incorporating multiple gene networks obtained from “GIANT 2.0”. Compared with the standard FNN, GCN, and random forest, the new method not only yields high classification accuracy but also prioritizes the most important genes confirmed to be highly associated with cancer, strongly suggesting the usefulness of the new method in incorporating multiple gene networks.
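
The "weighted average of itself and its neighbors" idea, applied over several gene networks, can be illustrated with a minimal sketch. This is not the authors' implementation: the row normalization of the adjacency matrix with self-loops, the softmax weighting of networks, and all function names (`row_normalize_with_self_loops`, `propagate`, `multi_graph_features`) are assumptions chosen for illustration, and the paper's actual layers and importance scores may be defined differently.

```python
import math


def row_normalize_with_self_loops(adj):
    """Return D^{-1}(A + I): each row sums to 1, so propagation is a weighted average.

    Assumption: row normalization is used here for clarity; a spectral GCN
    may use a different (e.g. symmetric) normalization.
    """
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in a_hat]


def propagate(norm_adj, x):
    """One propagation step: each gene's value becomes the weighted
    average of its own value and its neighbors' values."""
    n = len(x)
    return [sum(norm_adj[i][j] * x[j] for j in range(n)) for i in range(n)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_graph_features(adjs, x, network_scores):
    """Combine per-network propagated features.

    Hypothetical design: each network gets a score (learned in a real model),
    and the softmax of these scores acts as a relative network weight.
    """
    weights = softmax(network_scores)
    propagated = [propagate(row_normalize_with_self_loops(a), x) for a in adjs]
    combined = [
        sum(w * p[i] for w, p in zip(weights, propagated)) for i in range(len(x))
    ]
    return combined, weights


# Toy example: 3 genes, two networks with different edges.
net_a = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]  # edge between gene 0 and gene 1
net_b = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]  # edge between gene 1 and gene 2
expression = [1.0, 3.0, 5.0]

combined, weights = multi_graph_features([net_a, net_b], expression, [0.0, 0.0])
print(combined, weights)
```

With equal scores the two networks contribute equally; in a trained model the scores would differ, and the resulting weights could be read as a rough indication of which network is more informative for the task.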

Updated: 2021-07-14