Dual-Dropout Graph Convolutional Network for Predicting Synthetic Lethality in Human Cancers
Bioinformatics ( IF 5.8 ) Pub Date : 2020-03-28 , DOI: 10.1093/bioinformatics/btaa211
Ruichu Cai 1 , Xuexin Chen 1 , Yuan Fang 2 , Min Wu 3 , Yuexing Hao 4

Motivation
Synthetic lethality (SL) is a promising type of gene interaction for cancer therapy, because it can identify specific genes to target in cancer cells without disrupting normal cells. As high-throughput wet-lab settings are often costly and face various challenges, computational approaches have become a practical complement. In particular, predicting SLs can be formulated as a link prediction task on a graph of interacting genes. Although matrix factorization techniques have been widely adopted in link prediction, they map each gene to a latent representation in isolation, without aggregating information from neighboring genes. Graph convolutional networks (GCNs) can capture such neighborhood dependency in a graph. However, applying GCNs to SL prediction remains challenging because SL interactions are extremely sparse, which makes overfitting more likely.
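To make the link prediction framing concrete, the following is a minimal sketch of one GCN-style neighborhood aggregation step followed by a pairwise score, in plain Python. It is not the paper's model: the toy graph, the feature vectors, the helper names (`normalized_adjacency`, `propagate`, `link_score`) and the dot-product decoder are all assumptions chosen for illustration, and learned weights and nonlinearities are omitted.

```python
import math

def normalized_adjacency(num_nodes, edges):
    """Build D^{-1/2} (A + I) D^{-1/2} as a dense list of lists."""
    # Start from the identity so every gene keeps its own features (self-loop).
    a = [[1.0 if i == j else 0.0 for j in range(num_nodes)] for i in range(num_nodes)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0  # known SL pairs are undirected edges
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
            for i in range(num_nodes)]

def propagate(a_hat, h):
    """One aggregation step, A_hat @ H (no weights, no nonlinearity)."""
    n, d = len(h), len(h[0])
    return [[sum(a_hat[i][k] * h[k][c] for k in range(n)) for c in range(d)]
            for i in range(n)]

def link_score(z, u, v):
    """Score a candidate gene pair: sigmoid of the embedding dot product."""
    dot = sum(x * y for x, y in zip(z[u], z[v]))
    return 1.0 / (1.0 + math.exp(-dot))

# Hypothetical toy graph: 4 genes, 3 known SL pairs forming a chain.
edges = [(0, 1), (1, 2), (2, 3)]
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]  # made-up gene features

a_hat = normalized_adjacency(4, edges)
z = propagate(a_hat, h)       # each gene now mixes in its neighbors' features
score = link_score(z, 0, 3)   # score for a pair with no observed edge
```

Unlike a matrix factorization embedding, each row of `z` here depends on the features of neighboring genes, which is the neighborhood dependency the abstract refers to.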
Results
In this paper, we propose a novel Dual-Dropout GCN (DDGCN) for learning more robust gene representations for SL prediction. We employ both coarse-grained node dropout and fine-grained edge dropout to address the issue that standard dropout in vanilla GCN is often inadequate in reducing overfitting on sparse graphs. In particular, coarse-grained node dropout can efficiently and systematically enforce dropout at the node (gene) level, while fine-grained edge dropout can further fine-tune the dropout at the interaction (edge) level. We further present a theoretical framework to justify our model architecture. Finally, we conduct extensive experiments on human SL datasets and the results demonstrate the superior performance of our model in comparison with state-of-the-art methods.
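The two dropout granularities can be illustrated on a plain edge list. The sketch below is an assumption-laden illustration rather than the DDGCN implementation: the function names, the order in which the two dropouts are applied, and the use of independent Bernoulli draws per node and per edge are all hypothetical choices, and the abstract does not specify these details.

```python
import random

def node_dropout(edges, num_nodes, rate, rng):
    """Coarse-grained: drop each node with probability `rate`,
    removing every edge incident to a dropped node."""
    dropped = {v for v in range(num_nodes) if rng.random() < rate}
    kept = [(u, v) for u, v in edges if u not in dropped and v not in dropped]
    return kept, dropped

def edge_dropout(edges, rate, rng):
    """Fine-grained: drop each remaining edge independently with probability `rate`."""
    return [e for e in edges if rng.random() >= rate]

def dual_dropout(edges, num_nodes, node_rate, edge_rate, rng):
    """Apply node dropout first, then edge dropout (assumed order)."""
    kept, dropped = node_dropout(edges, num_nodes, node_rate, rng)
    return edge_dropout(kept, edge_rate, rng), dropped

# Hypothetical toy SL graph: 5 genes, 5 known SL pairs.
sl_edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
rng = random.Random(0)  # fixed seed so a run is repeatable
sparse_edges, dropped_nodes = dual_dropout(sl_edges, 5, 0.2, 0.3, rng)
```

In a training loop, a fresh perturbed edge list would typically be sampled at each step and the full graph used at evaluation time, as with ordinary dropout; that usage is likewise an assumption here.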
Availability
DDGCN is implemented in Python 3.7 and is open source and freely available at https://github.com/CXX1113/Dual-DropoutGCN


Updated: 2020-03-28