CIPHER-SC: Disease-Gene Association Inference Using Graph Convolution on a Context-Aware Network With Single-Cell Data
IEEE/ACM Transactions on Computational Biology and Bioinformatics (IF 4.5), Pub Date: 2020-08-18, DOI: 10.1109/tcbb.2020.3017547
Yiding Zhang 1 , Lyujie Chen 2 , Shao Li 1

Inference of disease-gene associations helps unravel the pathogenesis of diseases and contributes to their treatment. Although many machine learning-based methods have been developed to predict causative genes, accurate association inference remains challenging. One major reason is the inaccurate feature selection and error accumulation introduced by the commonly used multi-stage training architecture. In addition, existing methods do not incorporate cell-type-specific information and thus fail to study gene functions at a higher resolution. Therefore, we introduce single-cell transcriptome data and construct a context-aware network to integrate all data sources in an unbiased manner. We then develop a graph convolution-based approach named CIPHER-SC to realize a complete end-to-end learning architecture. Our approach outperforms four state-of-the-art approaches in five-fold cross-validation on three distinct test sets, with a best AUC of 0.9501, demonstrating its stable ability either to predict novel genes or to predict with a genetic basis. The ablation study shows that our complete end-to-end design and unbiased data integration boost the AUC from 0.8727 to 0.9443. The addition of single-cell data further improves prediction accuracy and enriches our results for cell-type-specific genes. These results confirm the ability of CIPHER-SC to discover reliable disease genes. Our implementation is available at http://github.com/YidingZhang117/CIPHER-SC .
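
To make the idea of graph convolution on a disease-gene network concrete, the following is a minimal sketch of one generic propagation step, ReLU(Â X W) with a symmetrically normalized adjacency matrix Â, followed by dot-product scoring of disease-gene pairs. It is not the authors' implementation: the toy graph, the one-hot features, the fixed weight matrix, and the dot-product scorer are all assumptions made for illustration, and the actual CIPHER-SC architecture should be taken from the repository linked above.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(a_norm, features, weights):
    """One graph-convolution step: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Hypothetical toy network: node 0 is a disease, nodes 1-3 are genes.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# One-hot node features (assumption: no real biological features here).
features = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
# Fixed illustrative weights; in practice these would be learned end to end.
weights = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.2], [0.6, 0.1]]

a_norm = normalize_adjacency(adj)
embeddings = gcn_layer(a_norm, features, weights)

# Score each gene against the disease node by embedding similarity.
scores = {gene: dot(embeddings[0], embeddings[gene]) for gene in (1, 2, 3)}
for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"gene {gene}: score {score:.4f}")
```

In an end-to-end setting, the weights would be trained jointly with the scorer against known disease-gene pairs, rather than fixed as in this sketch.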

Updated: 2020-08-18