An end-to-end heterogeneous graph attention network for Mycobacterium tuberculosis drug-resistance prediction
Briefings in Bioinformatics (IF 6.8), Pub Date: 2021-08-20, DOI: 10.1093/bib/bbab299
Yang Yang 1,2, Timothy M Walker 3, Samaneh Kouchaki 4, Chenyang Wang 1, Timothy E A Peto 5, Derrick W Crook 5,6, David A Clifton 1,2

Abstract
Antimicrobial resistance (AMR) poses a threat to global public health. To mitigate the impacts of AMR, it is important to identify the molecular mechanisms of AMR and thereby determine optimal therapy as early as possible. Conventional machine learning-based drug-resistance analyses assume genetic variations to be homogeneous, and thus do not distinguish between coding and intergenic sequences. In this study, we represent genetic data from Mycobacterium tuberculosis as a graph, and then adopt a deep graph learning method, a heterogeneous graph attention network (‘HGAT–AMR’), to predict anti-tuberculosis (TB) drug resistance. The HGAT–AMR model is able to accommodate incomplete phenotypic profiles, as well as provide ‘attention scores’ of genes and single nucleotide polymorphisms (SNPs) both at a population level and for individual samples. These scores indicate which inputs the model is ‘paying attention to’ when making its drug resistance predictions. The results show that the proposed model achieved the best area under the receiver operating characteristic curve (AUROC) for isoniazid and rifampicin (98.53% and 99.10%, respectively), the best sensitivity for three first-line drugs (94.91% for isoniazid, 96.60% for ethambutol and 90.63% for pyrazinamide), and maintained performance when the data were associated with incomplete phenotypes (i.e. for those isolates for which phenotypic data for some drugs were missing). We also demonstrate that the model successfully identifies genes and SNPs associated with drug resistance, mitigating the impact of the resistance profile while considering resistance to a particular drug, which is consistent with domain knowledge.
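
Two ideas in the abstract can be illustrated with a small sketch: normalising raw attention scores so they can be compared across inputs, and training with incomplete phenotypes by skipping drugs whose label is missing. The code below is only an illustration under stated assumptions, not the HGAT–AMR implementation. The function names `attention_weights` and `masked_bce`, the softmax normalisation, and the masked binary cross-entropy loss are all assumptions chosen because they are common in graph attention models and multi-label training; the paper may handle both differently.

```python
import math

def attention_weights(scores):
    """Softmax over raw attention scores, e.g. one score per SNP or gene node.

    Assumption: scores are normalised with a softmax, as in standard
    graph attention layers. Returns weights that sum to 1.
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def masked_bce(probs, labels):
    """Mean binary cross-entropy over drugs with a known phenotype.

    probs:  predicted resistance probability per drug.
    labels: 1 = resistant, 0 = susceptible, None = phenotype missing.
    Assumption: missing phenotypes are simply excluded from the loss.
    """
    eps = 1e-12
    losses = []
    for p, y in zip(probs, labels):
        if y is None:
            continue  # no phenotype for this drug: contributes nothing
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1.0 - p)))
    return sum(losses) / len(losses) if losses else 0.0

if __name__ == "__main__":
    # Hypothetical raw scores for three SNPs of one isolate.
    print(attention_weights([2.0, 1.0, 0.1]))
    # Hypothetical predictions for three drugs; the third phenotype is missing.
    print(masked_bce([0.9, 0.2, 0.5], [1, 0, None]))
```

With this kind of masking, an isolate tested against only some drugs still contributes to training for those drugs, which is one plausible way a model could maintain performance on incomplete phenotypic profiles.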


Updated: 2021-08-20