MATHLA: a robust framework for HLA-peptide binding prediction integrating bidirectional LSTM and multiple head attention mechanism
BMC Bioinformatics (IF 2.9), Pub Date: 2021-01-06, DOI: 10.1186/s12859-020-03946-z
Yilin Ye, Jian Wang, Yunwan Xu, Yi Wang, Youdong Pan, Qi Song, Xing Liu, Ji Wan

Accurate prediction of binding between class I human leukocyte antigen (HLA) molecules and neoepitopes is critical for target identification in personalized T-cell-based immunotherapy. Many recent prediction tools built on deep learning algorithms and mass spectrometry data have indeed shown improvement in average predictive power for class I HLA-peptide interaction. However, their prediction performance varies greatly across individual HLA alleles and peptides of different lengths, which is particularly the case for HLA-C alleles because of the limited amount of experimental data. To meet the increasing demand for the most accurate HLA-peptide binding prediction for individual patients in real-world clinical studies, a more advanced deep learning framework with higher prediction accuracy for HLA-C alleles and longer peptides is highly desirable. We present MATHLA, a pan-allele HLA-peptide binding prediction framework that integrates a bidirectional long short-term memory (LSTM) network and a multi-head attention mechanism. This model achieves better prediction accuracy in both the fivefold cross-validation test and the independent test dataset. In addition, it is superior to existing tools in prediction accuracy for longer ligands ranging from 11 to 15 amino acids. Moreover, our model also shows a significant improvement in HLA-C-peptide binding prediction. By investigating multi-head attention weight scores, we depict possible interaction patterns between three HLA I supergroups and their cognate peptides. Our method demonstrates the necessity of further developing deep learning algorithms to improve and interpret HLA-peptide binding prediction, in parallel with increasing the amount of high-quality HLA ligandome data.
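
To make the idea of inspecting multi-head attention weight scores more concrete, the following is a minimal sketch in plain Python of how per-head attention weights over peptide positions can be computed. It is not the MATHLA implementation: the one-hot encoding, the head count, the head dimension, the random untrained projection matrices, and the function name `multi_head_attention_weights` are all assumptions made for illustration. The bidirectional LSTM and the HLA allele input are omitted entirely.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(peptide):
    """Encode each residue as a 20-dimensional one-hot vector (assumed encoding)."""
    return [[1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS] for aa in peptide]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Untrained placeholder weights; a real model would learn these."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def multi_head_attention_weights(peptide, num_heads=4, head_dim=8, seed=0):
    """Return one (length x length) attention-weight matrix per head.

    Hypothetical helper: each head projects the encoded peptide to queries
    and keys, then applies scaled dot-product attention with a softmax.
    """
    rng = random.Random(seed)
    x = one_hot(peptide)
    scale = math.sqrt(head_dim)
    weights = []
    for _ in range(num_heads):
        w_q = random_matrix(len(AMINO_ACIDS), head_dim, rng)
        w_k = random_matrix(len(AMINO_ACIDS), head_dim, rng)
        q = matmul(x, w_q)                      # length x head_dim
        k = matmul(x, w_k)                      # length x head_dim
        k_t = [list(col) for col in zip(*k)]    # head_dim x length
        scores = matmul(q, k_t)                 # length x length
        weights.append([softmax([s / scale for s in row]) for row in scores])
    return weights


# Arbitrary 11-residue example string, chosen only to match the 11-15 range.
heads = multi_head_attention_weights("SIINFEKLTEW")
```

Each row of a head's matrix is a probability distribution over peptide positions, so in a trained model, averaging or comparing these rows across heads is one plausible way to see which residue positions the model attends to. With the random weights used here the values carry no biological meaning.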

Updated: 2021-01-07