FUpred: detecting protein domains through deep-learning-based contact map prediction.
Bioinformatics (IF 5.8), Pub Date: 2020-03-30, DOI: 10.1093/bioinformatics/btaa217
Wei Zheng 1, Xiaogen Zhou 1, Qiqige Wuyun 2, Robin Pearce 1, Yang Li 1,3, Yang Zhang 1,4

Motivation
Protein domains are subunits that can fold and function independently. Correct domain boundary assignment is thus a critical step towards accurate protein structure and function analyses. There is, however, no efficient algorithm available for accurate domain prediction from sequence. The problem is particularly challenging for proteins with discontinuous domains, which consist of domain segments that are separated along the sequence.
Results
We developed a new algorithm, FUpred, which predicts protein domain boundaries utilizing contact maps created by deep residual neural networks coupled with co-evolutionary precision matrices. The core idea of the algorithm is to retrieve domain boundary locations by maximizing the number of intra-domain contacts, while minimizing the number of inter-domain contacts from the contact maps. FUpred was tested on a large-scale dataset consisting of 2,549 proteins and generated correct single- and multi-domain classifications with an MCC of 0.799, which was 19.1% (or 5.3%) higher than the best machine learning (or threading) based method. For proteins with discontinuous domains, the DBD (domain boundary detection) and NDO (normalized domain overlapping) scores of FUpred were 0.788 and 0.521, respectively, which were 17.3% and 23.8% higher than the best control method. The results demonstrate a new avenue to accurately detect domain composition from sequence alone, especially for discontinuous, multi-domain proteins.
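The core idea stated above (many intra-domain contacts, few inter-domain contacts) can be illustrated with a minimal sketch. This is not FUpred's actual scoring function, which the abstract does not specify. The score used here (inter-domain contacts divided by the geometric mean of the two intra-domain counts), the names `split_score`, `best_boundary` and `toy_map`, and the `min_len` parameter are all assumptions made for illustration. The sketch also handles only a single continuous two-domain split on a binary, symmetric contact map; discontinuous domains are out of its scope.

```python
import numpy as np


def split_score(contacts: np.ndarray, b: int) -> float:
    """Hypothetical score for cutting the chain between residues b-1 and b.

    Lower is better: few inter-domain contacts relative to intra-domain ones.
    `contacts` is assumed to be a symmetric 0/1 matrix with a zero diagonal.
    """
    # Each intra-domain contact appears twice in a symmetric block.
    intra_a = contacts[:b, :b].sum() / 2
    intra_b = contacts[b:, b:].sum() / 2
    # The off-diagonal block counts each inter-domain contact once.
    inter = contacts[:b, b:].sum()
    if intra_a == 0 or intra_b == 0:
        return float("inf")  # a domain without internal contacts is rejected
    return float(inter / np.sqrt(intra_a * intra_b))


def best_boundary(contacts: np.ndarray, min_len: int = 3) -> int:
    """Return the cut position with the lowest hypothetical split score."""
    n = contacts.shape[0]
    if n < 2 * min_len:
        raise ValueError("chain too short for two domains of min_len residues")
    candidates = range(min_len, n - min_len + 1)
    return min(candidates, key=lambda b: split_score(contacts, b))


def toy_map() -> np.ndarray:
    """Toy 12-residue map: two fully connected 6-residue blocks, one cross contact."""
    m = np.zeros((12, 12), dtype=int)
    m[:6, :6] = 1
    m[6:, 6:] = 1
    m[5, 6] = m[6, 5] = 1  # single inter-domain contact
    np.fill_diagonal(m, 0)
    return m


if __name__ == "__main__":
    print(best_boundary(toy_map()))
```

On the toy map the lowest score falls at the cut between the two blocks. A real predicted contact map would hold probabilities rather than 0/1 values and would need thresholding or a weighted score, which this sketch leaves out.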
Availability
https://zhanglab.ccmb.med.umich.edu/FUpred
Supplementary information
Supplementary data are available at Bioinformatics online.


Updated: 2020-03-30