iProDNA-CapsNet: identifying protein-DNA binding residues using capsule neural networks.
BMC Bioinformatics ( IF 3 ) Pub Date : 2019-12-27 , DOI: 10.1186/s12859-019-3295-2
Binh P Nguyen 1 , Quang H Nguyen 2 , Giang-Nam Doan-Ngoc 2 , Thanh-Hoang Nguyen-Vo 1 , Susanto Rahardja 3

BACKGROUND: Because protein-DNA interactions are essential to diverse biological processes, accurately locating DNA-binding residues is necessary. This remains a challenging task in the post-genomic era, in which protein sequence data have grown very rapidly. In this study, we propose iProDNA-CapsNet, a new prediction model that identifies protein-DNA binding residues using an ensemble of capsule neural networks (CapsNets) on position-specific scoring matrix (PSSM) profiles. The use of CapsNets offers a new approach to locating DNA-binding residues. The benchmark datasets introduced by Hu et al. (2017), i.e., PDNA-543 and PDNA-TEST, were used to train and evaluate the model, respectively. To assess the model's performance fairly, iProDNA-CapsNet was compared with existing state-of-the-art methods.

RESULTS: Under the decision threshold corresponding to a false positive rate (FPR) ≈ 5%, the accuracy, sensitivity, precision, and Matthews correlation coefficient (MCC) of our model are increased by about 2.0%, 2.0%, 14.0%, and 5.0% with respect to TargetDNA (Hu et al., 2017) and by about 1.0%, 75.0%, 45.0%, and 77.0% with respect to BindN+ (Wang et al., 2010), respectively. Compared with other methods that do not report their threshold settings, iProDNA-CapsNet also shows a significant improvement on most of the evaluation metrics. Although the pattern of change differs among the models, iProDNA-CapsNet remains the best model on most of the metrics, especially MCC, for which the improvement ranges from about 8.0% to 220.0%.

CONCLUSIONS: According to all evaluation metrics under various decision thresholds, iProDNA-CapsNet performs better than the two current best models (BindN and TargetDNA). Our approach also suggests that CapsNets could be adopted in other biological applications.
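
The evaluation protocol mentioned in the abstract (fix the decision threshold at FPR ≈ 5%, then report accuracy, sensitivity, precision, and MCC) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy scores and labels are hypothetical, and the threshold rule used here (the lowest score cut-off whose FPR does not exceed the target) is an assumption about how such a threshold might be chosen.

```python
import math

def confusion(scores, labels, threshold):
    """Count TP, FP, TN, FN when a residue is predicted binding if score >= threshold."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        pred = s >= threshold
        if pred and y == 1:
            tp += 1
        elif pred and y == 0:
            fp += 1
        elif not pred and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def threshold_at_fpr(scores, labels, target_fpr=0.05):
    """Lowest observed score whose false positive rate stays <= target_fpr (assumed rule)."""
    best = math.inf  # predicting nothing as binding always gives FPR = 0
    for t in sorted(set(scores), reverse=True):
        _, fp, tn, _ = confusion(scores, labels, t)
        if fp / (fp + tn) <= target_fpr:
            best = t
        else:
            break  # FPR can only grow as the threshold is lowered
    return best

def metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, precision, and Matthews correlation coefficient."""
    def ratio(num, den):
        return num / den if den else 0.0
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "mcc": ratio(tp * tn - fp * fn,
                     math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))),
    }

# Hypothetical toy data: 20 non-binding residues (label 0) and 5 binding residues (label 1).
neg_scores = [0.05, 0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 0.28, 0.30,
              0.32, 0.35, 0.38, 0.40, 0.42, 0.45, 0.48, 0.50, 0.55, 0.80]
pos_scores = [0.90, 0.85, 0.70, 0.60, 0.33]
scores = neg_scores + pos_scores
labels = [0] * len(neg_scores) + [1] * len(pos_scores)

threshold = threshold_at_fpr(scores, labels, target_fpr=0.05)
m = metrics(*confusion(scores, labels, threshold))
print(threshold, m)
```

Because DNA-binding residues are a small minority of all residues, accuracy alone can look high even for a weak predictor; MCC uses all four confusion-matrix counts, which is one reason it is commonly reported alongside the other metrics.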

Updated: 2019-12-27