Enabling full-length evolutionary profiles based deep convolutional neural network for predicting DNA-binding proteins from sequence.
Proteins: Structure, Function, and Bioinformatics ( IF 3.2 ) Pub Date : 2019-07-08 , DOI: 10.1002/prot.25763
Sucheta Chauhan 1 , Shandar Ahmad 1

Sequence-based DNA-binding protein (DBP) prediction is a widely studied biological problem. Sliding windows over the rows of position-specific substitution matrices (PSSMs) predict DNA-binding residues well on known DBPs, but the same models cannot be applied to protein sequences of unequal length. PSSM summaries representing column averages, and their amino-acid-wise versions, have been used effectively for the task, but it remains unclear whether these features carry all of the PSSM's predictive power, which has traditionally been harnessed for binding-site prediction. Here we evaluate whether PSSMs scaled up to a fixed size by zero-vector padding (pPSSM) perform better than the summary-based features on similar models. Using a multilayer perceptron (MLP) and a deep convolutional neural network (CNN), we found that (a) summary features work well for single-genome (human-only) data but are outperformed by pPSSM on diverse PDB-derived data sets, suggesting greater summary-level redundancy in the former; (b) even when summary features perform comparably to pPSSM, a consensus of the two outperforms both; (c) CNN models comprehensively outperform their corresponding MLP models; and (d) the actual predicted scores depend on the choice of input feature set, whereas overall performance levels depend on the model, with CNN giving the highest accuracy.
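
The two input representations compared in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the trailing position of the zero rows, the `max_len` parameter, the alphabetical amino-acid order, and the use of zero vectors for residue types absent from a sequence are all assumptions made for illustration.

```python
import numpy as np

# Assumed ordering of the 20 standard amino acids (illustrative only).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pad_pssm(pssm, max_len):
    """Zero-pad an (L, 20) PSSM to a fixed (max_len, 20) array (pPSSM).

    Assumption: zero vectors are appended after the last residue.
    """
    pssm = np.asarray(pssm, dtype=float)
    length, n_cols = pssm.shape
    if length > max_len:
        raise ValueError(f"sequence length {length} exceeds max_len {max_len}")
    padded = np.zeros((max_len, n_cols))
    padded[:length] = pssm
    return padded


def column_average(pssm):
    """Summary feature: mean of each PSSM column, giving a 20-vector."""
    return np.asarray(pssm, dtype=float).mean(axis=0)


def residue_wise_average(sequence, pssm):
    """Amino-acid-wise summary: column means over the rows of each residue type.

    Returns a flattened 20 x 20 = 400-vector. Residue types that do not
    occur in the sequence contribute a zero vector (an assumption).
    """
    pssm = np.asarray(pssm, dtype=float)
    out = np.zeros((len(AMINO_ACIDS), pssm.shape[1]))
    for i, aa in enumerate(AMINO_ACIDS):
        rows = [k for k, res in enumerate(sequence) if res == aa]
        if rows:
            out[i] = pssm[rows].mean(axis=0)
    return out.ravel()


# Toy example: a 3-residue "protein" with a made-up PSSM.
toy_sequence = "ACA"
toy_pssm = np.arange(60).reshape(3, 20)

ppssm = pad_pssm(toy_pssm, max_len=5)                      # shape (5, 20)
summary = column_average(toy_pssm)                         # shape (20,)
aa_summary = residue_wise_average(toy_sequence, toy_pssm)  # shape (400,)
```

The padded array keeps every row of the original matrix and so has the same shape for all proteins, while the two summaries compress the matrix to a fixed-length vector regardless of sequence length.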

Updated: 2019-12-09