Using Recursive Feature Selection with Random Forest to Improve Protein Structural Class Prediction for Low-Similarity Sequences
Computational and Mathematical Methods in Medicine (IF 2.809), Pub Date: 2021-05-08, DOI: 10.1155/2021/5529389
Yaoxin Wang 1, Yingjie Xu 2, Zhenyu Yang 1, Xiaoqing Liu 3, Qi Dai 1

Many combinations of protein features have been used to improve protein structural class prediction, but the information redundancy among them is often ignored. To select the important features with strong classification ability, we propose a recursive feature selection method based on random forest. We evaluated the proposed method in four experiments and compared it with available competing prediction methods. The results indicate that the proposed feature selection method effectively improves the efficiency of protein structural class prediction: fewer than 5% of the features are used, yet the prediction accuracy improves by 4.6-13.3%. We further compared different protein features and found that the predicted secondary structural features achieve the best performance. This understanding can be used to design more powerful prediction methods for the protein structural class.
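
The general idea of recursive feature selection driven by a random forest can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the elimination schedule, the importance measure, or the forest settings, so the function names, the `drop_fraction` parameter, and the choice of scikit-learn's `RandomForestClassifier` with its impurity-based `feature_importances_` are all assumptions made here.

```python
from typing import Callable, List, Sequence

Matrix = Sequence[Sequence[float]]
ImportanceFn = Callable[[Matrix, Sequence[int]], List[float]]


def random_forest_importances(X: Matrix, y: Sequence[int]) -> List[float]:
    """Score each column of X with a random forest.

    Assumption: scikit-learn is installed. The forest size and the use of
    impurity-based importances are illustrative choices, not the paper's.
    """
    from sklearn.ensemble import RandomForestClassifier  # imported lazily

    forest = RandomForestClassifier(n_estimators=200, random_state=0)
    forest.fit(X, y)
    return list(forest.feature_importances_)


def recursive_feature_selection(
    X: Matrix,
    y: Sequence[int],
    n_keep: int,
    drop_fraction: float = 0.2,  # hypothetical schedule: drop 20% per round
    importance_fn: ImportanceFn = random_forest_importances,
) -> List[int]:
    """Return the original column indices of the n_keep surviving features."""
    n_features = len(X[0])
    if not 1 <= n_keep <= n_features:
        raise ValueError("n_keep must be between 1 and the number of features")

    kept = list(range(n_features))
    while len(kept) > n_keep:
        # Re-score on the surviving columns only, so each round's ranking
        # reflects the current, smaller feature subset.
        X_sub = [[row[j] for j in kept] for row in X]
        scores = importance_fn(X_sub, y)

        # Drop the weakest features, but never go below n_keep.
        n_drop = max(1, int(len(kept) * drop_fraction))
        n_drop = min(n_drop, len(kept) - n_keep)
        order = sorted(range(len(kept)), key=lambda i: scores[i])
        weakest = set(order[:n_drop])
        kept = [f for i, f in enumerate(kept) if i not in weakest]
    return kept
```

Calling `recursive_feature_selection(X, y, n_keep=k)` on a feature matrix `X` and class labels `y` would return the indices of `k` retained columns. Re-scoring after every round, rather than ranking once, is what makes the procedure recursive: a feature that looked important only because it duplicated another one can lose rank once the redundant partner is gone.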

Updated: 2021-05-08