Protein Fold Recognition Based on Auto-Weighted Multi-View Graph Embedding Learning Model
IEEE/ACM Transactions on Computational Biology and Bioinformatics ( IF 4.5 ) Pub Date : 2020-04-29 , DOI: 10.1109/tcbb.2020.2991268
Ke Yan , Jie Wen , Yong Xu , Bin Liu

Protein fold recognition is critical for studies of protein structure prediction and drug design. Several methods have been proposed to extract discriminative features from protein sequences for fold recognition. However, building ensemble methods that combine these features to improve predictive performance remains a challenging problem. In this study, we proposed two novel algorithms: AWMG and EMfold. AWMG is a predictor based on the multi-view learning framework for fold recognition. Each view was treated as an intermediate representation of the corresponding protein data source, including the evolutionary information and the retrieval information. AWMG automatically calculated a weight for each view and constructed a latent subspace containing the common information shared by the different views. A marginalized constraint was employed to enlarge the margins between different folds, improving the predictive performance of AWMG. Furthermore, we proposed a novel ensemble method called EMfold, which combines two complementary methods, AWMG and DeepSS. The latter method was a template-based algorithm using the SPARKS-X and DeepFR programs. EMfold integrated the advantages of template-based assignment and a machine learning classifier. Experimental results on two widely used datasets (LE and YK) showed that the proposed methods outperformed some state-of-the-art methods, indicating that AWMG and EMfold are useful tools for protein fold recognition.
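
The abstract does not give the AWMG objective, so the following is only a generic sketch of how per-view weights can be set automatically rather than hand-tuned. It assumes each view is given as a square affinity matrix over the same proteins, and it uses a common reweighting rule in which a view's weight is inversely proportional to its distance from the current consensus. The function name `auto_weighted_fusion` and its parameters are hypothetical, and the latent subspace and marginalized constraint described above are not modelled.

```python
import numpy as np

def auto_weighted_fusion(views, n_iter=20, eps=1e-12):
    """Fuse per-view affinity matrices into one consensus matrix.

    Illustrative only: this is a generic auto-weighting scheme,
    not the AWMG objective from the paper.
    """
    views = [np.asarray(a, dtype=float) for a in views]
    # Start from uniform weights over the views.
    weights = np.full(len(views), 1.0 / len(views))
    for _ in range(n_iter):
        # Consensus is the weighted average of the view matrices.
        consensus = sum(w * a for w, a in zip(weights, views)) / weights.sum()
        # Frobenius distance of each view from the consensus.
        dists = np.array([np.linalg.norm(consensus - a) for a in views])
        # Views closer to the consensus receive larger weights.
        weights = 1.0 / (2.0 * dists + eps)
    # Normalize so the weights sum to one, then rebuild the consensus.
    weights = weights / weights.sum()
    consensus = sum(w * a for w, a in zip(weights, views))
    return consensus, weights

# Toy usage: two agreeing views and one outlier view.
toy_views = [np.eye(3), np.eye(3), np.ones((3, 3))]
toy_consensus, toy_weights = auto_weighted_fusion(toy_views)
```

In this toy setting the outlier view ends up with the smallest weight, which is the behaviour an auto-weighting step is meant to provide when one data source disagrees with the others.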

Updated: 2020-04-29