MultiPredGO: Deep Multi-Modal Protein Function Prediction by Amalgamating Protein Structure, Sequence, and Interaction Information
IEEE Journal of Biomedical and Health Informatics (IF 6.7), Pub Date: 2020-09-08, DOI: 10.1109/jbhi.2020.3022806
Swagarika Jaharlal Giri, Pratik Dutta, Parth Halani, Sriparna Saha

Protein is an essential macro-nutrient involved in a wide range of biochemical activities and biological regulation in living cells. In this work, we present a novel multi-modal approach, named MultiPredGO, for predicting protein functions by utilizing two different kinds of information, namely the protein sequence and the protein secondary structure. Our contributions are threefold. First, along with the protein sequence, we learn a feature representation from the protein structure. Second, we develop two different deep learning models that take into account the characteristics of the underlying data patterns of the protein sequence and the protein 3D structure. Finally, along with these two modalities, we also utilize protein interaction information to improve the efficiency of the proposed model in predicting protein functions. To extract features from the different modalities, we use several variants of the convolutional neural network. As the protein function classes depend on each other, we use a neuro-symbolic hierarchical classification model, which resembles the structure of the Gene Ontology (GO), to predict the dependent protein functions effectively. To validate the proposed method (MultiPredGO), we compare our results with various uni-modal approaches and with two well-known multi-modal protein function prediction approaches, namely INGA and DeepGO. Results show that the overall performance of the proposed approach in terms of accuracy, F-measure, precision, and recall is better than that of the state-of-the-art methods. MultiPredGO attains average improvements of 13.05% and 30.87% over the best existing competing approach (DeepGO) for cellular component and molecular function, respectively.
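The abstract does not say how the hierarchical classifier keeps predictions consistent with the GO structure. As a rough illustration of what "dependent" function classes imply, the sketch below uses one common convention, which is an assumption here and not necessarily the authors' method: a term's final score is the maximum of its own raw score and the scores of all its descendants, so a parent term never scores lower than a child. The term identifiers, the `PARENTS` map, and the `propagate_scores` helper are all hypothetical.

```python
from typing import Dict, List

# Hypothetical toy hierarchy (not real GO identifiers): term -> its parent terms.
PARENTS: Dict[str, List[str]] = {
    "GO:root": [],
    "GO:A": ["GO:root"],
    "GO:B": ["GO:root"],
    "GO:A1": ["GO:A"],
    "GO:AB": ["GO:A", "GO:B"],  # a term may have more than one parent
}


def propagate_scores(
    raw: Dict[str, float], parents: Dict[str, List[str]]
) -> Dict[str, float]:
    """Return scores in which every term scores at least as high as its descendants.

    Assumption: max-propagation is used as the consistency rule.
    """
    # Invert the parent map to get each term's direct children.
    children: Dict[str, List[str]] = {term: [] for term in parents}
    for child, term_parents in parents.items():
        for parent in term_parents:
            children[parent].append(child)

    memo: Dict[str, float] = {}

    def score(term: str) -> float:
        # Memoised recursion over the DAG: own score vs. best child score.
        if term not in memo:
            candidates = [raw.get(term, 0.0)]
            candidates.extend(score(c) for c in children[term])
            memo[term] = max(candidates)
        return memo[term]

    return {term: score(term) for term in parents}


# Raw per-term scores, as a flat classifier might produce them.
raw_scores = {"GO:root": 0.2, "GO:A": 0.1, "GO:B": 0.3, "GO:A1": 0.9, "GO:AB": 0.4}
consistent = propagate_scores(raw_scores, PARENTS)
```

In this toy case the raw scores are inconsistent (the child `GO:A1` scores far above its parent `GO:A`); after propagation, every ancestor of a confidently predicted term is predicted at least as confidently.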

Updated: 2020-09-08