A novel method to predict essential proteins based on tensor and HITS algorithm.
Human Genomics (IF 3.8), Pub Date: 2020-04-06, DOI: 10.1186/s40246-020-00263-7
Zhihong Zhang 1, Yingchun Luo 1,2, Sai Hu 1, Xueyong Li 1, Lei Wang 1, Bihai Zhao 1,3

Essential proteins are an important part of the cell and are closely related to its life activities. Many computational methods have adopted Protein-Protein Interaction (PPI) networks to predict essential proteins, and most current approaches focus mainly on the topological structure of these networks. However, methods that rely solely on the PPI network have low detection accuracy for essential proteins, so it is necessary to integrate the PPI network with other biological information. In this paper, we propose a novel random walk method for identifying essential proteins, called HEPT. A three-dimensional tensor is first constructed by combining the PPI network of Saccharomyces cerevisiae with multiple biological data sources, such as gene ontology annotations and protein domains. Based on this tensor, we extend the Hyperlink-Induced Topic Search (HITS) algorithm from a two-dimensional to a three-dimensional tensor model that can be used to infer essential proteins. Unlike existing state-of-the-art methods, HEPT lets both the importance of proteins and the types of interactions contribute to the prediction. To evaluate HEPT, proteins are ranked in descending order of the scores computed by our method and by competing methods. A given number of top-ranked proteins are then selected as candidate essential proteins, and the number of candidates that appear in the list of known essential proteins is used to judge the performance of each method. Experimental results show that our method achieves better prediction performance than nine other state-of-the-art methods in identifying essential proteins.
The analysis and experimental results indicate that HEPT effectively improves the prediction accuracy of essential proteins by using the HITS algorithm and by combining network topology with gene ontology annotations and protein domains, which provides new insight into multi-data-source fusion.
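
The abstract does not state HEPT's update rules, so the following is only a generic sketch of how HITS can be carried from an adjacency matrix to a three-dimensional tensor, together with the top-k counting used for evaluation. It is not the authors' implementation. The function names (`tensor_hits`, `count_true_essentials`, `toy_tensor`), the sum-to-one normalisation, the use of authority scores for ranking, and the four-protein toy data are all assumptions made for illustration.

```python
import numpy as np


def tensor_hits(T, n_iter=200, tol=1e-10):
    """HITS-style power iteration on a non-negative tensor T[i, j, k].

    T[i, j, k] is the weight of an interaction of type k between protein i
    and protein j. Returns hub, authority and interaction-type scores, each
    normalised to sum to 1. Assumes the tensor is non-negative and connected
    enough that no normalising sum becomes zero.
    """
    n, _, m = T.shape
    hub = np.full(n, 1.0 / n)
    auth = np.full(n, 1.0 / n)
    rel = np.full(m, 1.0 / m)
    for _ in range(n_iter):
        # Authority of j: interactions into j, weighted by hub and type scores.
        new_auth = np.einsum("ijk,i,k->j", T, hub, rel)
        new_auth /= new_auth.sum()
        # Hub of i: interactions out of i, weighted by authority and type scores.
        new_hub = np.einsum("ijk,j,k->i", T, new_auth, rel)
        new_hub /= new_hub.sum()
        # Score of type k: interactions of that type between high-scoring proteins.
        new_rel = np.einsum("ijk,i,j->k", T, new_hub, new_auth)
        new_rel /= new_rel.sum()
        change = (np.abs(new_auth - auth).sum()
                  + np.abs(new_hub - hub).sum()
                  + np.abs(new_rel - rel).sum())
        hub, auth, rel = new_hub, new_auth, new_rel
        if change < tol:
            break
    return hub, auth, rel


def count_true_essentials(scores, names, known_essential, top_k):
    """Rank proteins by descending score and count known essentials in the top k."""
    order = np.argsort(-np.asarray(scores))
    candidates = [names[i] for i in order[:top_k]]
    return sum(name in known_essential for name in candidates)


def toy_tensor():
    """Hypothetical data: 4 proteins, 2 interaction types, symmetric slices."""
    T = np.zeros((4, 4, 2))
    for i, j in [(0, 1), (0, 2), (0, 3), (1, 2)]:  # slice 0: PPI edges
        T[i, j, 0] = T[j, i, 0] = 1.0
    T[0, 1, 1] = T[1, 0, 1] = 1.0  # slice 1: e.g. a shared-annotation link
    return T


names = ["P0", "P1", "P2", "P3"]
hub, auth, rel = tensor_hits(toy_tensor())
for name, score in zip(names, auth):
    print(name, round(float(score), 3))
# The set of known essential proteins here is made up for the example.
print(count_true_essentials(auth, names, {"P0", "P2"}, top_k=2))
```

In this sketch the third score vector is what lets interaction types influence the ranking: a type that links high-scoring proteins gains weight, and that weight feeds back into the protein scores on the next pass.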

Updated: 2020-04-22