Combining SVM and ECOC for Identification of Protein Complexes from Protein Protein Interaction Networks by Integrating Amino Acids' Physical Properties and Complex Topology.
Interdisciplinary Sciences: Computational Life Sciences (IF 3.9), Pub Date: 2020-05-21, DOI: 10.1007/s12539-020-00369-5
Amen Faridoon 1 , Aisha Sikandar 1 , Muhammad Imran 2 , Saman Ghouri 1 , Misba Sikandar 3 , Waseem Sikandar 4

Protein complexes play an important role in key functional processes in cells by forming protein-protein interaction (PPI) networks. Conventionally, they were determined through experimental approaches. To save time and reduce cost, many computational methods have been proposed. Few of these approaches take into account the significant biological information contained in the protein amino acid sequence; existing approaches identify dense subgraphs of the PPI network as complexes by considering density and degree statistics. Biological information captures the features that two proteins share in performing a particular biological function. Moreover, linear, star and hybrid subgraph structures may also be found in a PPI network, so other topological features of the graph are important as well. In this article, a support vector machine (SVM) in combination with the error-correcting output coding (ECOC) algorithm is used to construct an automatic detector for mining multiple protein complexes from a PPI network, where amino acid physical properties, i.e., Kidera factors, and a variety of topological constraints are employed as feature vectors. The overall success rates of protein complex identification are 88.6% and 76.0% on the MIPS benchmark set when considering DIP and Gavin interactions, respectively. The SVM is an effective and solid approach for complex detection with amino acids' physical properties and complex topology as feature vectors. ECOC is a powerful algorithm for mining multiple protein complexes of small as well as large sizes. The accuracy of complex identification based on amino acids' physical properties and complex topological characteristics increases strikingly when ECOC is integrated with the SVM approach. Moreover, this paper suggests that the ECOC algorithm may succeed over a wide range of applications in biological data mining.
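To make the ECOC idea in the abstract concrete, the following is a minimal Python sketch of the general technique, not the authors' implementation. Each class is assigned a binary codeword, one binary learner (an SVM in the paper) is trained per bit, and a prediction is decoded to the class with the nearest codeword in Hamming distance. The codebook, the class names `class_0` to `class_3`, and the helper functions are all hypothetical; the abstract does not specify the coding matrix or the class definitions.

```python
from typing import Dict, List, Sequence

# Hypothetical 4-class codebook (an assumption, not taken from the paper):
# one 7-bit codeword per class. Every pair of codewords differs in exactly
# 4 bits, so a single wrong bit can still be decoded to the right class.
CODEBOOK: Dict[str, List[int]] = {
    "class_0": [1, 1, 1, 1, 1, 1, 1],
    "class_1": [0, 0, 0, 0, 1, 1, 1],
    "class_2": [0, 0, 1, 1, 0, 0, 1],
    "class_3": [0, 1, 0, 1, 0, 1, 0],
}


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def binary_targets(labels: Sequence[str], bit: int) -> List[int]:
    """Relabel a multi-class training set for the binary learner of one bit."""
    return [CODEBOOK[label][bit] for label in labels]


def decode(bits: Sequence[int]) -> str:
    """Return the class whose codeword is nearest to the predicted bits."""
    return min(CODEBOOK, key=lambda label: hamming(CODEBOOK[label], bits))


if __name__ == "__main__":
    # Stand-in for the outputs of seven binary learners on one candidate
    # subgraph; the sixth learner is wrong relative to class_2's codeword.
    predicted_bits = [0, 0, 1, 1, 0, 1, 1]
    print(decode(predicted_bits))
```

In this sketch the binary learners are replaced by a hard-coded bit vector; in a real pipeline each bit would come from a classifier trained on the targets returned by `binary_targets` for that bit position.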

Updated: 2020-05-21