Efficient and accurate identification of protein complexes from protein-protein interaction networks based on the clustering coefficient
Computational and Structural Biotechnology Journal (IF 4.4), Pub Date: 2021-09-20, DOI: 10.1016/j.csbj.2021.09.014
Sara Omranian, Angela Angeleska, Zoran Nikoloski

Identification of protein complexes from protein-protein interaction (PPI) networks is a key problem in PPI mining, typically addressed by parameter-dependent approaches that suffer from low recall. Here we introduce GCC-v, a family of efficient, parameter-free algorithms that accurately predict protein complexes using the (weighted) clustering coefficient of proteins in PPI networks. Through comparative analyses with gold standards and PPI networks from three species, we demonstrate that GCC-v outperforms twelve state-of-the-art approaches for identification of protein complexes with respect to twelve performance measures in at least 85.71% of scenarios. We also show that GCC-v exactly recovers ∼35% of protein complexes in a pan-plant PPI network and discovers 144 new protein complexes, with high support from GO semantic similarity. Our results indicate that findings from GCC-v are robust to network perturbations, which has direct implications for assessing the impact of PPI network quality on the predicted protein complexes.
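
For readers unfamiliar with the quantity the abstract builds on, the following is a minimal sketch of the standard unweighted local clustering coefficient of a node: the fraction of pairs of its neighbours that are themselves connected. It is not the GCC-v algorithm, whose details are not given here. The function names, the toy edge list, and the protein labels "A" to "D" are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, node):
    """Fraction of neighbour pairs of `node` that are directly connected.

    Returns 0.0 for nodes with fewer than two neighbours, a common convention.
    """
    neighbours = adj[node] - {node}  # ignore self-loops
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy PPI network: a triangle A-B-C with a pendant protein D on C.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
adj = build_adjacency(toy_edges)
scores = {protein: local_clustering(adj, protein) for protein in adj}
```

In this toy network the proteins inside the triangle score higher than the pendant protein, which is the intuition behind using the clustering coefficient to locate densely connected groups such as complexes.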

Updated: 2021-09-20