Integrating multimeric threading with high-throughput experiments for structural interactome of Escherichia coli
bioRxiv - Bioinformatics Pub Date : 2020-10-18 , DOI: 10.1101/2020.10.17.343962
Weikang Gong , Aysam Guerler , Chengxin Zhang , Elisa Warner , Chunhua Li , Yang Zhang

Genome-wide protein-protein interaction (PPI) determination remains a significant unsolved problem in structural biology. The difficulty is twofold: high-throughput experiments (HTEs) often have a high false-positive rate in assigning PPIs, and PPI quaternary structures are more difficult to solve than tertiary structures using traditional structural biology techniques. We proposed a uniform pipeline to address both problems. It first recognizes PPIs by combining multi-chain threading alignments with HTE results using naive Bayesian classifiers, and then constructs the quaternary complex structures by mapping the monomer models onto the dimeric threading frameworks through interface-specific structural alignments. The pipeline was applied to the Escherichia coli genome and produced 35,125 confident PPIs, 4.5-fold more than HTE alone. Graph analyses of the PPI networks revealed a scale-free cluster size distribution, which was found critical to the robustness of genome evolution and the centrality of functionally important proteins that are essential to E. coli survival. Furthermore, complex structure models were constructed for all predicted E. coli PPIs based on the quaternary threading alignments; 6,771 of them had a high confidence score that corresponds to the correct fold of the complex (TM-score >0.5), and 93 showed close consistency with later-released experimental structures, with an average TM-score of 0.73. These results demonstrate the usefulness of threading-based homology modeling in both genome-wide PPI network detection and complex structure construction.
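
The abstract says that threading alignments and HTE results are combined with naive Bayesian classifiers but gives no formula. The following is only a minimal sketch of the general naive Bayes idea, in which each evidence source contributes a likelihood ratio that multiplies the prior odds of interaction. The function names, the prior, and the two likelihood ratios are hypothetical values chosen for illustration; they are not taken from the paper and do not describe the authors' actual features or training procedure.

```python
import math


def posterior_odds(prior_odds, likelihood_ratios):
    """Combine evidence sources assumed conditionally independent (naive Bayes).

    Each likelihood ratio is P(evidence | interacting) / P(evidence | not interacting).
    Summing logs avoids underflow when many sources are combined.
    """
    log_odds = math.log(prior_odds)
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    return math.exp(log_odds)


def odds_to_probability(odds):
    """Convert odds to a probability in [0, 1)."""
    return odds / (1.0 + odds)


# Hypothetical inputs, for illustration only (not values from the paper).
prior = 0.01 / 0.99      # assumed prior odds that a random protein pair interacts
threading_lr = 20.0      # assumed likelihood ratio for a strong dimeric threading hit
hte_lr = 5.0             # assumed likelihood ratio for a positive HTE observation

odds = posterior_odds(prior, [threading_lr, hte_lr])
prob = odds_to_probability(odds)
print(f"posterior odds = {odds:.3f}, posterior probability = {prob:.3f}")
```

Under these assumed numbers, neither source alone lifts a pair above even odds, but the two together do, which is the intuition behind combining a noisy experimental signal with a structure-based one.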

Updated: 2020-10-19