Screening membraneless organelle participants with machine-learning models that integrate multimodal features.
Proceedings of the National Academy of Sciences of the United States of America Pub Date : 2022-06-10 , DOI: 10.1073/pnas.2115369119
Zhaoming Chen 1 , Chao Hou 1 , Liang Wang 2 , Chunyu Yu 1, 3 , Taoyu Chen 1 , Boyan Shen 1 , Yaoyao Hou 4 , Pilong Li 2 , Tingting Li 1

Protein self-assembly is one of the formation mechanisms of biomolecular condensates. However, most phase-separating systems (PS) demand multiple partners in biological conditions. In this study, we divided PS proteins into two groups according to the mechanism by which they undergo PS: PS-Self proteins can self-assemble spontaneously to form droplets, while PS-Part proteins interact with partners to undergo PS. Analysis of the amino acid composition revealed differences in the sequence pattern between the two protein groups. Existing PS predictors, when evaluated on two test protein sets, preferentially predicted self-assembling proteins. Thus, a comprehensive predictor is required. Herein, we propose that properties other than sequence composition can provide crucial information in screening PS proteins. By incorporating phosphorylation frequencies and immunofluorescence image-based droplet-forming propensity with other PS-related features, we built two independent machine-learning models to separately predict the two protein categories. Results of independent testing suggested the superiority of integrating multimodal features. We performed experimental verification on the top-scored proteins DHX9, Ki-67, and NIFK. Their PS behavior in vitro revealed the effectiveness of our models in PS prediction. Further validation on the proteome of membraneless organelles confirmed the ability of our models to identify PS-Part proteins. We implemented a web server named PhaSePred (http://predict.phasep.pro/) that incorporates our two models together with representative PS predictors. PhaSePred displays proteome-level quantiles of different features, thus profiling PS propensity and providing crucial information for identification of candidate proteins.
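The abstract notes that PhaSePred reports proteome-level quantiles of each feature. As a rough illustration of what such a quantile means, the sketch below ranks one protein's feature value against the values of all proteins in a proteome. The function name `proteome_quantile`, the mid-rank handling of ties, and the example numbers are assumptions made for illustration only; they are not taken from the paper or from the PhaSePred implementation.

```python
from bisect import bisect_left, bisect_right


def proteome_quantile(value, proteome_values):
    """Return the mid-rank quantile (0..1) of `value` within `proteome_values`.

    Hypothetical helper: the fraction of proteome values below `value`,
    plus half of the fraction that tie with it.
    """
    ordered = sorted(proteome_values)
    if not ordered:
        raise ValueError("proteome_values must not be empty")
    below = bisect_left(ordered, value)              # count of values strictly smaller
    below_or_equal = bisect_right(ordered, value)    # count of values smaller or equal
    return (below + below_or_equal) / (2 * len(ordered))


# Made-up feature scores for a toy "proteome" of five proteins.
toy_scores = [0.10, 0.25, 0.40, 0.70, 0.95]
print(proteome_quantile(0.70, toy_scores))
```

A protein whose feature value sits near quantile 1.0 is unusually high for that feature relative to the rest of the proteome, which is the kind of profile the abstract describes as useful for picking candidate proteins.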

Updated: 2022-06-10