Protein storytelling through physics
Science (IF 56.9), Pub Date: 2020-11-26, DOI: 10.1126/science.aaz3041
Emiliano Brini 1, Carlos Simmerling 1,2, Ken Dill 1,2,3

Understanding what drives proteins

Computational molecular physics (CMP) aims to leverage the laws of physics to understand not just static structures but also the motions and actions of biomolecules. Applying CMP to proteins has required either simplifying the physical models or running simulations that are shorter than the time scale of the biological activity. Brini et al. reviewed advances that are moving CMP to time scales that match biological events such as protein folding, ligand unbinding, and some conformational changes. They also highlight the role of blind competitions in driving the field forward. New methods such as deep learning approaches are likely to make CMP an increasingly powerful tool in describing proteins in action.

Science, this issue p. eaaz3041

BACKGROUND

Understanding biology, particularly at the level of actionable drug discovery, is often a matter of developing accurate stories about how proteins work. This requires understanding the physics of the system, and physics-based computer modeling is a prime tool for that. However, the computational molecular physics (CMP) of proteins has previously been much too expensive and slow. A large fraction of public supercomputing resources worldwide is currently running CMP simulations of biologically relevant systems. We review here the history and status of this large and diverse scientific enterprise. Among other things, protein modeling has driven major computer hardware advances, such as IBM's Blue Gene and DE Shaw's Anton computers. Further, protein modeling has advanced rapidly over 50 years, even slightly faster than Moore's law. We also review an interesting scientific social construct that has arisen around protein modeling: community-wide blind competitions. They have transformed how we test, validate, and improve our computational models of proteins.

ADVANCES

For 50 years, two approaches to computer modeling have been mainstays for developing stories about protein molecules and their biological actions. (i) Inferences from structure-property relations: Based on the principle that a protein's action depends on its shape, it is possible to use databases of known proteins to learn about unknown proteins. (ii) Computational molecular physics uses force fields of atom-atom interactions, sampled by molecular dynamics (MD), to develop biological action stories that satisfy principles of chemistry and thermodynamics.

CMP has traditionally been computationally costly, limited to studying only simple actions of small proteins. But CMP has recently advanced enormously. (i) Force fields and their corresponding solvent models are now sufficiently accurate at capturing the molecular interactions, and conformational searching and sampling methods are sufficiently fast, that CMP is able to model, fairly accurately, protein actions on time scales longer than microseconds, and sometimes milliseconds. So, we are now accessing important biological events, such as protein folding, unbinding, allosteric change, and assembly. (ii) Just as car races do for auto manufacturers, communal blind tests such as protein structure-prediction events are giving protein modelers a shared evaluation venue for improving our methods. CMP methods are now competing and often doing quite well. (iii) New methods are harnessing external information, like experimental structural data, to accelerate CMP, notably while preserving proper physics.

What are we learning? For one thing, a long-standing hypothesis is that proteins fold by multiple different microscopic routes, a story that is too granular to learn from experiments alone. CMP recently affirmed this principle while giving accurate and testable microscopic details, protein by protein. In addition, CMP is now contributing to physico-chemical drug design. Structure-based methods of drug discovery have long been able to discern what small-molecule drug candidates might bind to a given target protein and where on the protein they might bind. However, such methods don't reveal some all-important physical properties needed for drug discovery campaigns: the affinities and the on- and off-rates of the ligand binding to the protein. CMP is beginning to compute these properties accurately. A third example is shown in the figure. It shows the spike protein of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of today's coronavirus disease 2019 (COVID-19) pandemic. A large, hinge-like movement of this sizable protein is the critical action needed for the virus to enter and infect the human cell. The only way to see the details of this motion (to attempt to block it with drugs) is by CMP. The figure shows CMP simulation results of three dynamical states of this motion.

OUTLOOK

A cell's behavior is due to the actions of its thousands of different proteins. Every protein has its own story to tell. CMP is a granular and principled tool that is able to discover those stories. CMP is now being tested and improved through blind communal validations. It is attacking ever larger proteins, exploring increasingly bigger and slower motions, and with ever more accurate physics. We are reaching a physical understanding of biology at the microscopic level as CMP reveals causations and forces, step-by-step actions in space and time, conformational distributions along the way, and important physical quantities such as free energies, rates, and equilibrium constants.

Figure: CMP modeling of COVID-19 infecting the human cell. SARS-CoV-2 spike glycoprotein (green, with its glycan shield in yellow) attaching to the human angiotensin-converting enzyme 2 (ACE2) receptor protein (purple) through its spike receptor-binding domain (red). (Left) The receptor binding domain (RBD) is hidden. (Middle) The RBD is open and accessible. (Right) The RBD binds human ACE2 receptor. This is followed by a cascade of larger conformational changes in the spike protein, leading to viral fusion to the human host cell. Credit: Lucy Fallon

Every protein has a story: how it folds, what it binds, its biological actions, and how it misbehaves in aging or disease. Stories are often inferred from a protein's shape (i.e., its structure). But increasingly, stories are told using computational molecular physics (CMP). CMP is rooted in the principled physics of driving forces and reveals granular detail of conformational populations in space and time. Recent advances are accessing longer time scales, larger actions, and blind testing, enabling more of biology's stories to be told in the language of atomistic physics.
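
The abstract names affinities, on- and off-rates, free energies, and equilibrium constants as the physical quantities CMP aims to compute. As a minimal sketch of how these quantities relate to one another, assuming simple two-state binding with a 1 M standard concentration, the Python below uses the standard relations K_d = k_off / k_on and ΔG° = RT ln(K_d / c°). The function names and the rate values are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
import math

# Molar gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def dissociation_constant(k_on, k_off):
    """Return K_d in M from k_on (1/(M*s)) and k_off (1/s).

    Assumes simple two-state binding: P + L <-> PL.
    """
    if k_on <= 0 or k_off <= 0:
        raise ValueError("rate constants must be positive")
    return k_off / k_on


def binding_free_energy(kd, temperature=298.15, standard_conc=1.0):
    """Return the standard binding free energy in kcal/mol.

    Uses dG = R * T * ln(K_d / c0); more negative means tighter binding.
    """
    if kd <= 0:
        raise ValueError("K_d must be positive")
    return R_KCAL * temperature * math.log(kd / standard_conc)


def residence_time(k_off):
    """Return the mean lifetime of the bound complex in seconds (1 / k_off)."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return 1.0 / k_off


# Hypothetical example values, not from the paper.
k_on = 1.0e6    # association rate, 1/(M*s)
k_off = 1.0e-3  # dissociation rate, 1/s

kd = dissociation_constant(k_on, k_off)  # about 1 nM
dg = binding_free_energy(kd)             # roughly -12.3 kcal/mol at 298.15 K
tau = residence_time(k_off)              # about 1000 s

print(f"K_d = {kd:.2e} M, dG = {dg:.2f} kcal/mol, residence time = {tau:.0f} s")
```

Two ligands with the same K_d can have very different residence times, which is one reason the rates matter in addition to the affinity.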

Updated: 2020-11-26