Modeling protein structures with the coarse-grained UNRES force field in the CASP14 experiment
Introduction
Computer simulations of proteins and their complexes play an increasing role in biophysics, biochemistry, and biomedicine, with practical applications such as drug design [1–4]. Despite the development of high-speed computers, especially the ANTON supercomputer dedicated to running Molecular Dynamics (MD) simulations [5,6] and Graphics Processing Unit (GPU)-based computers [7,8], long-time all-atom simulations are possible only for relatively small proteins of about 100 residues. Therefore, multiscale modeling is usually the method of choice, in which different parts of the system are treated at appropriate resolution. An important part of the multiscale approach is coarse-grained modeling, in which groups of atoms are merged into extended interaction sites [9]. Owing to the elimination of the fast degrees of freedom, coarse graining enables us to extend the time scale of simulations by several orders of magnitude [10].
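As a minimal illustration of merging groups of atoms into interaction sites, the sketch below replaces each group of atoms by its center of mass. This is a generic mapping chosen for simplicity and is not the UNRES site definition; the function names, group labels, and toy coordinates are assumptions made for the example.

```python
def center_of_mass(atoms):
    """Return the mass-weighted mean position of a list of (mass, (x, y, z)) atoms."""
    total_mass = sum(mass for mass, _ in atoms)
    return tuple(
        sum(mass * pos[k] for mass, pos in atoms) / total_mass
        for k in range(3)
    )


def coarse_grain(groups):
    """Map each named group of atoms to a single interaction site (hypothetical helper)."""
    return {name: center_of_mass(atoms) for name, atoms in groups.items()}


# Toy input: two atom groups with made-up masses and coordinates.
toy_groups = {
    "site_A": [(12.0, (0.0, 0.0, 0.0)), (12.0, (2.0, 0.0, 0.0))],
    "site_B": [(1.0, (0.0, 0.0, 0.0)), (3.0, (4.0, 0.0, 0.0))],
}
sites = coarse_grain(toy_groups)
```

Each group of atoms collapses to one site, so the number of interacting centers, and with it the number of fast degrees of freedom, drops accordingly.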
The quality of the force field is a key issue, especially where long-term protein dynamics and protein-protein or protein-ligand binding are concerned. This is especially important when using coarse-grained models, because reduction of representation usually implies more sophisticated interaction potentials [11,12], which are not so easy to parameterize. In particular, the site-site interaction potentials of most coarse-grained force fields are usually too “sticky”, which results in modeled structures that are too compact [13–17]. Assessment of the force field is, therefore, necessary. One of the hardest tests is the ability of a force field to reproduce the native structures of proteins. The Community Wide Experiments on the Critical Assessment of Techniques for Protein Structure Prediction (CASP), conducted every other year since 1994, enable protein-structure modelers to test their approaches on proteins whose structures had not been solved at prediction time and, consequently, provide an impartial and unbiased test. Knowledge-based approaches, especially the AlphaFold approach developed by DeepMind [18,19], which scored tremendous success in CASP13 and effectively solved the problem of protein-structure prediction for single-domain proteins in CASP14, have the upper hand in protein-structure prediction as such. Nevertheless, the CASP experiments are an ideal means to test force fields.
In recent years, we have been developing the physics-based UNited RESidue (UNRES) force field for studying the structure, dynamics, and thermodynamics of proteins and protein complexes [20–26]. UNRES is a highly reduced physics-based model with only two interaction sites per residue. Its effective energy function is defined as a cluster-cumulant expansion of the potential of mean force of a protein in water [21,25]. The solvent is implicit in UNRES and interactions with it are included in the effective potentials. Recently, we developed a scale-consistent theory of force-field derivation, owing to which the atomic details of a system are implicitly included in the resulting coarse-grained effective energy function [12,25]. This feature solves part of the problem of force-field “stickiness” pointed out in ref. [17], because the force field contains explicit terms that couple the backbone-local and backbone-electrostatic conformational states, which prevent local chain fragments from becoming too compact [26]. On the other hand, part of the “stickiness” problem in UNRES, as well as in other coarse-grained force fields, is likely to be caused by the absence of explicit terms that account for the transfer of an interaction site from the solvent to the protein interior. This problem is now being addressed in our laboratory. The upgraded UNRES force field was calibrated with 9 training proteins of various structural classes [26]; this recent version has been termed the NEWCT-9P force field. This force field has already been tested in CASP13, demonstrating significant improvement over the previous versions of UNRES in ab initio as well as bioinformatics- and data-assisted prediction [27–31]. We used our prediction protocol [27], which is based on Multiplexed Replica Exchange Molecular Dynamics (MREMD) [32] simulations with UNRES [33].
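The exchange step at the heart of replica-exchange simulations can be sketched as follows. The sketch uses the standard Metropolis criterion for a temperature-independent energy function, which is a simplifying assumption: the reference list includes work on the temperature dependence of the UNRES effective energy function, so the criterion used in practice may differ. Multiplexing (running several replicas at each temperature) is not shown, and the function names and energy values are hypothetical.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def exchange_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis probability of swapping the conformations of two replicas."""
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    # Log-ratio of the joint Boltzmann weights after and before the swap.
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchange(energies, temps, i, j, rng=random):
    """Swap the energies (standing in for conformations) of replicas i and j if accepted."""
    p = exchange_probability(energies[i], temps[i], energies[j], temps[j])
    if rng.random() < p:
        energies[i], energies[j] = energies[j], energies[i]
        return True
    return False


# Toy example: the colder replica holds the higher-energy conformation,
# so the swap is always accepted.
energies = [-100.0, -110.0]  # kcal/mol, made-up values
temps = [300.0, 320.0]       # K
accepted = attempt_exchange(energies, temps, 0, 1)
```

Swaps that move a lower-energy conformation to the colder replica are always accepted, while the reverse move is accepted only with a Boltzmann-weighted probability; this is what lets conformations trapped at low temperature escape by visiting higher temperatures.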
The results of CASP13 demonstrated that UNRES already performs reasonably well, the main problem, compared to knowledge-based approaches, being its coarser resolution. Therefore, before CASP14, we focused on developing UNRES to treat very large systems and not on upgrading the force field. A concise description of UNRES and its recent modifications is provided in section S-1 of the Supplementary Material.
In this paper, we report the performance of UNRES extended to large protein systems with the NEWCT-9P force field in the CASP14 blind-prediction experiment in the ab initio, contact-assisted, and template-assisted modes. In each mode, we also carried out Nuclear Magnetic Resonance (NMR)- and Small-Angle X-ray Scattering (SAXS)-data-assisted predictions, although very few such targets were available this time. In what follows, we describe the prediction methodology that we used in CASP14 (section 2) and the results obtained (section 3). The conclusions from the study are summarized in section 4.
Section snippets
Prediction protocol
We used the protocols for protein-structure prediction without bioinformatics input [27,30] and with consensus fragments derived from server predictions [28,34]. The procedure consists of five stages. In stage 1, restraints to assist prediction are prepared. The restraint-penalty functions have been described in detail in our earlier work [30,31,35] and are also summarized in section S-2 of the Supplementary Material. In stage 2, the UNRES implementation [22,33] of MREMD [36], with
Results and discussion
All three UNRES-based groups submitted models of the 63 regular single-chain targets that were not cancelled. We processed a total of 16 (out of 22) oligomeric targets, the UNRES group submitting models of 15, the UNRES-contact group of 10, and the UNRES-template group of 16 of them. All three groups submitted models of all 3 data-assisted targets, 5 models of each; however, the SAXS-assisted target (S1063) was cancelled and the models of the first NMR-assisted target (N1077) were not
Conclusions
We tested the ability of the UNRES force field to predict protein structures in the unassisted and bioinformatics-assisted modes in the CASP14 experiment. In this CASP experiment, the number of experimental-data-assisted targets was very limited (1 target with structure solved by the completion of CASP14) and, therefore, UNRES could not be reasonably tested in this category. UNRES was upgraded to handle very large targets, which enabled us to simulate the 15,160-residue 20-mer (H1081) and
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This work was supported by grants UMO-2017/25/B/ST4/01026 (to AL), UMO-2017/27/B/ST4/00926 (to AKS), UMO-2017/26/M/ST4/00044 (to CC), UMO-2018/30/E/ST4/00037 (to SAS), and UMO-2018/31/N/ST4/01677 (to KKB) from the National Science Center of Poland (Narodowe Centrum Nauki). Computational resources were provided by (a) the Interdisciplinary Center of Mathematical and Computer Modeling (ICM) at the University of Warsaw under grants No. GA76-11, GB71-18, and GA76-17, (b) the Centre of Informatics -
References (66)
Protein-protein docking: from interaction to interactome. Biophys. J. (2014)
Mist: a simple and efficient molecular dynamics abstraction library for integrator development. Comput. Phys. Commun. (2019)
Scale-consistent approach to the derivation of coarse-grained force fields for simulating structure, dynamics, and thermodynamics of biopolymers
Tuning the hydrophobicity of a coarse grained model of 1,2-dipalmitoyl-sn-glycero-3-phosphatidylcholine using the experimental octanol-water partition coefficient. J. Mol. Liq. (2020)
Use of the UNRES force field in template-based prediction of protein structures and the refinement of server models: test with CASP12 targets. J. Mol. Graph. Model. (2018)
Evaluation of the scale-consistent UNRES force field in template-free prediction of protein structures in the CASP13 experiment. J. Mol. Graph. Model. (2019)
Multiplexed-replica exchange molecular dynamics method for protein folding simulation. Biophys. J. (2003)
A coarse-grained Langevin molecular dynamics approach to protein structure reproduction. Chem. Phys. Lett. (2005)
A coarse-grained Langevin molecular dynamics approach to de novo protein structure prediction. Biochem. Biophys. Res. Commun. (2008)
Parallel tempering algorithm for conformational studies of biological molecules. Chem. Phys. Lett. (1997)
Integration of QUARK and I-TASSER for ab initio protein structure prediction in CASP11. Proteins
A completely reimplemented MPI bioinformatics toolkit with a new HHpred server at its core. J. Mol. Biol.
Molecular dynamics simulations and drug discovery. BMC Biol.
A review on applications of computational methods in drug screening and design. Molecules
Fast identification of possible drug treatment of coronavirus disease-19 (COVID-19) through computational drug repurposing study. J. Chem. Inf. Model.
Anton, a special-purpose machine for molecular dynamics simulation. Commun. ACM
Structure and dynamics of an unfolded protein examined by molecular dynamics simulation. J. Am. Chem. Soc.
Accelerating molecular dynamic simulation on graphics processing units. J. Comput. Chem.
Coarse-grained protein models and their applications. Chem. Rev.
Molecular dynamics with the united-residue (UNRES) model of polypeptide chains. II. Langevin and Berendsen-bath dynamics and tests on model α-helical systems. J. Phys. Chem. B
Coarse-Graining of Condensed Phase and Biomolecular Systems
Toward optimized potential functions for protein-protein interactions in aqueous solutions: osmotic second virial coefficient calculations using the MARTINI coarse-grained force field. J. Chem. Theor. Comput.
Overcoming the limitations of the MARTINI force field in simulations of polysaccharides. J. Chem. Theor. Comput.
The lipophilicity of coarse-grained cholesterol models. J. Chem. Inf. Model.
Recent open issues in coarse grained force fields. J. Chem. Inf. Model.
Improved protein structure prediction using potentials from deep learning. Nature
‘It will change everything’: AI makes gigantic leap in solving protein structures. Nature
A united-residue force field for off-lattice protein-structure simulations. I. Functional forms and parameters of long-range side-chain interaction potentials from protein crystal data. J. Comput. Chem.
Cumulant-based expressions for the multibody terms for the correlation between local and electrostatic interactions in the united-residue force field. J. Chem. Phys.
Modification and optimization of the united-residue (UNRES) potential energy function for canonical simulations. I. Temperature dependence of the effective energy function and tests of the optimization method with single training proteins. J. Phys. Chem. B
Simulation of protein structure and dynamics with the coarse-grained UNRES force field
A unified coarse-grained model of biological macromolecules based on mean-field multipole-multipole interactions. J. Mol. Model.
A general method for the derivation of the functional forms of the effective energy terms in coarse-grained energy functions of polymers. I. Backbone potentials of coarse-grained polypeptide chains. J. Chem. Phys.