Brought to you by:
Paper

On the vibrational free energy of hydrated proteins

Published 10 March 2021 © 2021 IOP Publishing Ltd
, , Citation Yves-Henri Sanejouand 2021 Phys. Biol. 18 036003 DOI 10.1088/1478-3975/abdc0f

1478-3975/18/3/036003

Abstract

When the hydration shell of a protein is filled with at least 0.6 gram of water per gram of protein, a significant anti-correlation between the vibrational free energy and the potential energy of energy-minimized conformers is observed. This means that low potential energy, well-hydrated, protein conformers tend to be more rigid than high-energy ones. On the other hand, in the case of CASP target 624, when its hydration shell is filled, a significant energy gap is observed between the crystal structure and the best conformers proposed during the prediction experiment, strongly suggesting that including explicit water molecules may help identifying unlikely conformers among good-looking ones.

Export citation and abstract BibTeX RIS

1. Introduction

The protein folding problem [13] is important both on a practical and on a theoretical side. On a practical side, solving this problem would allow us to determine structures that can not be obtained either by x-ray crystallography, nuclear magnetic resonance or cryoelectron microscopy. On a theoretical side, it would definitely demonstrate our ability to model the free energy surface of a protein accurately enough, at least as far as the likelihood of its major conformers is concerned.

Some impressive successes have already been reported. For instance, during the 13th CASP experiment, in the case of the best model proposed for a 354 residue domain of a xylan acetyltransferase with no known homologue in the PDB, most of the structure core was modeled to a Cα accuracy of better than 2 Å [4]. On the other hand, using standard classical forcefields and straightforward, sub-millisecond long molecular dynamics (MD) simulations, the group of David Shaw was able to fold 12 small fast-folding proteins with, for half of them, a Cα accuracy of better than 1.5 Å [5].

However, with the latter, so-called ab initio approach [6, 7], structural stability problems started to pop up when larger proteins were considered [8], casting doubts on the accuracy of the underlying potential energy surface.

Such problems could be due to the specific forcefield used in this later work, namely, the CHARMM22* forcefield [9, 10]. However, they could also come from the too crude description of electrostatical interactions assumed in most classical forcefields used to study macromolecular systems with, in particular, no explicit contribution of atomic and bond polarizabilities [11]. Also, standard water models could prove too simple to allow both for a correct behavior of bulk water and a correct description of water-protein interactions [12].

Work is indeed in progress along these lines. For instance, AMOEBA [13], a polarizable force field, has recently been parallelized [14]. On the other hand, TIP4P-D, a new water model [15], has been developed, the Amber 99SB-ILDN protein forcefield being optimized accordingly, allowing to substantially improve on the state-of-the-art accuracy for simulations of disordered proteins without sacrificing accuracy for folded proteins [16].

In the present study, it is however assumed that such efforts will not prove enough, and this because of another neglected term, namely, the vibrational free energy of the system which, in principle, should be taken into account for each point of the potential energy surface.

A number of studies have already pointed out that this term may prove important, noteworthy for determining thermodynamical quantities involved in protein folding [17] or in protein–protein [1821] and protein-ligand [2224] recognition processes.

The present study confirms that this term can indeed have a significant contribution, at least when the whole hydration shell of the protein is considered. In particular, if it were taken into account, low potential energy conformers would be less likely.

2. Methods

2.1. MD simulations

In order to sample the configurational space, MD simulations were performed with Gromacs [25] version 4.6.3, using the Amber 99SB-ILDN forcefield [26] and the TIP3P water model [27]. Short range electrostatic and van der Waals interactions were cut off at 12 Å, long-range electrostatics being handled through the particle mesh Ewald method [28].

Each considered protein was embedded in a water box, its boundaries being at least 10 Å away from the protein. Sodium and chloride ions were added so as to neutralize the charge of the system and to reach a salt concentration of 150 mM L−1, as often done [2931].

The system was then relaxed using steepest descent minimization, with harmonic restraints (force constant of 1000 kJ mol−1 nm−1) on protein heavy atoms, until a maximum force threshold of 1000 kJ mol−1 nm−1 was reached. The solvent was next equilibrated at 300 K, first during 500 ps with a control of both volume and temperature, using Berendsen thermostat [32] and a coupling constant of 0.1 ps, then during 500 ps with a control of both pressure and temperature, using Parrinello–Rahman barostat [33] and a coupling constant of 2 ps. Finally, restraints were removed and 100 ns of simulation at 300 K were performed, with a timestep of 2 fs, all bonds being constrained with the LINCS algorithm [34].

2.2. Vibrational free energy

From each simulation, 100 snapshots were picked (one per nanosecond), the nclos closest water molecules were retained (nclos ranging between 0 and 1000) and the corresponding hydrated conformers were energy-minimized with Tinker [35] version 6.2, using the Amber 99SB forcefield, L-BFGS minimization being performed until a gradient of 0.001 kcal mole−1 Å−1 was reached (with the minimize program), followed by truncated Newton minimization down to a gradient of 0.000 01 kcal mole−1 Å−1 (with the newton program), such a strict convergence criterion being required in order to obtain the expected six zero-frequencies [3638].

For each energy-minimized conformer, normal mode frequencies, νi being the ith one, were then obtained by diagonalizing the mass-weighted Hessian matrix (with the vibrate program), allowing to obtain Avib, the vibrational free energy, which is such that [18]:

Equation (1)

where ndof is the number of degrees of freedom (ndof = 3N − 6, N being the number of atoms of the system), h and k, respectively, the Planck and Boltzmann constants, T, the temperature, being set to 300 K.

In several previous studies (e.g. [19, 21, 22, 39]), only the contribution of the vibrational entropy to Avib was considered. It is given by [18]:

Equation (2)

while the vibrational enthalpy is a follows:

Equation (3)

the first sum corresponding to the zero-point energy.

3. Results

3.1. BPTI conformers

As shown in figure 1(a), when 100 conformers of the BPTI picked from a 100 ns MD trajectory (see section 2.1) are energy-minimized, the standard deviation of their vibrational free energy (equation (1)) is much lower (4.2 kcal mole−1) than the standard deviation of their potential energy (18.9 kcal mole−1). However, when the energy of each BPTI conformer is minimized together with the 1000 water molecules that are the closest to this conformer, that is, all those that are at most ≈8 Å away from a protein heavy atom, a significant anti-correlation (−0.79) between the vibrational free energy and the potential energy of the energy-minimized conformers is observed (see figure 1(b)), meaning that low-energy hydrated conformers of BPTI are on average significantly more rigid than high energy ones.

Figure 1.

Figure 1. Vibrational free energy, as a function of the potential energy of 100 energy-minimized bovine pancreatic trypsin inhibitor (BPTI) conformers. (a) When the energy-minimization is performed in vacuo. (b) When the energy-minimization is performed with the 1000 closest water molecules. Dashed lines: linear fit.

Standard image High-resolution image

In agreement with previous studies [18, 19, 22], and as illustrated in figure 2 by comparing the highest and lowest vibrational free energy conformers of BPTI, most of the variation of the vibrational free energy is due to modes with frequencies below 1000 cm−1. Note however that in this case, while the contribution of the vibrational entropy (equation (2)) is indeed dominant, the contribution of the vibrational enthalpy (equation (3)) is far from being negligible (26% of the difference, in the case of the conformers hydrated with 1000 water molecules), coming mostly from modes with frequencies ranging between 500 cm−1 and 1000 cm−1.

Figure 2.

Figure 2. Cumulative energy difference between the highest and lowest vibrational free energy BPTI conformers, as a function of the mode mean frequency of both conformers. (a) With no water molecule; (b) when their 1000 closest water molecules are taken into account. Plain line: vibrational free energy. Dashed line: contribution of the vibrational entropy (−TΔSvib). Dotted line: zero-point energy.

Standard image High-resolution image

On the other hand, the zero-point energy (equation (3)) represents 48% of the vibrational free energy difference between the pair of hydrated conformers, in line with the hypothesis that this term is a key component of the energy of macromolecular systems [46, 47], like in the case of small molecules [4850].

3.2. Water hydration threshold

In order to assess the importance of the anti-correlation between the vibrational free energy and the potential energy of energy-minimized conformers, a least-square fit of the data was performed, by assuming that:

where Vmin is the minimized potential energy of the considered conformer, a being the slope of the vibrational free energy—potential energy relationship.

Together with the BPTI, four other proteins of various folds and sizes were studied (see figure 3), retaining the 0, 10, 50, 100, 150, 200, 300, 400, 500, 600, 700, 800, 900 or 1000 closest water molecules of each MD conformer for the analysis (see section 2.2).

Figure 3.

Figure 3. Proteins considered. Top left: BPTI (PDB 4PTI [40]). Top right: ubiquitin (PDB 4XOF [41]). Center: tryptophan-cage (PDB 2JOF [42]). Bottom left: thioredoxin (PDB 1ERT [43]). Bottom right: hen egg-white lysozyme (PDB 2VB1 [44]). Drawn with Chimera [45].

Standard image High-resolution image

As shown in figure 4, like for BPTI (figure 1(a)), when no water molecule is retained, the slope of the vibrational free energy—potential energy relationship has a small value, ranging between -0.03, for BPTI, and -0.15, for the lysozyme. On the other hand, when there is more than 3.8 water molecules per amino-acid residue in the considered system, the slope is always below −0.3, confirming that low-energy conformers are on average more rigid than high-energy ones. Moreover, the fact that the value of the slope does not seem to vary significantly when there is more than 3.8 water molecules per residue suggests that this rigidity is mostly due to the interactions between the protein and the water molecules belonging to its hydration shell. However, as illustrated in figure 5 in the case of the lowest (7416 kcal mole−1) and highest (7466 kcal mole−1) vibrational free energy conformers of BPTI, when ≈3.8 water molecules per residue are considered, the network of hydrogen bonds established between the water molecules all around the protein is almost continuous.

Figure 4.

Figure 4. Slope of the vibrational free energy—potential energy linear relationship, as a function of the number of close water molecules taken into account. The vertical (dashed) line corresponds to 0.6 gram of water per gram of protein. Filled circles: BPTI. Filled diamonds: ubiquitin. Stars: tryptophan-cage. Open circles: thioredoxin. Open diamonds: lysozyme.

Standard image High-resolution image
Figure 5.

Figure 5. Lowest (left) and highest (right) vibrational free energy conformers of BPTI, when their 200 closest water molecules are taken into account. Drawn with Chimera [45].

Standard image High-resolution image

Interestingly, 3.8 water molecules per residue corresponds to 0.6 gram of water per gram of protein, that is, the order of magnitude of the amount of tightly bound water molecules found in the hydration shell of proteins [51] by various techniques like calorimetry [52] or microwave dielectric measurements [53, 54], in line with the idea that such water molecules are structural ones, that is, that they actually belong to the protein structure [55], even though their mean residence time can be quite short [5658], namely, on the sub-nanosecond time-scale, in spite of a few notable exceptions [5961].

3.3. Conformers of CASP target 624

In order to assess if the vibrational free energy could prove helpful for pinpointing physically relevant protein conformers, eight models proposed for target 624 during the 9th CASP experiment [62] were analyzed as above, together with the actual crystallographic structure (see figure 6). Target 624 was chosen because it proved possible to refine the two best models proposed (ts172-1 and ts172-4) through MD simulations on the sub-millisecond time-scale, the Cα root-mean-square (RMSD) to the native structure decreasing from ≈5 Å down to about 1 Å [8].

Figure 6.

Figure 6. Models proposed for CASP target 624. (a) ts172-2, (b) ts172-3, (c) ts172-5, (d) ts172-1, (e) crystal structure (chain A of PDB 3NRL), (f) ts172-4, (g) ts236-2, (h) ts484-4, (i) ts345-1. Drawn with Chimera [45].

Standard image High-resolution image

The five models proposed by the group of David Baker (group 172) were included in the present analysis, as well as three models with an RMSD to the native structure of around 7 Å, proposed by three other groups (ts236-2, ts345-1, ts484-4). As shown in figure 7, except ts172-5, all proposed models remain relatively stable during a 100 ns MD simulation, with an RMSD to the initial structure ranging between 2 and 4 Å. Note however that the crystal structure is, according to this criterion, obviously the more stable one, with an RMSD staying below 1.5 Å during the whole simulation.

Figure 7.

Figure 7. Cα root-mean-square displacement, as a function of time, of nine models of CASP target 624. Filled diamonds: crystal structure. Filled squares: ts172-1. Open squares: ts172-2. Open circles: ts172-3. Filled circles: ts172-4. Open diamonds: ts172-5. Pluses: ts236-2. Crosses: ts345-1. Stars: ts484-4.

Standard image High-resolution image

As shown in figure 8, when no water molecule is retained (left column), several models have an average energy similar to, or even better than the average energy obtained starting from the crystal structure, namely: ⟨Vmin⟩ = −0.01 kcal mole−1, for ts172-1, ⟨Vmin + Avib⟩ = −7.5 kcal mole−1, also for ts172-1, +3.3 kcal mole−1, for ts345-1 or +6.1 kcal mole−1, for ts172-2, the averages being obtained with 100 energy-minimized MD conformers (section 2.2), the zero corresponding to the average value obtained with the crystal structure.

Figure 8.

Figure 8. Average energy of eight models of CASP target 624, as a function of the number of close water molecules taken into account. (a) average potential energies obtained after energy-minimization. (b) Average potential energies + vibrational free energies (VFE). The zero (lower dashed line) corresponds to the average value obtained starting from the crystal structure. The upper dashed line, at +25 kcal mole−1, is a guide to the eye. Each average (horizontal bar) is obtained with 100 energy-minimized MD conformers. Averages over +100 kcal mole−1 are not shown.

Standard image High-resolution image

When water molecules close to the conformers are included in the analysis, the gap between the average energy of the proposed models and the average energy obtained starting from the crystal structure increases dramatically, being around 25 kcal mole−1 when the vibrational free energy is considered on top of the potential energy, when nclos ⩾ 200 (see figure 8(b)). However, when only the potential energy is taken into account, the gap is also large (see figure 8(a)), the smallest values being ⟨Vmin⟩ = +9.5 kcal mole−1, for ts484-4 (with nclos = 400) and +14.1 kcal mole−1, also for ts484-4 (with nclos = 300).

4. Conclusion

When the hydration shell of a protein is filled with at least 0.6 gram of water per gram of protein, a significant anti-correlation between the vibrational free energy and the potential energy of energy-minimized conformers is observed (figure 1(b)), the slope of the corresponding linear relationship ranging between −0.3 and −0.4 (figure 4), depending upon the protein and the number of close water molecules taken into account.

This means that low-energy, well-hydrated, conformers tend to be more rigid than high-energy ones. This also means that if the vibrational free energy of each point of the potential energy surface had been taken into account during the MD simulations, the low potential energy conformers, being less likely, would have been observed less frequently.

On the other hand, for CASP target 624, when its hydration shell is filled, a significant energy gap is observed between the crystal structure and the best conformers that were proposed during the prediction experiment (figure 8), strongly suggesting that including explicit water molecules may help identifying unlikely conformers among good-looking ones. However, when the vibrational free energy is added on top of the potential energy, the gap increases in a few cases only, meaning that, for pinpointing unlikely conformers, the average potential energy of energy-minimized, well hydrated, conformers is a powerful enough criterion, at least in the case of CASP target 624.

The present work rely on the hypothesis that studying a sample of potential energy minima, that is, the inherent structures of the system [63], can yield a fair approximation for its average vibrational energy. Though the results thus obtained look promising, such an hypothesis can not be taken for granted. A possible way to check it would be to take advantage of the fact that VFE can be obtained from the vibrational density of states function [64], while the latter can for instance be obtained through the Fourier transform of the velocity autocorrelation function. Such an approach could indeed allow to obtain the average vibrational free energy of an hydrated protein directly from a standard, room temperature, MD simulation [65].

Acknowledgments

I thank Jean Durup for stimulating discussions on this topic, some years ago.

Data availability statement

The data that support the findings of this study are available upon reasonable request from the authors.

Please wait… references are loading.
10.1088/1478-3975/abdc0f