Introduction

Humans and other animals, as well as plants, are plagued by infections caused by viruses. These are parasites that cannot reproduce by themselves and that are incapable of metabolic activity. Instead, after viruses infect cells, they alter cellular molecular machinery so that it produces new viruses, which are then released into the environment. This sequence of events, the viral life cycle, is schematically discussed in Box 1. The aim of the traditional field of structural virology is to achieve a better understanding of the spatial organization of the proteins inside viral particles, an understanding that should increase our ability to develop more effective therapies.

Since the beginnings of structural virology as a field of study, physics has made important contributions. For several decades now, electron microscopy and X-ray diffraction imaging of crystals of viruses have provided high-resolution reconstructions of viral capsids with icosahedral symmetry1. The crystallographic analysis method developed by Donald Caspar and Aaron Klug in the 1960s (ref.2) provided us with a systematic classification of icosahedral capsids in terms of the so-called T-numbers (triangulation numbers), a method that has stood the test of time3. More recently, modern cryo-electron imaging methods, including tomography4,5 and asymmetric reconstructions6,7, have made it possible to reconstruct not only regular capsids, but also asymmetric protein distributions and the enclosed genome by using images of individual viruses.
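As a brief illustration of the Caspar–Klug classification, the allowed triangulation numbers follow T = h² + hk + k² for non-negative integers h and k that are not both zero, and an ideal capsid with triangulation number T contains 60T protein subunits. The short sketch below enumerates the first few values; the function name and the cut-off are illustrative choices, not part of the original analysis.

```python
import math


def triangulation_numbers(t_max):
    """Return the sorted Caspar-Klug T-numbers T = h^2 + h*k + k^2 with T <= t_max."""
    limit = math.isqrt(t_max)  # h and k cannot exceed sqrt(t_max)
    values = set()
    for h in range(limit + 1):
        for k in range(limit + 1):
            t = h * h + h * k + k * k
            if 0 < t <= t_max:  # exclude h = k = 0
                values.add(t)
    return sorted(values)


t_numbers = triangulation_numbers(13)
# Subunit count of an ideal icosahedral capsid: 60 copies per T
subunits_per_capsid = {t: 60 * t for t in t_numbers}
```

In this counting, a T = 3 shell such as that of CCMV has 180 subunits and a T = 4 shell such as that of HBV has 240.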

These important low-temperature imaging methods are limited to the study of static viral structures. However, evidence accumulated towards the end of the twentieth century that viruses and viral capsids are in fact dynamical structures that should be viewed as active ‘nanomachines’. When a virus assembles, a highly dynamical and disorganized state of individual capsid proteins in solution progressively turns into a more ordered, collective multi-protein state of partially and eventually fully closed shells, the viral capsid. Yet these closed capsids are still quite dynamic. Studying the dynamics of viruses called for a new toolbox, in terms of both new probes and new methods of analysis and numerical modelling. We first review some of the dynamic methods that are being applied to study the assembly of empty viral capsids, then focus on the role of genome molecules (RNA/DNA) during assembly, followed by a discussion of studies of the steady-state dynamics of assembled viruses. Box 2 summarizes several experimental techniques that have been used to study viral dynamics in the past decade. In an earlier article3, we reviewed work on the equilibrium physics of viral assembly under static conditions and on the mechanical properties of viruses.

Viral self-assembly dynamics

In 1955, Heinz Fraenkel-Conrat and Robley Williams discovered that solutions containing tobacco mosaic virus (TMV) capsid proteins and single-stranded (ss) RNA molecules spontaneously produced infectious viral particles (or virions)8,9, a process now known as self-assembly. Many other viruses, both with rod-like helical shapes (such as TMV) and with sphere-like icosahedral shapes10,11,12, were found to self-assemble under in vitro conditions. Thermodynamic studies led to the construction of instructive phase diagrams for viral assembly, with pH and ionic strength acting as thermodynamic control parameters. For instance, for the capsid proteins of TMV without its genome molecules, increasing ionic strength and reducing pH both serve to weaken the electrostatic repulsive interactions between the amphiphilic capsid proteins, thereby allowing the competing attractive hydrophobic interactions to overcome the repulsion and drive assembly13. TMV-like capsids appear under conditions of low pH. They do not form under physiological conditions, but adding the viral genome molecules to the solution does tilt the balance and produces infectious viruses (Fig. 1a). In comparison, the capsid proteins of the spherical cowpea chlorotic mottle virus (CCMV) form empty capsids at low pH and higher ionic strength, but concentric multi-shells at low pH and lower ionic strength14.

Fig. 1: Assembly of empty capsids.

a | Phase diagram of tobacco mosaic virus (TMV) assembly. The red dot signifies cellular conditions. b | Light-scattering experiments yield the Rayleigh ratio R as a function of time for increasing capsid protein concentration (bottom to top) during assembly of human papillomavirus (HPV) virus-like particles. The mass average molecular weight of assembled particles can be determined from R and the concentration of the solution. c | High-speed atomic force microscopy snapshots of reversible capsid lattice assembly for human immunodeficiency virus (HIV) (lower panels). Scale bar, 10 nm. The top image shows a reconstruction of seven hexamers; the centre one is highlighted within a green hexagon. d | Resistive-pulse sensing technique to study hepatitis B virus (HBV) assembly, including complete T = 3 and T = 4 capsids and assembly intermediates (pre-T = 4). T, triangulation number. (Δi/i): normalized pulse amplitudes, with the baseline current i and the adjusted pulse amplitude Δi. Upper panel, integrated counts; lower panel, individual data points. Part a adapted with permission from ref.13. Part b adapted with permission from ref.20. Part c adapted with permission from ref.43. Part d adapted with permission from ref.47.

When empty capsids are present in such a solution, they coexist with a certain concentration of single capsid proteins (monomers) or small protein groups (oligomers). Light-scattering studies15,16 have shown that the concentrations of capsid protein monomers or oligomers and of assembled capsids obey the law of mass action (LMA) of chemical thermodynamics, a direct consequence of the application of the second law of thermodynamics to multi-component solutions in thermodynamic equilibrium. This observation suggested that viral self-assembly could be understood as an equilibrium self-assembly process, similar to the formation of micelles in surfactant-rich solutions17. Insight into the dynamics of capsid assembly was mainly achieved in the past decade, typically involving more advanced probes. We discuss some of these studies, first for the case of assembly of empty capsids and then for the case of assembly of virions.
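The law of mass action mentioned above can be made concrete with a small numerical sketch. It assumes a two-state picture (only free subunits and complete capsids of n_sub subunits) and expresses all concentrations in units of a reference concentration c*, so that the capsid concentration equals x^n_sub when the free-subunit concentration is x. The value n_sub = 12 and the function names are illustrative assumptions, not values taken from the studies cited.

```python
def free_subunit_concentration(phi, n_sub, tol=1e-12):
    """Solve x + n_sub * x**n_sub = phi for the free-subunit concentration x.

    phi is the total subunit concentration; both are in units of c*.
    The left-hand side increases monotonically with x, so bisection suffices.
    """
    lo, hi = 0.0, phi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + n_sub * mid ** n_sub > phi:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def capsid_fraction(phi, n_sub):
    """Fraction of all subunits that sit in complete capsids at equilibrium."""
    return 1.0 - free_subunit_concentration(phi, n_sub) / phi


fraction_below = capsid_fraction(0.5, 12)   # total concentration below c*
fraction_above = capsid_fraction(10.0, 12)  # total concentration well above c*
```

Under these assumptions, almost no capsids form below c*, whereas well above c* the free-subunit concentration saturates just under c* and nearly all additional protein ends up in capsids, which is the same behaviour as the critical micelle concentration of surfactants.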

Assembly dynamics of empty capsids

The first dynamical studies of the assembly of capsids used time-dependent light scattering and turbidity measurements to probe bulk solutions of capsid proteins18. Capsid assembly was initiated by a reduction of the pH or an increase in the salinity15,19. Through this approach, it was found that capsid assembly of human papillomavirus (HPV) typically starts after a certain lag time (Fig. 1b)20. This lag time is understood to be the formation time of a nucleation complex, followed by a ‘downhill’ protein-by-protein extension (or ‘elongation’) process that leads to completed capsids. In the limit of late times, the dependence of the capsid concentrations on the total protein concentration was consistent with the LMA of equilibrium thermodynamics.

Within the classical theory of nucleation and growth21, the nucleation complex corresponds to the critical nucleus, or transition state — that is, a maximum in the free energy of a protein cluster as a function of the number of proteins. The number of proteins or protein groups (capsomers) that constitute the nucleation complex can be determined as the slope of the log of the concentration of complete capsids as a function of the log of the concentration of free proteins measured at a given instant. For the case of HPV, such an analysis suggested that the nucleation complex is a dimer of pentamers20. Note that even hours after initiation the assembly process still is not fully complete, even though the initial lag time is in the range of minutes (Fig. 1b). An example of single-particle imaging of assembly intermediates was provided by a combined electron microscopy and atomic force microscopy (AFM) study of the assembly of the minute virus of mice (MVM)22 that provided insight into the nature of the transient assembly intermediates that interpolate between the monomer and capsid state. These experiments demonstrated the importance of single-particle approaches.
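The slope analysis described above amounts to a straight-line fit in log–log coordinates. The sketch below shows the arithmetic on synthetic data generated with an exponent of 2, chosen only to mimic a two-capsomer nucleation complex; the numbers are made up for illustration and are not measurements from the HPV study.

```python
import math


def loglog_slope(free_conc, capsid_conc):
    """Least-squares slope of log(capsid concentration) versus log(free concentration)."""
    xs = [math.log(c) for c in free_conc]
    ys = [math.log(c) for c in capsid_conc]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Synthetic example (arbitrary units): capsid concentration scales as the
# square of the free-capsomer concentration.
free_capsomers = [1.0, 2.0, 4.0, 8.0]
capsids = [1e-3 * c ** 2 for c in free_capsomers]
nucleus_size = loglog_slope(free_capsomers, capsids)
```

With real data the fitted slope would carry experimental uncertainty and would usually be rounded to the nearest plausible integer number of capsomers.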

According to the equilibrium self-assembly theory of viral capsids16, capsid assembly is initiated at a critical protein aggregation concentration (the CAC). At the CAC, the critical nucleus is roughly half of a complete shell, which is much larger than the experimentally observed nucleation complexes. Theoretically, the size of a nucleation complex should decrease with increasing levels of supersaturation21. So, according to nucleation-and-growth theory, the dynamical experiments are best understood as taking place under conditions of high supersaturation, that is, far from thermodynamic equilibrium. But if that is the case, then how can the equilibrium LMA apply? Moreover, when the protein concentration is reduced after capsid formation, the capsids do not spontaneously disassemble as would happen if the system really did obey the LMA.

Some insight into this strange interplay between equilibrium and non-equilibrium properties can be obtained from a simple model for the kinetics of capsid assembly23 in which the capsid is treated as a dodecahedron composed of 12 pentamers with sticky edges. The capsids assemble in a solution of pentamers. At each assembly step, a pentamer is added to a partial capsid at a location that minimizes the energy of the partial capsid. This leads to a minimum energy assembly pathway E(n) where the index n runs from 1 to 12. Mathematically, the assembly kinetics is similar to that of a one-dimensional (1D) random walk across a non-uniform energy landscape E(n). The cluster size distribution function C(n,t) is the probability that a pentamer is part of a pentamer cluster of size n. This function obeys the master equation24

\(\frac{\partial C(n,t)}{\partial t}=J(n-1,t)-J(n,t)\)

with the probability current J(n,t) given by

\(J(n,t)={k}^{+}(n)C(1,t)C(n,t)-{k}^{-}(n)C(n+1,t).\)

The first term describes the growth of a cluster by the addition of a pentamer from solution, with \(k^{+}(n)\) the on-rate. The second term describes the loss of a pentamer from a cluster, with \(k^{-}(n)\) the off-rate. The on- and off-rates are related to the energy landscape E(n) by the detailed balance condition:

\(\frac{{k}^{+}(n)}{{k}^{-}(n)}={c}_{0}{{\rm{e}}}^{-\beta (E(n)-E(n+1))}\)

where \(c_{0}\) is a constant of the order of the inverse volume of a pentamer and \(\beta = 1/k_{{\rm{B}}}T\) is the inverse temperature. If one assumes that assembly is diffusion-limited, then the on-rate is a constant \(k^{+}\) that is independent of n and proportional to the pentamer diffusion coefficient and the pentamer size. The detailed balance condition then provides an expression for \(k^{-}(n)\). The boundary condition for the solution of the master equation at n = 1 is \(J(1,t)=\Gamma C(1,t)^{2}-k^{-}(2)C(2,t)\), with Γ the nucleation rate. If thermal disassembly of a completed capsid is forbidden, then the second boundary condition is that J(12,t) = 0, known as an absorber boundary condition. The initial condition is that C(1,0) = ϕ, with ϕ the total pentamer concentration, while C(n,0) = 0 for n > 1.

Through mathematical and numerical analysis, it can be shown that this model qualitatively reproduces the capsid assembly kinetics shown in Fig. 1b (refs23,24). The dynamics has the character of a shock front of assembly intermediates that propagates in configuration space from the two-pentamer nucleation complex to the assembled capsid. The arrival time of the shock front at n = 12 corresponds to the lag time measured experimentally. It also can be demonstrated that for later times the ratio C(12,t)/C(1,t) resembles a version of the LMA, the ‘quasi-LMA’. Finally, for high nucleation rates Γ, at late times a state is produced with a large number of incomplete capsids, consistent with experimental reports.
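A minimal numerical sketch of this type of master equation is given below. It is not the implementation used in the cited work and rests on several simplifying assumptions made here for illustration: C(n,t) is treated as the concentration of clusters of size n, the monomer equation is closed by mass conservation (the sum of nC(n,t) stays equal to ϕ), the on-rate and off-rate are both constants rather than being derived from a specific landscape E(n), completed capsids never lose a pentamer, and all parameter values are dimensionless and arbitrary. Time stepping uses the explicit Euler method.

```python
def simulate_assembly(t_end, phi=1.0, k_on=1.0, k_off=0.1, gamma=0.01,
                      n_max=12, dt=0.01):
    """Integrate the cluster-size master equation up to time t_end.

    Returns a list c with c[n] the concentration of clusters of n pentamers
    (index 0 is unused). All parameter values are illustrative assumptions.
    """
    c = [0.0] * (n_max + 1)
    c[1] = phi  # initial condition: free pentamers only
    for _ in range(int(round(t_end / dt))):
        j = [0.0] * (n_max + 1)  # j[n]: net flux from size n to size n + 1
        j[1] = gamma * c[1] ** 2 - k_off * c[2]  # nucleation of a two-pentamer complex
        for n in range(2, n_max - 1):
            j[n] = k_on * c[1] * c[n] - k_off * c[n + 1]
        # Last growth step has no back-flux: completed capsids do not disassemble.
        j[n_max - 1] = k_on * c[1] * c[n_max - 1]
        # j[n_max] stays zero: nothing grows beyond a complete capsid.
        new_c = c[:]
        # Mass conservation: nucleation uses two pentamers, each growth step one.
        new_c[1] -= dt * (2.0 * j[1] + sum(j[2:n_max]))
        for n in range(2, n_max + 1):
            new_c[n] += dt * (j[n - 1] - j[n])
        c = new_c
    return c


capsids_early = simulate_assembly(2.0)[12]
final_state = simulate_assembly(200.0)
capsids_late = final_state[12]
```

With these arbitrary parameters the concentration of complete capsids stays negligible at early times and rises only after the intermediates have had time to grow through all sizes, which is the qualitative origin of a lag time in this kind of model.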

The nature of the nucleation complex is of central importance for viral assembly, and it has been studied by different experimental methods. Native ion-mobility mass spectrometry experiments indicate that nucleation complexes of the norovirus and the hepatitis B virus (HBV) possess a five-fold symmetry axis25. Charge-detection mass spectrometry (CDMS) at the level of individual particles has been used to study the extension stage of the capsid growth, past the level of the critical nucleus. It was found that during the growth process the number of HBV proteins per particle exceeds the number of proteins in the capsid of the native T = 4 HBV virion26. In other words, late intermediates can possess a larger mass than the final closed capsid. The process of final annealing of these late intermediates with the shell shrinking to the final structure takes much longer than the initial assembly reaction. A similar case of an intermediate state with excess size is encountered for the assembly of CCMV27. One implication of these findings is that viral assembly is not just a simple protein-by-protein addition process but also involves large-scale reorganization processes of the capsid, a theme we will return to.

Kinetic trapping plays a prominent role in capsid assembly. Mutations of capsid proteins can generate kinetic traps either by suppressing assembly altogether or by producing aberrant capsid structures28,29. Kinetic trapping has also been observed in CDMS experiments on HBV assembly30. When assembly occurs in the presence of higher salt concentrations, electrostatic repulsion is weakened and thus assembly rates are increased. Such elevated assembly rates lead to kinetically trapped intermediates. Kinetic trapping and the formation of malformed capsids have been reproduced by numerical simulations of capsid assembly31,32,33. Kinetic trapping becomes progressively more prominent as the size of the capsid increases.

There is a useful analogy between kinetic trapping during viral assembly and kinetic trapping during protein folding. The total number of ways in which a polypeptide molecule can be folded together is far too large for all of them to be sampled by random thermal fluctuations, at least on reasonable timescales (Levinthal’s paradox)34. Similarly, there is an enormous number of ways that protein shells of various shapes and sizes can be assembled from protein monomers or oligomers. For the case of protein folding, natural selection is believed to produce kinetic assembly pathways that allow folding on reasonable timescales by avoiding kinetic traps35,36. This suggests that the capsid assembly process also may have evolved kinetic assembly pathways that avoid deep kinetic traps. Mutations that alter the assembly pathway can cause deep kinetic traps to reappear. Kinetic assembly pathways have been identified in HBV, by a combination of time-dependent small-angle X-ray scattering (SAXS) and umbrella Monte Carlo simulation with maximum entropy optimization37. Importantly, these kinetic assembly pathways appear only over narrow intervals of parameter space. Assuming that the native capsid is the lowest free-energy state, thermal annealing provides an escape route out of shallow kinetic traps through thermal activation, but thermal annealing is effective only for relatively weak protein–protein interactions. This suggests that protein–protein interaction energies should be in the range of no more than a few kBT (where T is temperature here) during initial assembly. Apart from HBV, the assembly of bacteriophage P22 procapsids and empty capsids of norovirus and CCMV has been studied by SAXS38,39,40, with relatively large on-path assembly intermediates reported. For norovirus, the intermediate assembly structure is a double pentamer of dimers, connected by one dimer39, whereas for CCMV, half-formed capsids with a lifetime of several seconds have been found40.

The human immunodeficiency virus (HIV) capsid consists of ~1,000–1,500 copies of the viral capsid protein41, and the self-assembly kinetics of a 2D lattice of this capsid protein has been examined by AFM42. Assembly starts at specific sites on the surface, followed by the growth of the lattice and fusion of the different patches, eventually producing full coverage of the surface. By increasing both the spatial and temporal resolution, it is possible to identify the nature of the nucleation complex and to follow assembly in real time43. The necessary increase in resolution has been achieved by using high-speed AFM (HS-AFM)44,45,46. The nucleation complex of the growing lattice was clearly identified as a capsid protein hexamer. Furthermore, it was shown how monomers, dimers and trimers of the capsid protein attach to, and detach from, the growing lattice (Fig. 1c). This direct visualization of the 2D assembly process revealed that the self-assembly of a lattice of capsid proteins takes place via multiple stochastic pathways.

Other new probes have also been used to study assembly. For example, single-particle resistive-pulse sensing was used to study how HBV assembly depends on ionic strength47, and complete capsids and assembly intermediates have been identified (Fig. 1d). Subsequently, this technique was applied to test the effect of antivirals that specifically target the capsid assembly of HBV48. For weak protein–protein interactions, the heteroaryldihydropyrimidine (HAP) antiviral agents induce the formation of aberrant capsid structures. However, strong protein interactions cannot be overcome by the HAPs, in which case the native T = 4 icosahedral capsid is formed. Such results show that understanding (and potentially interfering in) kinetic assembly pathways can have implications for the development of antiviral drugs. In particular, as there is a considerable structural similarity in the architecture of the capsid proteins between most icosahedral viruses (the ‘Swiss roll’ motif, also called the ‘jelly roll’ motif49), such interference could be a promising general ‘nanomedicine’ strategy for different viral families.

Capsid assembly around the viral genome

The assembly of a virion, the fully infectious viral particle containing the viral genome molecule(s), is a more delicate enterprise than the assembly of an empty capsid. In the infected cell, the viral genome molecules are surrounded by host genome material, and only the virus genome should be packaged in the viral shell. The associated search process to locate the viral DNA or RNA is often carried out by the capsid proteins, in order to specifically package this viral genome. The assembly of most double-stranded (ds) DNA and RNA viruses involves the insertion of viral genome molecules into pre-assembled, empty procapsids. This insertion is an active process driven by a powerful rotary molecular motor that derives its energy from ATP (adenosine triphosphate) hydrolysis50,51. The viral DNA molecules are marked by an enzyme (terminase) that becomes a component of the packaging motor. Well-studied examples of this process are provided by the dsDNA viruses that infect bacteria (bacteriophages) and by some animal viruses such as the herpesviruses and adenoviruses50,51. The assembled virion is characterized by large osmotic pressures (in the range of 10 atm) exerted by the highly compacted genome on the capsid wall52. This pressure probably plays an important role in the life cycle of the virus by assisting the release of genome molecules into host cells. Both dsDNA insertion and release have been successfully replicated under laboratory conditions. Packaging studies of bacteriophage DNA have been done with optical tweezers51,53,54,55, whereas genome release studies of archaeal, prokaryotic and eukaryotic viruses used bulk and other single-molecule approaches52,56,57,58.

Apart from active packaging, the recruitment of genome molecules can also occur as a passive co-assembly process, driven by a form of chemical affinity between genome molecules and capsid proteins. In this process, there is no build-up of large osmotic pressures. It is the prevalent assembly process for viruses with ss genome molecules as well as for some dsDNA viruses such as simian virus 40 (SV40). Co-assembly was first studied in TMV, leading to the suggestion that ssRNA genome molecules can act as assembly templates. Whereas TMV capsid proteins can form oligomers when pH and salinity are approximately at physiological levels, electrostatic repulsion is, under such conditions, too strong to allow formation of empty capsids. However, the association of negatively charged RNA nucleotides with the positively charged amino acid residues that line the interior of disk-like protein oligomers weakens the electrostatic repulsion and tips the free energy balance towards assembly. Stacking the TMV protein disks on top of each other produces a helical capsid with the RNA molecule incorporated in the capsid interior (Fig. 2a). The length of the RNA molecule acts as a caliper that determines the length of the virus. For icosahedral viruses, genome length also can alter particle size, as exemplified by later studies on SV4059 and CCMV60. In both cases, two different capsid sizes were observed, depending on the length of the encapsidated genome. These examples show that the RNA genome cargo both aids viral self-assembly and influences the outcome of assembly.

Fig. 2: Assembly around a genome.

a | Tobacco mosaic virus (TMV) capsid proteins (CP) assemble into A-proteins and subsequently disks. By insertion of the RNA (black thread in main image; red in inset) into a protein disk, a conformational change to a helical ‘lock-washer’ configuration occurs. The virion grows by the sequential addition of protein disks and A-proteins. b | Simulated assembly around a genome (red) following an en masse (upper) or nucleation-and-growth pathway where the genome acts as an ‘antenna’ (lower). c | Assembly traces of individual MS2 bacteriophage particles for different capsid protein concentrations, recorded by interferometric scattering microscopy. Inset: images of a single assembling particle. d | Fluorescence optical tweezers measurements of DNA packaging by synthetic capsid proteins. Progressive virus-like particle (VLP) assembly decreases the DNA end-to-end distance and pulls the beads together (left panel). Scale bar, 2.5 µm. The fluorescence intensity profiles reflect the increase in bound polypeptides (right panel). e | Acoustic force spectroscopy data reveal the decrease in DNA contour length during assembly of simian virus 40 (SV40) VLPs. f | Electron micrograph (top) and corresponding schematic (bottom) of cowpea chlorotic mottle virus (CCMV) assembly around brome mosaic virus (BMV) genome. The images show (left to right): RNA without protein; decrease in size of the complex after adding capsid protein to the RNA; capsid formation after a reduction in pH. g | Time evolution of mean number of subunits 〈N〉 (black circles) and radius of gyration Rg (grey circles) after mixing of CCMV capsid proteins and genome at a mass ratio ρ of 6:1. The dashed lines are decay functions for the binding time τbind (red) and the structural relaxation time τstruc (blue). Error bars indicate standard error of the mean. Part a adapted with permission from ref.167 and T. Splettstößer. Part b adapted with permission from ref.67. Part c adapted with permission from ref.70. Part d adapted with permission from ref.72. Part e adapted with permission from ref.73. Part f adapted with permission from ref.69 and ref.168. Part g adapted with permission from ref.77.

The proper assembly of a TMV particle requires that viral RNA molecules are distinguished from host RNA molecules, such as messenger RNA (mRNA) molecules. Packaging signals are short, evolutionarily conserved RNA sequences with specific affinity for the capsid proteins of a virus. The initiation of virion assembly starts when capsid proteins bind to the packaging signal(s) of a viral RNA molecule. The subsequent growth process (also known as elongation) may involve additional packaging signals, as is the case for the well-studied MS2 bacteriophage, but it can also be driven entirely by the generic electrostatic affinity of the negatively charged ssRNA molecules for positively charged amino acid residues of capsid proteins. An example of the latter case is TMV, which has only a single packaging signal of 20 nucleotides.

The combination of specific and nonspecific protein–genome affinity with the hydrophobic affinity between capsid proteins drives the co-assembly process. Box 3 summarizes common assembly models. Below, we discuss generic aspects of the co-assembly process and then virus-specific aspects.

Non-specific assembly

The capsid proteins of CCMV assemble around non-genomic ssRNA molecules without viral packaging signals. Furthermore, CCMV capsid proteins even assemble around negatively charged polyelectrolytes61,62, forming virus-like particles (VLPs). CCMV is thus a suitable ‘laboratory’ in which to study how non-specific interactions can drive viral assembly. It is possible to carry out ‘packaging competition’ experiments on CCMV to see whether different types of RNA molecules are packaged more readily than others27. As might be expected from packing considerations and considerations based on the electrostatics of capacitive charging and charge neutralization, there is an optimal size for the packaging of RNA molecules in a capsid of a given size. For the T = 3 wild-type CCMV particles, this optimal size is, for instance, ~3,200 nucleotides. Unexpectedly, the ‘alien’ RNA genome of brome mosaic virus (BMV) was packaged by CCMV capsid proteins with threefold higher efficiency than the RNA molecules of CCMV itself, even though the two RNA molecules have virtually identical lengths. This result suggested that the secondary structure of an ssRNA molecule influences in some manner the non-specific packaging efficiency. This conjecture was nicely confirmed by packaging competition experiments carried out on polyU and polyA RNA chains of the same length but with no secondary structure63. The negative charges of polyU and polyA closely compensate the positive charges of the capsid protein tails, in contrast with the wild-type CCMV which shows pronounced electrostatic overcharging64,65. The secondary structure of virus genomes is much more branched than that of randomized variants66. The degree of branching can be expressed quantitatively through the maximum separation between two nucleotides of the secondary structure66, known as the maximum ladder distance.
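To make the notion of a maximum ladder distance concrete, the sketch below computes it for a secondary structure written in dot-bracket notation. It assumes that the ladder distance between two nucleotides is counted as the number of base pairs crossed along the shortest path through the structure, so that the maximum ladder distance is the diameter of the tree whose nodes are loops and whose edges are base pairs. The input format, the function name and the toy structures are choices made here for illustration and are not taken from the cited work.

```python
def maximum_ladder_distance(dot_bracket):
    """Largest number of base pairs crossed between any two nucleotides.

    The structure must be a pseudoknot-free dot-bracket string using only
    the characters '(', ')' and '.'.
    """
    # Each stack entry stores the two largest branch depths (in base pairs)
    # found so far below one loop; the first entry is the exterior loop.
    stack = [[0, 0]]
    diameter = 0
    for symbol in dot_bracket:
        if symbol == "(":
            stack.append([0, 0])
        elif symbol == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced structure")
            best, second = stack.pop()
            # Longest path that turns around inside the loop just closed
            diameter = max(diameter, best + second)
            depth = best + 1  # crossing the closing base pair adds one rung
            parent = stack[-1]
            if depth > parent[0]:
                parent[0], parent[1] = depth, parent[0]
            elif depth > parent[1]:
                parent[1] = depth
        elif symbol != ".":
            raise ValueError("unexpected character: " + symbol)
    if len(stack) != 1:
        raise ValueError("unbalanced structure")
    best, second = stack[0]
    return max(diameter, best + second)


# Toy structures with six base pairs each
mld_linear = maximum_ladder_distance("((((((....))))))")    # one long hairpin
mld_branched = maximum_ladder_distance("((((..))((..))))")  # stem with two branches
```

For the same number of base pairs, the branched toy structure has the smaller maximum ladder distance, which is the sense in which this quantity measures how compact a branched structure is.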

Obtaining information about the detailed sequence of events of virus assembly is an experimental challenge. Before turning to the methods that are being used, we first discuss the outcomes of numerical simulations of the assembly dynamics of simplified coarse-grained models, which act as an aid to interpreting the experiments. For instance, simulations of assembly of a dodecahedral capsid composed of 12 pentagonal pentamers67 that have an affinity for a linear genome molecule exhibit different assembly sequences depending on the capsid–genome interactions (Fig. 2b). The ‘antenna’ scenario is an orderly assembly sequence that resembles the nucleation-and-growth scenario of empty capsids. The genome molecule is attracted to the interior of the partially formed capsid. The part of the genome that is not yet packaged acts as an ‘antenna’ for the diffusive influx of capsomers from infinity (Fig. 2b, lower panel, and Box 3)68. By contrast, in the ‘en masse’ scenario, a disordered protein/genome condensate forms before the assembly of the particle. In effect, the condensate provides a local ‘milieu’ with high protein concentration that aids assembly. This scenario appears for increasing attraction strength between the capsid proteins and the genome (Fig. 2b, upper panel, and Box 3). Theoretical work on RNA-directed assembly of small viruses indicates that such a condensate can even be a feature of the equilibrium assembly phase diagram69.

Although bulk studies have difficulty distinguishing between these different assembly scenarios, a number of promising single-particle probes have been developed. One approach is based on interferometric scattering microscopy, in which interference between light scattered from a particle and reflection from a nearby surface provides information about the size of that particle. The size evolution then can be followed with high temporal resolution. This technique has been used to study the assembly of individual phage MS2 particles around surface-tethered ssRNA molecules (Fig. 2c)70. The tethering surface provides the reflected wave required for the technique, and the scattering intensities give the growth curves of individual particles. For a capsid protein dimer concentration of 1.5 µM, most scattering intensities approach a value consistent with that of a full capsid. In addition, a few reached significantly larger values. Recall that assembly intermediate states with an excess number of capsid proteins are also encountered for the assembly of empty HBV and CCMV capsids26. The lag time between assembly initiation and capsid growth has a strikingly stochastic character, which suggests that assembly initiation requires thermal activation over a free-energy barrier that is substantially larger than the thermal energy. Note that once assembly starts, it continues to completion with little disassembly. Conversely, for 4 µM concentrations, the lag time has little stochasticity, indicating a lower activation barrier. Most assemblies have excess protein material in this case. These results support the nucleation-and-growth picture with an activation energy barrier that decreases with increasing levels of supersaturation. Although no specific packaging signals were identified by these experiments, an assembly sequence in which packaging signals mediate assembly (Box 3) was proposed on the basis of the literature and the observed assembly pathway70,71.

In a different single-particle approach, a dsDNA strand was stretched between two beads in a double optical tweezers set-up72 and then exposed to a solution of fluorescently labelled synthetic polypeptide strands, which played the role of model capsid proteins. Different assembly stages of the resulting VLP were visualized in real time with millisecond temporal resolution (Fig. 2d). Smaller oligomers moved diffusively along the DNA strand during the initial stages of assembly, as expected in the antenna picture, but pentameric or larger oligomers were stably attached to the DNA. Fitting the observations to a kinetic version of Langmuir adsorption theory led to an estimation of the effective binding free energy of pentamers to the DNA of ~25 kBT. To examine the elongation stage following nucleation, acoustic force spectroscopy (AFS) was used. The results indicated formation of a helical assembly with a repeat period of 30 nm, consistent with the observed packaging ratio. The stretched DNA strand functioned in this experiment as an antenna for catching polypeptides (Box 3). The same approach has been used to examine the assembly of VLPs of SV40 around dsDNA73. SV40 has a T = 7 icosahedral structure composed of pentameric capsomers, and packages both viral and heterologous dsDNA74,75. In addition, SV40 assembles around ss genomes, for which previously a two-step process was observed by SAXS76. A combination of optical tweezers, AFS and AFM was used to reveal a multi-step assembly mechanism around dsDNA. As in the previous example, the stretched DNA strand functioned as an antenna catching diffusing proteins, followed by formation of DNA-associated protein clusters. The interaction of capsid proteins with each other and with the DNA was sufficient to overcome the tension of the DNA strand, resulting in genome packaging, as revealed by a decrease in DNA contour length during assembly (Fig. 2e). For the future, such experiments open the prospect of quantitative determination of the assembly kinetics by measuring the time dependence of the forces exerted on two optical traps.
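To make the relation between Langmuir adsorption kinetics and a binding free energy concrete, the following sketch uses the textbook single-site Langmuir rate equation. It is an illustration under stated assumptions, not the fitting procedure of the cited study: the rate constants, the protein concentration and the 1 M reference concentration are hypothetical, and the off-rate is chosen so that the binding free energy comes out at 25 kBT.

```python
import math


def langmuir_coverage(t, conc, k_on, k_off):
    """Site occupancy theta(t) for d(theta)/dt = k_on*conc*(1 - theta) - k_off*theta.

    Analytic solution with theta(0) = 0; conc in M, k_on in 1/(M s), k_off in 1/s.
    """
    k_obs = k_on * conc + k_off          # observed relaxation rate
    theta_eq = k_on * conc / k_obs       # equilibrium (Langmuir isotherm) occupancy
    return theta_eq * (1.0 - math.exp(-k_obs * t))


def equilibrium_coverage(conc, k_on, k_off):
    """Langmuir isotherm: theta = conc / (conc + K_d) with K_d = k_off / k_on."""
    return k_on * conc / (k_on * conc + k_off)


def binding_free_energy_kT(k_on, k_off, c_ref=1.0):
    """Magnitude of the binding free energy in units of k_B T.

    Uses K_d = c_ref * exp(-dG / k_B T); c_ref = 1 M is an assumed reference state.
    """
    return math.log(c_ref * k_on / k_off)


# Hypothetical rate constants, chosen so that the free energy equals 25 k_B T.
k_on = 1.0e6                      # 1/(M s), assumed
k_off = k_on * math.exp(-25.0)    # 1/s, follows from the assumed free energy

print(binding_free_energy_kT(k_on, k_off))
print(equilibrium_coverage(1.0e-9, k_on, k_off))
```

In a real analysis, k_on and k_off would be obtained by fitting measured occupancy curves to `langmuir_coverage`, and the free energy would follow from their ratio.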

In these two examples, capsid assembly was initiated at a specific instant of time by a rapid change in solution conditions. An alternative method is to change solution conditions sufficiently slowly that the system remains near thermodynamic equilibrium during the assembly process. For the case of CCMV (Fig. 2f), the capsid proteins have positively charged, disordered tail groups with generic electrostatic affinity for the negatively charged RNA. By using BMV genome molecules in a solution with CCMV capsid proteins, all effects due to CCMV-specific packaging signals can be avoided69. The branched viral RNA molecules, which are highly charged and swollen when free in solution, reduce in size when CCMV capsid proteins are added to the solution (Fig. 2f, second panel). At high pH — that is, for weakened affinity between capsid proteins — disordered nucleoprotein condensates are formed, consistent with self-assembly theory69 and simulations of assembly67. The disordered condensates transform into ordered VLPs after reduction of the pH.

The protein-to-RNA concentration ratio is a critical parameter for the assembly ‘phase diagram’ of CCMV. This ratio must be carefully distinguished from the dependence of the assembly kinetics of MS2 on the absolute capsid protein concentration that was discussed earlier. The optimal mixing ratio is defined as the minimum value of the protein-to-RNA concentration ratio for which all of the RNA molecules are packaged. One would expect the optimal mixing ratio to equal the stoichiometric protein-to-RNA ratio of CCMV virions, but in fact it was found to be considerably larger, corresponding to excess capsid proteins and reduced electrostatic overcharging. The final assembly step requires shedding of these excess proteins, as discussed above. The scenario of an excess of genome-bound proteins in the first phase of assembly fits well with the en masse pathway (Box 3) that SAXS revealed for the initial steps in CCMV assembly around its genome (Fig. 2g)77. Interestingly, these latter experiments also showed that final closure of the shell seems to occur through an activated process from a disordered condensate to an ordered closed particle.

Packaging signals

As discussed above, the assembly of ssRNA viruses is initiated and guided by genomic ssRNA molecules that have both a specific and a non-specific affinity for the capsid proteins. An important example of ssRNA viruses is HIV, which has a single packaging signal at the 5′ end of the viral RNA molecules, known as the Ψ sequence. It seems to act only on the nucleation step, not on the subsequent growth78,79 (Fig. 3A). This RNA sequence adopts an intricate structure with several high-affinity binding sites for the nucleocapsid domain of the Gag protein80,81. If the Ψ sequence is removed, then cellular RNA material is packaged, driven by generic electrostatic affinity between RNA and the positively charged nucleocapsid domain of the Gag capsid protein82. The assembly thermodynamics of these VLPs is essentially the same as that of HIV virions, indicating that genome selection during HIV assembly is a purely kinetic process.

Fig. 3: Packaging signals.

A | HIV packaging efficiency of RNA containing the packaging signal Ψ, compared with RNA without Ψ (control). B | Asymmetric structure of the genome of bacteriophage MS2. C | Schematic of MS2 assembly: A-protein (AP) attaches to RNA and to capsid protein dimers (CP2) (panel Ca); additional CP2 binds and capsid formation starts (panel Cb); CP2 is not only recruited by existing RNA stem-loops (SLs) but also triggers SL formation (panel Cc); further condensation of the RNA and closed-shell formation (panel Cd); formation of a stable virion (panel Ce). D | Hamiltonian path analysis. Reconstruction of RNA inside MS2 (panel Da); the RNA shell as polyhedral cage (panel Db); 3D view of the Hamiltonian path (panel Dc); planar representation of the same Hamiltonian path with quasi-equivalent MS2 capsid subunits (panel Dd). Part A adapted with permission from ref.79. Part B adapted with permission from ref.86. Part C adapted with permission from ref.86. Part D adapted with permission from ref.88.

Dynamical observations of assembly are challenging, but it may be possible to gather information about the assembly history of a virion from the structure of the encapsidated genome. Cryo-electron microscopy (cryo-EM) combined with icosahedral averaging allows for high-resolution structural studies of the genome density inside icosahedral capsids83,84. However, a consequence of the averaging approach is that only those parts of the genome density that possess icosahedral symmetry can be imaged. In most cases, only a small fraction of the genome of icosahedral ssRNA viruses displays icosahedral symmetry. An interesting exception is the family of the Nodaviridae, in which a notable amount of the ssRNA genome obeys icosahedral symmetry in the form of double-stranded sequences lining the capsid edges85. However, it has also become possible to visualize the genomes of individual viruses by asymmetric reconstruction, which has been performed for BMV7 and MS2 (Fig. 3B)86.

A large fraction of the MS2 genome is in the form of double-stranded sequences organized in a pattern that has little icosahedral symmetry. A number of the double-stranded sequences are stem-loops closely associated with protein dimers located along one half of the protein capsid. A natural explanation for this organization is that a specific sequence of stem-loops of the secondary structure of the genome molecule in solution associates with protein dimers during the early stages of assembly of the virion. Once half of the capsid has formed, the remainder ‘snaps together’ more rapidly, forcing additional condensation that transforms ssRNA sequences into double-stranded material86 (Fig. 3C). A study identified 80% of the MS2 genome material inside the virion6 and showed that specific stem-loops of the secondary structure of the MS2 genome are associated with specific capsid protein dimers. A geometrical method for reconstructing the MS2 assembly history has been proposed87,88 (Fig. 3D). It starts from the cryo-EM reconstruction of the RNA density, which is then represented as a cage of lines with icosahedral symmetry. A path is constructed on this cage that visits each vertex exactly once; such a path is known as a Hamiltonian path89. Such a path no longer has icosahedral symmetry. Through a geometrical construction, the Hamiltonian path can be translated into an assembly sequence. This form of assembly, with a dispersed set of packaging signals guiding the process90, should be contrasted with that of TMV, BMV and the retroviruses, in which packaging signals only initiate assembly. Indeed, reconstruction of the BMV genome indicates a much weaker correlation between capsid and genome structures7. It would be interesting to compare measurements of the kinetics of virion assembly with kinetic models of the kind discussed for the case of empty capsids, but no such model is yet available. To support the development of such a model, systematic experimental comparisons of the assembly kinetics of empty capsids with that of virions would be very useful.
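The notion of a Hamiltonian path on a polyhedral cage can be illustrated with a short search routine. The sketch below is a minimal example under simplifying assumptions: it uses a plain regular icosahedron as a stand-in for the RNA cage (the cage reconstructed for MS2 is a different polyhedron) and finds one Hamiltonian path by depth-first backtracking. It does not reproduce the geometrical construction of the cited work, and the function names are illustrative.

```python
import itertools
import math


def icosahedron_graph():
    """Return (vertices, adjacency) of a regular icosahedron with edge length 2."""
    phi = (1 + math.sqrt(5)) / 2  # golden ratio
    verts = []
    for a, b in itertools.product((-1, 1), repeat=2):
        # Cyclic permutations of (0, +-1, +-phi) give the 12 vertices.
        verts += [(0, a, b * phi), (a, b * phi, 0), (b * phi, 0, a)]
    adj = {i: set() for i in range(len(verts))}
    for i, j in itertools.combinations(range(len(verts)), 2):
        # Nearest neighbours sit at the edge length 2.
        if abs(math.dist(verts[i], verts[j]) - 2.0) < 1e-9:
            adj[i].add(j)
            adj[j].add(i)
    return verts, adj


def hamiltonian_path(adj, start=0):
    """Find a path that visits every vertex exactly once, or return None."""
    n = len(adj)
    path = [start]
    visited = {start}

    def extend():
        if len(path) == n:
            return True
        for nxt in sorted(adj[path[-1]]):
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                if extend():
                    return True
                # Dead end: undo the step and try the next neighbour.
                visited.remove(nxt)
                path.pop()
        return False

    return path if extend() else None


verts, adj = icosahedron_graph()
path = hamiltonian_path(adj)
print(path)
```

Because many distinct Hamiltonian paths exist on such a cage, additional information, such as which stem-loops contact which capsid proteins, is needed to single out a candidate assembly sequence.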

Dynamics of closed shells

The initial assembly of a capsid often is only the first step in a shorter or longer process of dynamical development known as maturation. Not all viruses undergo maturation, but a wide variety of larger viruses do mature after assembly. Maturation involves conformational changes of the capsid proteins that either strengthen the capsid or prime the virion for genome release. It may have an evolutionary character, as exemplified by the sequence of conformational changes of the Escherichia coli phage HK97, or the character of a striking capsid metamorphosis, as exemplified by HIV-1. Apart from the irreversible dynamics of the maturation process, capsids in thermal equilibrium are also subject to reversible thermal shape fluctuations and soft modes.

Viral maturation

Our first example of maturation is the family of retroviruses, which, at least initially, have ssRNA genomes. Retroviruses assemble either inside the cytoplasm of an infected cell or on the plasma membrane (hereafter PM)91. In either case, they acquire a lipid envelope when they bud from the PM92. The resulting spherical immature particle is not yet infectious. This immature particle undergoes a conformational transformation induced by proteolytic cleavage of the Gag structural proteins of the capsid, resulting in the formation of a new, smaller, capsid surrounded by a spherical lipid–protein membrane93,94. The mature capsid itself may be cone-like, in the case of HIV-1, or spherical or sphero-cylindrical for other retroviruses. The mature virion is infectious. After its membrane proteins bind to host cell receptors (for instance, CD4 receptors of a T cell for HIV infection), the outer membrane fuses with the host cell membrane93,95,96. This fusion occurs either at the cell surface or in intracellular vesicles, and results in the release of the mature capsid into the cytoplasm of the cell.

Maturation can be promoted or inhibited in various ways. Structural and dynamic studies using solution nuclear magnetic resonance (NMR) and solid-state NMR have shed light on proteolytic processing of Gag97 and the effect of maturation inhibitors98. Furthermore, cryo-electron tomography that included subtomogram averaging99 was used to study Gag regions that are targeted by maturation inhibitors100. Besides stimulating assembly of HIV particles, inositol phosphate, a compound that is present in all mammalian cells, also promotes maturation101. By shielding repulsive charges, it first aids structural rearrangements of Gag, and during maturation it stimulates hexamer formation.

Approaches focusing on the mechanical properties of retroviral capsids using AFM nanoindentation102,103,104,105,106 have further elucidated the physics of maturation. The retrovirus Moloney murine leukaemia virus undergoes a marked softening of the shell during maturation107, as expected in view of the proteolytic cleavage of the capsid proteins. A similar softening of the shell occurs for HIV-1 (refs108,109). A second marked change in the mechanical properties of the capsid takes place during infection. After entry into the new host cell, the very flexible ssRNA genome inside the capsid is transformed by the co-packaged reverse transcriptase enzyme into a much stiffer dsDNA genome. The necessary nucleotides for reverse transcription enter the capsid through dynamic, size-selective pores that are present in the centres of the hexamers110. The mechanical stress exerted by the newly formed enclosed dsDNA on the mature capsid causes it to rupture, allowing release of the genome into the cell. The ruptured capsids have been visualized both by AFM and electron microscopy111 (Fig. 4a). Under in vivo conditions, this rupture probably takes place after the mature capsid has docked on a nuclear pore complex. Softening and rupture of the capsid are key to successful infection of the host cell by the viral particles: this reveals that there is a direct link between capsid mechanics and the infectivity of retroviruses112.

Fig. 4: Closed-shell dynamics.

a | Atomic force microscopy (AFM) image of human immunodeficiency virus (HIV) capsid with opening (rectangle) due to reverse transcription. Inset: electron microscopy image of a different particle with an opening. Scale bars, 50 nm. b | Phages HK97 (top left) and λ (bottom left) strengthen the same parts of their icosahedral shells, as visualized by these reconstructions highlighting the differences between the two particles. AFM nanoindentation curves (right panels) show the difference in mechanical response of HK97 Prohead II and Head II. Error bars indicate standard error of the mean. c | Schematic of the effect of maturation in adenovirus. d | Cowpea chlorotic mottle virus (CCMV) nanoindentation in silico, showing the first two collective excitation modes (black arrows). Pentamers are shown in blue and hexamers in red. e | Phase diagram of capsid soft modes with the dimensionless temperature β−1 as a function of the Föppl–von Kármán number γ. Vertical blue bars indicate the regions of phase coexistence of solid and molten states. Part a adapted with permission from ref.111. Part b adapted with permission from refs115,118. Part c adapted with permission from ref.123. Part d adapted with permission from ref.139. Part e adapted with permission from ref.152.

Whereas the capsids of retroviruses soften during maturation, bacteriophages undergo a maturation transition that strengthens their capsids113. Recall that, on the one hand, the capsids of phage viruses must withstand very large osmotic pressures exerted by the enclosed dsDNA genome. On the other hand, during assembly the interactions between capsid proteins must be sufficiently weak to allow for the thermal annealing of kinetic traps. Phage viruses thus must transform their capsids from the initial, relatively unstable, procapsid into a very robust shell. As in the case of retroviruses, phage maturation involves a conformational rearrangement of the shell. For instance, the bacteriophage λ and HK97 virus initially assemble as ~55-nm-diameter empty shells that, through a sequence of stages, mature into ~65-nm-diameter genome-packed particles with a thinner shell. This maturation transition goes hand in hand with a strengthening of the capsid, as observed by AFM nanoindentation studies of phage λ (ref.114) and HK97 (ref.115). For HK97, capsid maturation is composed of a sequence of coordinated, discrete steps ending with a covalent interlocking network of bonds that resembles the chainmail of medieval soldiers. The basic structural motif of the capsid protein of HK97 that governs the maturation steps is shared with λ and other bacteriophages such as P22, as well as the eukaryotic herpesviruses116. In fact, these viruses share genetic motifs117. In the initial procapsid state, the capsid protein can be represented as a compressed spring, which relaxes during the swelling and thinning of the capsid. Interestingly, although the HK97 and λ phages have different pathways to strengthen their shells (covalent crosslinking in the case of HK97 and addition of gpD protein in the case of the λ phage), the icosahedral symmetry sites on the capsid that are strengthened are the same118 (Fig. 4b). Furthermore, the increase in mechanical capsid strength during maturation (shown in Fig. 4b for HK97) is similar, even though the covalent crosslinking that occurs during HK97 maturation renders a particle that can be deformed to a larger extent than phage λ before failure occurs115. The maturation of bacteriophages P22 and T7 also leads to an increase in stability, as revealed by nanoindentation119,120.

Not every dsDNA virus undergoes the same maturation process. The maturation of the dsDNA adenovirus is markedly different from that of dsDNA bacteriophages. Maturation of adenovirus involves proteolytic cleavage of a variety of viral precursor proteins121. Comparison of immature and mature particles revealed a clear difference in terms of mechanical response and DNA decondensation. An increase in flexibility of the DNA during maturation — and hence an increase in osmotic pressure — seems to be an essential condition for effective genome release122,123. In concert with DNA decondensation, the pentameric structures of the capsid are also destabilized during maturation124,125, presumably assisting the egress of DNA (Fig. 4c). By the successive action of host-cell integrin molecules, which serve as secondary receptors during adenovirus docking, the pentameric structures become loose enough to be released into the endosome126,127. In other words, whereas dsDNA bacteriophages are strengthened during maturation, adenoviruses become more flexible during maturation, reminiscent of retrovirus maturation.

Soft modes and conformational dynamics

Structurally ordered capsids are reminiscent of tiny crystals. Although viral maturation is associated with complex biochemical processes, it is worth comparing the maturation steps discussed above with the physical phenomena that occur when the structure of a crystal changes as a function of some thermodynamic control parameter, such as temperature. An important feature of such structural phase transitions is the appearance of soft modes128. These are collective modes whose energy vanishes in the limit of long wavelengths and that can be viewed as precursors of the structural transition. Near the transition point, the stiffness of these modes softens substantially. As discussed below, very similar soft modes have been encountered for viral capsids. A Ginzburg–Landau theory that was developed to describe structural transitions and soft modes in solids can be adapted directly to structural transitions of viral shells129. Soft modes of viral capsids are generally associated with specific conformational changes of the capsid protein. However, according to thin-shell elasticity theory, any icosahedral shell should show mode softening at the buckling transition130,131. At a buckling transition, the shape of a shell with icosahedral symmetry changes from spherical to polyhedral. According to this theory, the buckling transition and the response of a shell to mechanical deformation should be determined by just a single dimensionless number, the Föppl–von Kármán (FvK) number γ = YR²/κ, which is the ratio of the in-plane Young’s modulus Y of the shell, multiplied by the square of the shell radius R, to the out-of-plane bending modulus κ132. Soft modes associated with buckling have indeed been reported in molecular dynamics (MD) simulations of the capsid maturation of HK97 (ref.133).
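As a worked illustration of how the FvK number is used, the sketch below evaluates the standard thin-shell definition γ = YR²/κ, with Y the two-dimensional Young’s modulus, R the shell radius and κ the bending modulus, and compares it with a buckling threshold. The threshold of about 154 is the value commonly quoted in thin-shell theory for icosahedral shells and is treated here as an assumption, and the material parameters in the example are hypothetical rather than measured values for any particular virus.

```python
def fvk_number(youngs_modulus_2d, radius, bending_modulus):
    """Foeppl-von Karman number gamma = Y * R**2 / kappa (dimensionless).

    youngs_modulus_2d : in-plane (2D) Young's modulus Y in N/m
    radius            : shell radius R in m
    bending_modulus   : out-of-plane bending modulus kappa in J
    """
    return youngs_modulus_2d * radius ** 2 / bending_modulus


def predicted_shape(gamma, gamma_buckling=154.0):
    """Classify a shell relative to an assumed buckling threshold.

    gamma_buckling ~ 154 is a literature value from thin-shell theory,
    used here as an adjustable assumption.
    """
    return "faceted" if gamma > gamma_buckling else "spherical"


# Hypothetical material parameters, for illustration only.
Y = 0.5          # N/m
kappa = 1.0e-18  # J

for radius_nm in (14.0, 30.0):
    gamma = fvk_number(Y, radius_nm * 1e-9, kappa)
    print(f"R = {radius_nm} nm: gamma = {gamma:.0f}, shape: {predicted_shape(gamma)}")
```

Because γ grows with the square of the radius, shells made of the same material are expected to cross over from spherical to faceted as they become larger, in line with the qualitative picture of the buckling transition described above.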

The pioneering cryogenic X-ray and electron microscopy structural studies mentioned earlier reported that viral capsids were static crystallographic structures. However, more recent structural studies of capsids carried out under room-temperature conditions found that, in fact, capsids are dynamical structures even when they are not undergoing maturation134,135. An interesting case study is the swelling transition of CCMV. The radius of a CCMV capsid increases substantially when the pH is raised from 5 to 7 (ref.136), whereas its bending stiffness, which can be probed by AFM nanoindentation132, shows a marked reduction. Such a reduction of the bending stiffness of a spherical surface indicates that the relaxation rates of the various spherical harmonic modes of the surface should be reduced137. Reduced relaxation rates, or ‘mode softening’, have indeed been observed for CCMV138. The highly dynamical nature of the CCMV capsid is illustrated at a more local level by the fact that protease enzymes are able to selectively digest the N-termini of the capsid proteins of CCMV capsids135. This is surprising because low-temperature X-ray diffraction and cryo-EM studies136 reported that these N-termini are located in the capsid interior, and the proteases are too large to enter the capsid. The implication is that capsids can be much more dynamic than indicated by low-temperature structural studies.

The effect of both in-plane and out-of-plane soft modes on the capsid thermodynamics of CCMV has been probed quantitatively through the coupling of numerical simulations to AFM nanoindentation data139 (Fig. 4d). The results of MD simulations can also be compared with studies by Raman spectroscopy140. The equilibrium dynamics of the capsid of HBV has been examined in considerable detail by MD, and the results were used to reinterpret cryo-EM imaging of HBV capsids141. MD in combination with solid-state NMR or AFM has been used to study the effect of host factor protein binding on HIV-1 assemblies, revealing that such factors could modulate capsid stability and thereby possibly infection142,143. One can also observe changes by hydrogen–deuterium exchange mass spectrometry, as for instance shown for MVM and dengue virus144,145; conformational fluctuations under equilibrium conditions of the closed MVM shells, termed ‘breathing’, have been observed144. Vibrational modes of solvated HIV-1 capsids were also studied by MD simulations, revealing the presence of breathing motions that simultaneously are directed radially as well as in the plane of the capsid surface146.

All-atom MD simulations of viruses147,148,149 — which have also been used to study the permeability of poliovirus capsids to water150 and the binding of ions to HIV-1 capsids146, for example — indicate that mutated capsids may collapse because of thermal fluctuations151. Relatedly, Brownian dynamics simulations of a coarse-grained representation of capsids revealed that they can melt, crumple or collapse under the action of thermal fluctuations, depending on the materials properties of the shell152 (Fig. 4e). The finite temperature phase behaviour is largely determined by the same FvK number that determines the response to mechanical deformation in the absence of thermal fluctuations. Viruses whose capsids lack structural order are known as pleiomorphic153, and some of these pleiomorphic viruses may well have capsids that are effectively in a liquid state154. A particularly interesting case is the family of archaeal viruses155. Certain archaeal viruses, whose capsids bear no resemblance to the classical rod and sphere viral geometries, can smoothly transform the Gaussian curvature of their capsids, which suggests that these capsids are in a liquid state153,154,155,156,157.

Conclusion

In this Review, we have discussed how recent developments in experimental and numerical techniques in the burgeoning field of physical virology have allowed a move from static descriptions of viral properties to descriptions of dynamical properties. With further refinement of the existing techniques, as well as the development of new techniques, in the coming years we expect large steps in terms of understanding the dynamical properties of viruses. Examples of potential new physical virology approaches in the experimental fields include developments in optical microscopy that allow for nanometre resolution imaging158, or in AFM, allowing the acquisition of millisecond to microsecond height spectroscopy data159 for real-time assembly studies43,160. Beyond using such techniques for an improved understanding of assembly and steady-state dynamics, it also would be interesting to perform assembly kinetics experiments in crowded environments, to mimic the actual situation in cells and to reveal whether depletion attraction and other crowding effects alter the results.

Numerical simulations of coarse-grained models of viral capsids and their genomes have played an important role in improving our physical understanding of the dynamics of viruses. All-atom MD simulations of small viruses are possible, and they provide important insights into the soft-mode dynamics and kinetic trapping of assembled viruses. Still, all-atom MD simulations of viral assembly (and disassembly) will not be practical in the foreseeable future because of the prohibitively long timescales of viral assembly (seconds to hours). To the extent that this is due to the presence of large activation energy barriers, metadynamics methods161 could make such simulations tractable. Separately, the combination of MD with Markov state modelling could provide a systematic road to the development of coarse-grained models of viral assembly162. Additionally, the development of kinetic models for the assembly of genome-filled particles, as already available for the case of empty capsids23,24, will be useful. Apart from the general physical insights that this research generates, in the wake of the crisis caused by COVID-19 the importance of fundamental knowledge of viral properties has been stressed again163,164,165,166. Physical virology approaches will help to pave the way to in-depth knowledge of the mechanisms behind viral reproduction and infection; they also promise to help in developing applications of viruses and VLPs in nanotechnology and nanomedicine.