Main

Carboxysomes in cyanobacteria are proteinaceous microcompartments that enclose the photosynthetic key enzyme Rubisco (ribulose-1,5-bisphosphate carboxylase/oxygenase), together with carbonic anhydrase (CA), to generate high CO2 levels for carbon fixation1,2,3,4. Implementation of a carboxysome-like CO2-concentrating mechanism (CCM) in chloroplasts as a strategy for improving photosynthetic efficiency in crop plants requires a detailed understanding of carboxysome biogenesis5,6,7,8,9,10,11,12,13,14,15. Recent advances have shown that, early in this process, specialized scaffolding proteins initiate phase separation of Rubisco into a condensate for subsequent encapsulation16,17,18,19. However, the mechanisms underlying the sequestration of CA together with Rubisco are not yet understood.

Form I Rubisco, a complex of eight large (RbcL) and eight small (RbcS) subunits (Fig. 1a), evolved in an atmosphere rich in CO2. The drop in CO2 levels 500 million years ago generated the evolutionary pressure for the development of a CCM in photosynthetic bacteria20. There are two forms of carboxysome, α and β, which differ in their components and probably evolved independently2,21,22. Their proteinaceous shell allows the entry of dissolved CO2 in the form of HCO3−, which is converted to CO2 by CA inside the carboxysome23 (Fig. 1b). The shell prevents CO2 from diffusing out24, resulting in the generation of high levels of CO2 in the vicinity of Rubisco20,22 and avoiding the competing side-reaction of Rubisco with oxygen (photorespiration)25.

Fig. 1: Condensate formation of M58 and CcaA.

a, Structure of hexadecameric Rubisco in surface representation (PDB: 4RUB). The RbcL8 core consists of a tetramer of antiparallel RbcL dimers, shown in two shades of green. The RbcS subunits are shown in light green, four bound on top and four at the bottom of the RbcL8 core. b, CCM of β-cyanobacteria (see above for details). 3PGA, 3-phosphoglycerate; RuBP, ribulose-1,5-bisphosphate; CA, carbonic anhydrase. c, Domain structure of M58, M35 and CcaA from β-cyanobacterial species Se7942. Amino acid numbers and molecular weights are indicated. The SSUL modules of M58 and M35 are numbered SSUL1–3 from the N to the C terminus. d,e, Salt dependence of condensate formation of CcaA and M58. Purified CcaA (0.5 μM) and M58 (0.5 μM) were incubated in buffer containing 50–400 mM KCl under either reducing (d) or oxidizing conditions (e) at 25 °C. Condensate formation was monitored by turbidity assay at 340 nm. Proteins alone were analyzed as control. Data are mean ± s.d. of triplicate measurements. f, Apparent binding affinity \((K_{\text D}^{\text{app}})\) of CcaA to M58. Turbidity was measured as above for 0.5 μM reduced M58 (M58red) or oxidized M58 (M58ox) with 0–1.25 μM CcaA. Absorbance values reached after 10 min are plotted. Data are mean ± s.d. of triplicate measurements. g,h, Condensate formation analyzed by fluorescence microscopy under reducing (g) and oxidizing (h) conditions. CcaA (0.25 μM) was mixed with 0.25 μM M58red or M58ox. M58 and CcaA were N-terminally labeled with Alexa532 and Alexa405, respectively (M58AF5; CcaAAF4), and used as 1:10 mixtures with unlabeled protein. Representative data of triplicate measurements are shown. d–f, Data on which the graphs are based are available as source data. A.U., arbitrary units.


β-Carboxysome biogenesis relies on the protein CcmM as a central organizing scaffold26,27,28. The full-length CcmM protein in Synechococcus elongatus PCC 7942 (Se7942) is a homotrimer of ~58-kDa subunits (SeM58). The N terminus of each M58 protomer consists of a γ-class carbonic anhydrase-like (γCAL) domain that has lost the functional motifs for CA activity27,29,30, followed by three Rubisco small subunit-like (SSUL) modules connected by flexible linkers (Fig. 1c). The γCAL domains mediate M58 trimer formation. A smaller isoform, SeM35 (~35 kDa), comprising only SSUL modules (Fig. 1c), is generated from an internal ribosome binding site31. In Se7942, CA activity is provided by a separate β-class CA protein, CcaA (~30 kDa)23,32 (Fig. 1c). Its deletion results in loss of CCM function, with cyanobacterial growth being dependent on high CO2 (5%)33,34. CcaA is redox regulated and active only in the oxidizing environment of the carboxysome35. It is recruited to carboxysomes by the γCAL domains of CcmM26,27 by an unknown mechanism.

We recently reported that the SSUL modules of SeM35 function to sequester Rubisco into a condensate16. This mode of condensate formation, through multivalent interactions of folded domains, differs from the use of intrinsically disordered, linear motifs as phase-separation scaffolds in α-carboxysomes and eukaryotic membraneless compartments17,36,37,38,39,40. Here we used a combined structural and biochemical approach to understand the interactions of SeM58 with CcaA and Rubisco. Our results reveal multiple, interwoven demixing and coassembly reactions with M58 functioning as a central organizer of the pre-carboxysome matrix. The γCAL domains of trimeric M58 recruit CcaA by binding the C-terminal peptide sequence of CcaA. Moreover, the high local concentration of SSUL modules in trimeric M58 provides enhanced avidity for Rubisco compared to M35. Additionally, SSUL modules engage in dynamic electrostatic interactions with γCAL domains. A head-to-head association of M58 trimers via their γCAL domains further increases local SSUL concentration. These interactions cooperate to facilitate the efficient multiprotein coassembly of CcmM (M58 and M35), CcaA and Rubisco for encapsulation into β-carboxysomes.

Results

CcmM–CcaA condensate formation

To investigate the interaction of CcmM and CcaA, we recombinantly expressed and purified CcmM (M58 and M35) and the CcaA of Se7942 (Extended Data Fig. 1a). Size-exclusion chromatography coupled to multiangle light scattering (SEC–MALS) confirmed that M58 is a trimer in solution while CcaA behaved as a tetramer (Extended Data Fig. 1b,c and Supplementary Table 1), consistent with βCAs functioning as dimers or higher-order oligomers26,41,42,43,44,45. We performed turbidity assays to monitor the interaction between M58 and CcaA. While no turbidity was detected for either protein alone (Fig. 1d,e), a strong turbidity signal was observed when M58 and CcaA were combined (Fig. 1d–f), consistent with condensate formation. Turbidity developed at a similar rate independent of the redox state of the M58 SSUL modules (t1/2 = ~0.6 min at 100 mM KCl) (Fig. 1d,e) with an apparent affinity \((K_{\text{D}}^{\text{app}})\) of CcaA for M58 of ~0.2 μM (Fig. 1f). The M58-CcaA interaction was impaired by high salt (Fig. 1d,e), indicating the involvement of electrostatic forces. Light microscopy of M58 or CcaA labeled N-terminally with fluorophores Alexa532 (M58AF5) and Alexa405 (CcaAAF4), and mixed 1:10 with the respective unlabeled protein, showed a diffuse distribution (Fig. 1g,h). When the two proteins were combined at equimolar concentration (0.25 μM), coassembly into fluorescent condensates with an average Feret’s diameter of ~1.0 μm was observed (Fig. 1g,h and Extended Data Fig. 2a). Fluorescence recovery after photobleaching (FRAP) experiments showed no recovery of fluorescence signal in either channel, indicating a strong association between M58 and CcaA (Extended Data Fig. 2b). No fusion of droplets was observed over a period of 20 min (Extended Data Fig. 2c and Supplementary Video 1), consistent with their low liquidity46 as indicated by FRAP.

In summary, condensate formation mediated by M58 provides a plausible mechanism for CcaA sequestration and recruitment into carboxysomes.
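An apparent affinity such as the \(K_{\text{D}}^{\text{app}}\) quoted above can be estimated from a titration curve like that in Fig. 1f. The following is a minimal sketch, not the analysis used in this study: it assumes a simple hyperbolic saturation model, A = A_max·[CcaA]/(K_D^app + [CcaA]), which ignores ligand depletion and cooperativity, and fits it by a coarse grid search in plain Python. The function names are hypothetical, and the data points are synthetic values generated from the model itself with K_D^app = 0.2 μM.

```python
def hyperbola(conc, a_max, kd):
    """One-site saturation model: signal as a function of titrant concentration."""
    return a_max * conc / (kd + conc)


def fit_kd_app(concs, signals, kd_grid):
    """Grid search over kd; for each kd the best a_max follows from linear least squares."""
    best = None
    for kd in kd_grid:
        basis = [c / (kd + c) for c in concs]
        denom = sum(b * b for b in basis)
        if denom == 0:
            continue  # no informative points for this kd
        a_max = sum(b * s for b, s in zip(basis, signals)) / denom
        sse = sum((a_max * b - s) ** 2 for b, s in zip(basis, signals))
        if best is None or sse < best[0]:
            best = (sse, kd, a_max)
    _, kd, a_max = best
    return kd, a_max


# Synthetic, noise-free titration (NOT data from the paper): 0-1.25 uM titrant.
concs = [0.0, 0.05, 0.1, 0.25, 0.5, 0.75, 1.0, 1.25]
signals = [hyperbola(c, a_max=1.0, kd=0.2) for c in concs]

kd_grid = [i / 1000 for i in range(1, 2001)]  # 0.001 to 2.000 uM in 1 nM steps
kd_fit, a_max_fit = fit_kd_app(concs, signals, kd_grid)
print(f"K_D(app) ~ {kd_fit:.3f} uM, A_max ~ {a_max_fit:.3f}")
```

With real turbidity data the fit would be noisier, and a model that accounts for the fixed M58 concentration may be more appropriate when the titrant and M58 concentrations are comparable.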

CcaA engages the γCAL domain of M58

The CcaA protein consists of an N-terminal βCA domain followed by a hydrophilic and intrinsically disordered C-terminal sequence of 60–70 residues2,23. This sequence contains two functionally important regions of ~15 residues, C1 and C2, separated by an unstructured, hydrophilic sequence of ~40–50 amino acids41,47 (Fig. 2a). C1 has been shown to be required for oligomerization and CcaA activity47, while the function of C2 remains unclear28,41. To understand the function of C2, we generated a CcaA mutant lacking the last 15 residues (CcaAΔC2) (Fig. 2a, Extended Data Fig. 1a and Supplementary Table 1). Notably, no turbidity was observed upon mixing CcaAΔC2 and M58 (Fig. 2b) and no fluorescent condensates formed (Extended Data Fig. 2d), suggesting that the C2 sequence mediates the interaction of CcaA with the γCAL domains of M58 (ref. 27).

Fig. 2: Structural requirements for M58–CcaA condensate formation.

a,b, Condensate formation is dependent on the C-terminal C2 peptide sequence of CcaA. a, Domain structure of CcaA and CcaAΔC2, lacking the C-terminal 15 residues of CcaA. Amino acid numbers are indicated. b, Turbidity assays as in Fig. 1d,e at 100 mM KCl with 0.5 μM CcaA or CcaAΔC2 and 0.5 μM M58red or M58ox. Data are mean ± s.d. of triplicate measurements. c, Domain structure of M58, γCAL198 and M35 constructs. Amino acid numbers are indicated. The SSUL modules of M58 and M35 are numbered SSUL1–3 from the N to the C terminus. d, Condensate formation requires the γCAL domains and SSUL modules of M58. Turbidity assays with the combinations of proteins indicated were performed at 100 mM KCl using 0.75 μM γCAL198, 0.5 μM CcaA, 0.5 μM M58red/M58ox and 2.25 μM M35. Note that the relative concentrations of M58 and M35 were adjusted to maintain SSUL modules at a similar concentration. Data are mean ± s.d. of triplicate measurements. e, Condensate formation of γCAL198 and CcaA requires high protein concentrations and the presence of salt. Turbidity assays with the combinations of proteins indicated were performed at 50–200 mM KCl using 5.0 μM CcaA and 7.5 μM γCAL198. Data are mean ± s.d. of triplicate measurements. f, The C217 sequence of CcaA is sufficient for CcaA binding to γCAL198. Left: EGFPC2 constructs containing either the C-terminal 15 or 17 residues of CcaA fused to eGFP. Right: purified proteins (22.5 μM) were incubated with γCAL198 (7.5 μM) (100 mM KCl) for 15 min at 25 °C, followed by analysis of complex formation by native-PAGE and Coomassie staining. Representative data of triplicate experiments are shown. b,d,e, Data are available as Source data.


To confirm this interaction, we recombinantly expressed and purified the γCAL domain of M58 (residues 1–198) (γCAL198) (Fig. 2c, Extended Data Fig. 1a and Supplementary Table 1). As expected, CcaA did not interact with M35, which lacks the γCAL domain, as monitored by turbidity assay (Fig. 2c,d). CcaA at a concentration of 0.5 μM also failed to interact detectably with the trimeric γCAL198 (0.75 μM) (Fig. 2d). However, increasing the concentrations of CcaA and γCAL198 tenfold resulted in the development of turbidity with slow kinetics (t1/2 = ~5 min) (Fig. 2e), while no interaction was observed with M35 (Extended Data Fig. 2e). Interestingly, unlike the salt-sensitive M58–CcaA interaction (Fig. 1d,e), complex formation of CcaA with γCAL198 was somewhat enhanced at higher salt concentration (Fig. 2e), suggesting a contribution of hydrophobic forces. Fluorescence microscopy revealed small CcaA–γCAL condensates on a background of diffusely distributed proteins (Extended Data Fig. 2f).

To test whether the C2 region of CcaA was sufficient to mediate binding to γCAL198, we attached the 15-residue C2 sequence (LAPEQQQRIYRGNAS) to enhanced green fluorescent protein (EGFP) via a short flexible linker (GSGGS) (EGFPC215) (Fig. 2f). No complex formation of EGFPC215 with γCAL198 was detected following analysis by native polyacrylamide gel electrophoresis (PAGE) (Fig. 2f). However, we noticed the presence of a tryptophan residue just N-terminal of the C2 sequence, which would increase the hydrophobicity of the sequence. Indeed, a GFP construct containing the last 17 residues of CcaA (GWLAPEQQQRIYRGNAS) (EGFPC217) readily bound to γCAL198 (Fig. 2f).

In summary, the C-terminal C2 sequence of CcaA interacts with the γCAL domains of M58, an interaction critical for the recruitment of CcaA into the pre-carboxysome.

Structural basis of the CcaA C2-γCAL interaction

To identify the C2 binding site on γCAL, we analyzed the γCAL198–C217 complex by X-ray crystallography. However, crystals of the complex diffracted to only ~3.5-Å resolution, presumably due to the presence of a break in helix α2 of γCAL at residue 181, as suggested by crystal structures of the γCA domain of Thermosynechococcus elongatus BP-1 (PDB: 3KWC; PDB: 3KWD)29 (Extended Data Fig. 3a,b). Indeed, SeγCAL 1–181 (γCAL181) (Extended Data Fig. 1a) produced well-diffracting crystals, allowing structure solution by molecular replacement (PDB: 3KWC) at 1.67-Å resolution (Fig. 3a, Table 1 and Extended Data Fig. 3c). The overall structure of the γCAL181 protomer is highly similar to that of TeγCA (PDB: 3KWD) (Cα root mean square deviation (RMSD), 0.48 Å) (Extended Data Fig. 3d): it consists of an N-terminal, seven-turn, left-handed β-helix followed by a short helix α1, a long linker and part of helix α2, which packs along one face of the β-helix (Fig. 3a). The asymmetric unit of the crystal contained the protomer of γCAL181, while γCAL exists as a trimer in solution (Supplementary Table 1). Indeed, analysis using proteins, interfaces, structures and assemblies (PISA)48 indicated an extensive interface between subunits (4,180 Å2 buried at the interface from a total accessible surface of 18,860 Å2) (Extended Data Fig. 3e).

Fig. 3: Structure of the γCAL–C2 complex.

a, Structural model of the Se7942 γCAL domain (residues 1–181) at 1.67-Å resolution, in ribbon representation. Secondary structural elements are indicated. b, Structural model of the γCAL181–C217 complex at 1.63-Å resolution. Left: γCAL domain (blue) is shown in transparent surface representation and the bound helical C217 peptide (pink) in ribbon representation. Right: model of the trimer of the γCAL181–C217 complex in transparent surface representation. c, Details of the interaction of C2 peptide with γCAL. Proteins are colored as in b and shown in ribbon representation, with critical amino acid residues in stick representation. Secondary structural elements and residue numbers are indicated. d, Point mutations of C217 peptide residues forming the interface with γCAL reduced or abolished the interaction between CcaA and M58. Turbidity assays with the combinations of proteins indicated were performed at 100 mM KCl using either 0.5 μM M58red/M58ox and 0.5 μM CcaA or the mutant proteins CcaA(W257A) or CcaA(R265D). Data are mean ± s.d. of triplicate measurements. d, Data are available as Source data.


Table 1 Data collection and refinement statistics

We also solved the structure of γCAL181 in complex with the C217 peptide of CcaA at a resolution of 1.63 Å (Fig. 3b, Table 1 and Extended Data Fig. 3f,g). The asymmetric unit of the crystal contained the protomer of γCAL181 with one peptide bound (Fig. 3b and Extended Data Fig. 3f). This indicates that all binding sites of the trimeric γCAL181 are occupied by peptide, providing the basis for multivalent network formation between M58 and CcaA (Fig. 3b). Analysis by isothermal titration calorimetry (ITC) using the monomeric EGFPC217 revealed a binding affinity (KD) of ~2 μM at a stoichiometry of ~2.7 per γCAL198 trimer (Extended Data Fig. 3h). The peptide is bound as a two-turn α-helix (residues PEQQQRIY) in a pocket beneath the protruding β10-β11 loop, making extensive interactions with residues in β11, β17, the β19-β20 loop and α1 of the γCAL protomer. The hydrophobic interactions include π-stacking between the indole ring of residue W257 in the C2 peptide with the benzene ring of F123 (β17) in γCAL (Fig. 3c). In addition, the hydrophobic residue Y267 of C2 contacts the γCAL residue L81 (β11) (Fig. 3c). Two electrostatic interactions (salt bridges) are formed by the guanidinium group of the peptide residue R265 with E143 (β20) and D163 (α1) of γCAL (Fig. 3c). The side chains of C2 residues Q262 and Q263 form hydrogen bonds with the backbone of D141 (β19-β20 loop) and the side chain of N124 (β17) in γCAL, respectively. The side chain of C2 residue Q262 also forms a hydrogen bond with the side chain of γCAL residue Q159 (α1). Moreover, the guanidinium group of R126 (β17) in γCAL forms hydrogen bonds with the backbone of I266 and Y267 in C2, and another hydrogen bond is formed between the backbone of C2 residue Y267 and L81 (β11) of γCAL (Fig. 3c). Notably, the binding site of C2 is highly conserved in the γCA/γCAL domains of CcmM proteins.
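The residue numbers used in this description can be cross-checked against the C217 peptide sequence (GWLAPEQQQRIYRGNAS). The sketch below is an illustration only: it assumes that the single tryptophan in this sequence is W257, as named in the text, and numbers the remaining residues from that anchor. The helper name `number_peptide` is hypothetical.

```python
def number_peptide(seq, anchor_residue, anchor_number):
    """Return {residue_number: one-letter code}, numbering from a single anchor residue.

    Assumes the anchor residue occurs exactly once in the sequence.
    """
    if seq.count(anchor_residue) != 1:
        raise ValueError("anchor residue must be unique in the sequence")
    offset = anchor_number - seq.index(anchor_residue)
    return {offset + i: aa for i, aa in enumerate(seq)}


C2_17 = "GWLAPEQQQRIYRGNAS"  # last 17 residues of CcaA, as given in the text
numbering = number_peptide(C2_17, "W", 257)  # assumption: the tryptophan is W257

# Peptide residues named in the structural description of the interface.
named = {257: "W", 262: "Q", 263: "Q", 265: "R", 266: "I", 267: "Y"}
mismatches = {n: (aa, numbering.get(n)) for n, aa in named.items() if numbering.get(n) != aa}

# Position of the two-turn helix PEQQQRIY within this numbering.
helix = "PEQQQRIY"
helix_start = min(numbering) + C2_17.index(helix)
helix_range = (helix_start, helix_start + len(helix) - 1)

print("mismatches:", mismatches)
print("helix residues:", helix_range)
print("last residue:", max(numbering))
```

Under this assumption every peptide residue named in the text matches the sequence, the helical segment spans residues 260–267, and the peptide ends at residue 272.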

To validate the contribution of residues W257 and R265 to the C2–γCAL interface, we generated the mutants W257A and R265D in EGFPC217 and CcaA. Gel shift assays showed that both mutant proteins failed to form a complex with trimeric γCAL (Extended Data Fig. 3i). M58–CcaA condensate formation monitored by turbidity assay was reduced by ~50% for CcaA(W257A) and abolished for CcaA(R265D) (Fig. 3d). These results underscore the specific contribution of hydrophobic and charge interactions forming the CcaA–M58 network.

In summary, assuming the absence of steric hindrance, the γCAL trimer in the context of M58 may interact with two or three CcaA tetramers via their C2 sequences as the basis for condensate formation.

Contribution of SSUL modules to M58-CcaA interaction

So far, our analysis had shown that the interaction between M58 and CcaA involves two components: (1) a salt-sensitive component detected with full-length CcmM (M58) and CcaA, and (2) interaction between the γCAL domain and the C2 peptide of CcaA, which has a notable hydrophobic component. To understand the salt-sensitive component in more detail, we investigated the interaction of CcaA with C-terminal truncation mutants of M58 containing either two (γCAL-2S) or one (γCAL-1S) SSUL modules (Fig. 4a and Supplementary Table 1). Condensate formation with CcaA was only mildly impaired with γCAL-2S but was strongly reduced with γCAL-1S (Fig. 4a), indicating that SSUL modules provide additional, critical valency for condensate formation. Since no binding of CcaA to M35 was observed (Fig. 2d and Extended Data Fig. 2e), this raised the question of how SSUL modules contribute to CcaA–M58 complex formation. Might the SSUL modules mediate homo-oligomeric interactions between M58 trimers?

Fig. 4: Association between SSUL modules and γCAL domains of M58.

a, Top: schematic representation of wild-type M58 and truncation mutants containing two or one SSUL modules (γCAL-2S and γCAL-1S, respectively). Residue numbers are indicated. Bottom: contribution of SSUL modules to M58–CcaA condensate formation. Turbidity was monitored over time after mixing CcaA (0.5 μM) with the reduced and oxidized wild-type M58 or truncated M58 constructs (0.5 μM) indicated at 100 mM KCl. Data are mean ± s.d. of triplicate measurements. b, Concentration dependence of M58 homocondensate formation. Turbidity assays were performed following dilution of M58ox from high salt (500 mM KCl) to reach final protein concentrations of 0.5–3.0 μM and a salt concentration of 50 mM KCl. M58red (5 and 6 μM) was also analyzed. Data are mean ± s.d. of triplicate measurements. c, M58 homocondensate formation is strongly enhanced under oxidizing conditions. M58ox and M58red (2.5 μM) homodemixing was analyzed by fluorescence microscopy as in Fig. 1g,h. M58 was N-terminally labeled with fluorophore Alexa532 (M58AF5) and used as a 1:10 mixture with unlabeled protein. Representative data of triplicate experiments are shown. d, M58ox homocondensate formation is mediated by the SSUL modules of M58. Turbidity assays were performed for reactions containing M58ox, γCAL-2Sox and γCAL-1Sox (3 μM each at 50 mM KCl). Data are mean ± s.d. of triplicate measurements. e, Mutations of charged residues in SSUL1, SSUL2 or SSUL3 inhibit or enhance M58ox homocondensate formation, measured by turbidity assay as in d. Mutations of negative to positive (D/E to K, blue triangles) enhance the interaction while mutations of positive to negative (R to D, red triangles) are inhibitory. Numbers of mutated residues are indicated. Representative data of two independent measurements are shown. f, Point mutations of charged residues in the γCAL domain reduce or enhance M58ox homocondensate formation, measured by turbidity assay as in e. Mutations of positive to negative (K/R to D, red triangles) enhanced the interaction and mutations of negative to positive (D/E to K, blue triangles) were inhibitory. Numbers of mutated residues are indicated. Representative data of two independent measurements are shown. g, Putative binding region of the SSUL module (yellow) mapped on the γCAL181–C2 protomer (PDB: 7O54). h, Head-to-head association of M58 trimers. The structural model of the dimer of γCAL trimers from the crystal structure was fitted into the cryo-EM density map from the M58ox condensate. The structure is shown in ribbon representation. Additional densities at the corners probably represent SSUL modules that associate in a dynamic manner. The cryo-EM density map was low-pass filtered to 5 Å to show these additional densities. i, Detail of the structural model from the crystal structure of SeγCAL181 showing the two salt bridges connecting protomers of opposing trimers. The side chains of residues R164 and D172 forming critical salt bridges are shown in stick representation. j, Head-to-head association of M58 trimers is required for efficient M58ox homocondensate formation. M58ox and mutants M58oxR164D and M58oxD172K, disrupting the salt bridges between M58 trimers, were analyzed by turbidity assay as in d at 3 μM final protein concentration (50 mM KCl). Data are mean ± s.d. of triplicate measurements. k, Head-to-head association of M58 trimers is not required for M58–CcaA condensate formation. M58 and mutants M58(R164D) and M58(D172K) (0.5 μM) were incubated with CcaA (0.5 μM) under reducing and oxidizing conditions, and condensate formation analyzed by turbidity assay at 100 mM KCl. Data are mean ± s.d. of triplicate measurements. a,b,d–f,j,k, Data are available as source data.


To address this possibility, we analyzed whether M58 can undergo condensate formation on its own. Reasoning that interactions involving SSUL modules might be more salt sensitive16, we conducted this analysis at a reduced salt concentration (50 mM KCl) and at a higher protein concentration compared to the experiments above (0.5 μM M58 in Fig. 1d,e). Interestingly, we observed homodemixing of both reduced and oxidized M58 (M58red and M58ox, respectively), with M58red requiring somewhat higher concentrations (Fig. 4b and Extended Data Fig. 4a). Fluorescence microscopy showed that condensate formation by M58ox was enhanced compared to M58red (Fig. 4c and Extended Data Fig. 4b). As in the case of the M58–CcaA condensate, there was no measurable recovery by FRAP (Extended Data Fig. 4c) and no droplet fusion (Extended Data Fig. 4d and Supplementary Video 2). Notably, mutation of the disulfide-forming cysteines in SSUL1 and SSUL2 to serine prevented homodemixing (Extended Data Fig. 4e), consistent with the redox dependence of the process.

The redox dependence of M58 homodemixing suggested involvement of SSUL modules in mediation of homotypic interactions. Indeed, homodemixing proved to be highly salt sensitive (Extended Data Fig. 4f,g), which might explain the salt-sensitive component of the CcaA–M58 interaction. Moreover, while removal of one SSUL module from M58 (γCAL-2S) had only a minor effect, removal of two SSUL modules (γCAL-1S) completely abolished M58ox homodemixing (Fig. 4d), suggesting that SSUL modules play a role in mediation of the interaction between M58 trimers. Since M35 on its own did not undergo phase separation (Extended Data Fig. 2e), such interactions would have to be specific for trimeric M58. To test this, we engineered an M35 trimer by fusing a trimeric coiled-coil sequence49 to the M35 N terminus (CCTRIM35) (Extended Data Fig. 4h and Supplementary Table 1). However, no turbidity signal was detectable with CCTRIM35, even at high concentrations and low salt (Extended Data Fig. 4h), pointing to an interaction of SSUL modules with the γCAL domains in M58 and not between SSUL modules.

In summary, SSUL modules contribute to M58–CcaA condensate formation, apparently by mediation of salt-sensitive intermolecular interactions with the γCAL domains of neighboring M58 trimers, which allow M58 homodemixing.

Charge-charge interactions between SSUL and γCAL

Both the SSUL and γCAL domains expose multiple charged residues, which are characterized by a high degree of conservation (Extended Data Fig. 5a–d). Individual mutations of arginine residues R251, R252 or R298 in SSUL1, R367 in SSUL2 or R481 in SSUL3 to aspartate essentially abolished homodemixing of M58ox (Fig. 4e and Extended Data Fig. 5e), indicating that all three SSUL modules contribute. Moreover, individual mutations of the conserved negatively charged residues E246, D249, D294 and E303 in SSUL1 to lysine resulted in enhanced turbidity, whereas the mutant E286K behaved like wild type (Fig. 4e). These results suggested that a region of SSUL with several exposed positively charged residues would promote interaction with the γCAL domains of M58. One good candidate was the area containing residues R251, R252 and R254 (Extended Data Fig. 5b). Indeed, point mutation of the spatially close, negatively charged residue D249 to lysine enhanced M58 homodemixing (Fig. 4e and Extended Data Fig. 5b). Interestingly, this region of SSUL is also involved in the interaction of the SSUL modules with Rubisco16.

To identify the complementary surface of the γCAL domain that may interact with SSUL, we individually mutated the negatively charged γCAL residues E17, D21, D35, E76 or D112 to lysine, and the positively charged residues R37, R43, K62, R79, R95 or R126 to aspartate (Extended Data Fig. 5e). Among these residues, E17, D35, E76, R79, D112 and R126 are relatively conserved (Extended Data Fig. 5c). As expected, the effect of these charge mutations was reversed (Fig. 4f) compared to the mutations in SSUL (Fig. 4e): converting positive charges to negative on γCAL enhanced M58ox homodemixing (except for mutant R43D) while changing negative charges to positive reduced condensate formation (except for mutant D21K) (Fig. 4f and Extended Data Fig. 5d). Note that the mutations D21K and R43D, which essentially preserved wild-type binding, are located at the edge of the putative surface for γCAL trimer formation and are also not highly conserved (Fig. 4f and Extended Data Fig. 5c,d). Mutational analysis suggests that negatively charged residues E17, D35 and E76, spatially located above the β10-β11 loop of the γCAL domain, form the intermolecular interaction site for SSUL modules (Extended Data Fig. 5d). Indeed, point mutation of the nearby positively charged residues, R37 and R79, to aspartate strongly promoted M58 homodemixing (Fig. 4f). Interestingly, it appears that the binding region on the γCAL domain for SSUL does not overlap with the site for binding the C2 peptide of CcaA, which is located below the β10-β11 loop (Fig. 4g). This is consistent with the observation that both SSUL modules and CcaA binding via the C2 peptide are required for efficient formation of the M58–CcaA condensate (Fig. 4a). Note that the trimeric state of M58 was maintained for all SSUL and γCAL domain mutants (Supplementary Table 1).

In summary, the intermolecular interaction between SSUL modules and the γCAL domains of M58 trimers involves a complex interplay of attractive and repulsive forces, consistent with single-charge reversal mutations having reducing or enhancing effects (Fig. 4e,f).

Head-to-head association of γCAL trimers

To analyze the structural basis of the intermolecular interactions of M58, we performed cryo-EM of M58ox. Reference-free, two-dimensional (2D) class averages revealed a class of barrel-shaped complexes with dimensions of ~5 × 8.2 nm2, consistent with two stacked γCAL trimers in side view (Extended Data Fig. 6a,b). Notably, a head-to-head association of γCA trimeric domains is present in the asymmetric unit of the TeγCA crystal (PDB: 3KWC)29, and such an interaction is also observed in the molecular packing of our γCAL181 and γCAL181–C217 crystals. A three-dimensional (3D) classification without imposed symmetry resulted in an EM density map of ~3.6-Å resolution (Fig. 4h, Table 2 and Extended Data Fig. 6).

Table 2 Cryo-EM data collection and map resolution

In the cryo-EM density map there was substantial information loss in side views, due to a preferential end view orientation of the particles (Extended Data Fig. 6b). However, the three seven-turn β-helices and many bulky side chains were well resolved in the end view of the density map (Fig. 4h and Extended Data Fig. 6e), thus allowing docking of the stacked γCAL181 trimers from the crystallographic model (Extended Data Fig. 3e). Additional densities were seen to protrude from the edges of the complex above the β10-β11 loop (Fig. 4h), probably representing SSUL modules interacting either intra- or intermolecularly with the γCAL domains. Note that while SSUL2 and SSUL3 may function preferentially to form intermolecular contacts, our mutational analysis showed that all three SSUL modules participate in M58 homodemixing (Fig. 4e). The putative SSUL densities are smaller than the size of the SSUL module and are of low resolution, suggesting a dynamic interaction that precluded docking of bound SSUL. This dynamicity would allow SSUL modules to function in both M58 homodemixing and Rubisco sequestration.

Both γCAL181 and γCAL181–C217 crystals revealed the presence of two protomer–protomer salt bridges across the dimer interface formed by the conserved residues R164 and D172 (Fig. 4i). Thus, with three opposing protomer pairs, a total of six salt bridges stabilizes the dimer of γCAL trimers. Note that γCAL198 and γCAL181 are nevertheless trimeric in solution (Supplementary Table 1), suggesting that the head-to-head association occurs only at high protein concentrations within the condensate or crystal. To investigate the functional relevance of this interaction, we disrupted the salt bridges by either mutating R164 to aspartate or D172 to lysine. Strikingly, both R164D and D172K mutants strongly reduced homodemixing of M58 (Fig. 4j), indicating that dimer-of-trimer formation via the γCAL domains provides critical valency, presumably by increasing the local concentration of SSUL modules. Dimerization of hub proteins has been reported to increase valency in other condensate systems as well50. In contrast, disruption of the M58 head-to-head dimer did not affect the interaction of CcaA with M58 (Fig. 4k). Here, sufficient avidity is presumably maintained by the intermolecular M58 interactions mediated by SSUL binding to γCAL domains.

In summary, a cooperative network of fluctuating interactions ensures recruitment of CcaA into carboxysomes: (1) the extreme C-terminal sequence (C2) of CcaA binds the γCAL domain of M58, driven by a combination of hydrophobic and electrostatic interactions; (2) the SSUL modules of M58 engage the γCAL domains of adjacent M58 trimers via dynamic multivalent electrostatic interactions, with C2 and SSUL binding to distinct regions on γCAL; and (3) M58 undergoes homodemixing mediated by both intermolecular γCAL-SSUL interactions and a head-to-head association via γCAL trimers.

M58 binds Rubisco with high affinity

The SSUL modules of M35 have recently been shown to link Rubisco holoenzymes16. To investigate how the trimeric M58 interacts with Rubisco, we first determined the apparent affinities of M58red and M58ox for Rubisco. M58 in both redox states displayed essentially identical apparent affinities \((K_{\text{D}}^{\text{app}})\) of ~0.07 μM for Rubisco at 50 mM KCl (Fig. 5a). This interaction was ~three- to tenfold stronger than that of M35 with Rubisco (\(K_{\text{D}}^{\text{app}}\) of ~0.2 and ~1 μM under reducing and oxidizing conditions, respectively)16, presumably due to the increased local concentration of SSUL modules in the M58 trimer. We confirmed this using the trimeric M35 construct CCTRIM35, which also showed high binding affinity for Rubisco (Fig. 5a). The trimeric γCAL198 alone, lacking SSUL modules, failed to interact with Rubisco (Fig. 5b,c). Moreover, the interaction of M58 with Rubisco proved to be salt resistant (up to 300 and 200 mM KCl for M58red and M58ox, respectively; Extended Data Fig. 7a,b), in contrast to the salt-sensitive interaction of M35 with Rubisco16 (Fig. 5b,c). Notably, CCTRIM35 mimicked the salt resistance observed with M58 (Extended Data Fig. 7c,d), further demonstrating that this property is due to the high local concentration of SSUL modules in the trimer.

Fig. 5: Trimeric M58 binds Rubisco with high affinity.

a, CCTRIM35 mimics the high affinity of M58 for Rubisco, independent of redox state. Condensate formation of 0.25 μM Rubisco (RbcL8S8) with increasing concentrations of M58red, M58ox, CCTRIM35red and CCTRIM35ox was analyzed by turbidity at 50 mM KCl. Data are mean ± s.d. of triplicate measurements. b,c, M58–Rubisco condensate formation is mediated by the SSUL modules of M58, not by the γCAL domains. Condensate formation of 0.25 μM RbcL8S8 with M58 (0.25 μM), CCTRIM35 (0.25 μM) or M35 (0.75 μM) was analyzed by turbidity assay under reducing (b) and oxidizing conditions (c) at 50 and 100 mM KCl. Data are mean ± s.d. of triplicate measurements. d,e, Condensate formation of Rubisco with M58red (d) and absence of interaction of Rubisco with CcaA (e), as analyzed by fluorescence microscopy. M58, CcaA and Rubisco were N-terminally labeled with Alexa532, Alexa405 and Alexa647, respectively (M58AF5; CcaAAF4; RbcAF6), and used as 1:10 mixtures with unlabeled protein. Protein concentrations are indicated. Representative data of two independent experiments are shown. f, Time-lapse images of droplet fusion. A representative droplet fusion event of the M58red–Rubisco condensate (0.25 μM M58red/0.25 μM Rubisco) in the presence of 100 mM KCl (from Supplementary Video 3) is shown. M58red/AF5 fluorescence was detected. Scale bar, 2 μm. g, Mobility of M35 and M58 in condensates with Rubisco. FRAP experiments are shown for condensates formed by unlabeled Rubisco with either labeled M35red or M58red (0.5 μM Rubisco/2.0 μM M35red or 0.25 μM M58red; 100 mM KCl). Prebleach fluorescence is set to 1. Mean ± s.d. from n = 20 droplets. h, Mobility of Rubisco in condensates with M35 or M58. FRAP experiments are shown for condensates formed by labeled Rubisco with either unlabeled M35red or M58red. Concentrations as in g. Prebleach fluorescence is set to 1. Mean ± s.d. from n = 20 droplets. a–c,g,h, Data are available as Source data.


Cryo-EM analysis confirmed that the SSUL modules of M58 bind Rubisco in a groove formed between two RbcL subunits and the adjacent RbcS16 (Table 2 and Extended Data Fig. 7e–h). Interestingly, unlike M35, M58 also bound the RbcL8 core complex of Rubisco, albeit with ~fourfold lower affinity than the holoenzyme (Extended Data Fig. 8a–c). As revealed by cryo-EM analysis, the SSUL module occupied the same binding site on RbcL8 as in the holoenzyme and did not use the RbcS binding region (Table 2 and Extended Data Fig. 8d–g), contrary to a recent suggestion51. Thus, the high local SSUL concentration on M58 can compensate for the missing interaction with RbcS16. However, the interaction of M58 with RbcL8 is unlikely to be important in vivo because Rubisco assembly factors, such as RbcX and/or Raf1 (refs. 52,53,54), bind the RbcL8 core and are displaced only after RbcS binding16. Indeed, the presence of Raf1 completely suppressed the binding of M58 to RbcL8 (Extended Data Fig. 8h), thus ensuring that only the Rubisco holoenzyme is recruited into carboxysomes.

Fluorescence microscopy of Rubisco (N-terminally labeled with Alexa647 (RubiscoAF6) and mixed 1:10 with the unlabeled protein) and M58 (M58AF5:M58, 1:10) at equimolar concentration resulted in colocalization of both proteins within droplet-shaped condensates (Fig. 5d), while no demixing was observed with Rubisco and CcaA (1:10 of CcaAAF4:CcaA) (Fig. 5e). M58–Rubisco droplets underwent fusion at a rate similar to that for M35–Rubisco condensates16 (Fig. 5f and Supplementary Videos 3 and 4). Interestingly, FRAP experiments on M58–Rubisco droplets nevertheless showed no recovery of fluorescence signal for either M58 (Fig. 5g) or Rubisco (Fig. 5h). Thus, the M58–Rubisco interaction, while sufficiently fluctuating to allow droplet fusion, was rather strong. This contrasts with the M35–Rubisco condensate, where M35 was relatively mobile (Fig. 5g) while Rubisco was immobile (Fig. 5h).

In summary, the high local concentration of SSUL modules in trimeric M58 increases the avidity for Rubisco and provides for a higher binding affinity and redox independence compared to monomeric M35. The binding site of SSUL on Rubisco remains the same for M58 and M35.

Efficient coassembly of M58, M35, CcaA and Rubisco

Carboxysome biogenesis requires the scaffolding proteins M58 and M35 to sequester Rubisco and CcaA in the reducing cytosol. We next analyzed whether the complex interactions described above allow coassembly of these proteins into a distinct condensate. As shown above, M58 undergoes condensate formation with CcaA and Rubisco while M35 forms a condensate only with Rubisco. We first investigated whether Rubisco, M58 and CcaA can coassemble. Assuming that CcaA is substoichiometric to M58 in the carboxysome26, we performed a sedimentation assay of the three proteins keeping the concentration of Rubisco and M58 constant (0.25 μM each) and varying the concentration of CcaA. At an equimolar ratio, all three proteins were recovered in the pellet fraction with only a small amount of CcaA remaining in the supernatant (Extended Data Fig. 9a), indicative of highly efficient sequestration. Using this condition, fluorescence microscopy demonstrated the colocalization of all three proteins (differentially labeled) into a phase-separated condensate (Fig. 6a).

Fig. 6: Coassembly of Rubisco, M58, M35 and CcaA.

a, Coassembly into condensates of Rubisco, M58red and CcaA at the concentrations indicated (100 mM KCl). Proteins were labeled as in Fig. 5d,e. Representative data of two independent experiments are shown. b, Coassembly into condensates of fluorescence-labeled Rubisco, M35red, and M58red with unlabeled CcaA at the concentrations indicated (100 mM KCl). M35 was N-terminally labeled with Alexa405 (M35AF4). Representative data of two independent experiments are shown. c, Coassembly of fluorescence-labeled Rubisco, M58red and CcaA with unlabeled M35red at the concentrations indicated (100 mM KCl). Representative data of two independent experiments are shown. d, Coassembly of fluorescence-labeled Rubisco, CCTRIM35red and CcaA with unlabeled M35red. CCTRIM35 was N-terminally labeled with Alexa532 (CCTRIM35AF5). CcaA does not enter the condensate in the absence of M58. Representative data of two independent experiments are shown. e, Time-lapse images of droplet fusion. A representative droplet fusion event of four-protein condensates (0.5 μM Rubisco/2 μM M35red/0.25 μM M58red/0.25 μM CcaA), in the presence of 100 mM KCl, from Supplementary Video 5 is shown. M58red/AF5 fluorescence is detected. Scale bar, 2 μm. f, Mobility of M35 or M58 in condensates of Rubisco, CcaA, M58 and M35. FRAP experiments are shown for condensates formed by the four proteins. Either M35 or M58 was labeled (0.5 μM Rubisco/2 μM M35red/0.25 μM M58red/0.25 μM CcaA; 100 mM KCl). Prebleach fluorescence is set to 1. Mean ± s.d. from n = 20 droplets. g, Rubisco (RbcL8S8) carboxylation activity in condensates with M35, M58 or M35/M58/CcaA. The activity of RbcL8S8 is set to 100%, with RbcL8 as control. Individual data points and error bars represent mean ± s.d. of triplicate measurements. h, Versatility of M58-mediated protein interactions in multiprotein phase separation. Left: interaction of trimeric M58 (blue) with Rubisco (green) mediated by SSUL modules of M58 (yellow); middle: interaction of trimeric γCAL domains (blue) of M58 with the C2 sequence of tetrameric CcaA (orange); the C2 peptide is shown in pink; right: M58 homocondensate formation mediated by intermolecular interactions between SSUL modules and γCAL domains and by head-to-head association of γCAL trimers. Dotted lines indicate linker sequences between protein domains interacting with nearby molecules. Effects of increasing salt concentration and redox state of SSUL modules on interactions are indicated. Arrows down and up, reduced or enhanced interaction with increasing salt concentration, respectively. f,g, Data are available as Source data.


M35 is more abundant in carboxysomes than M58 (ref. 26). Following the addition of excess M35 (2 μM) to the coassembly reaction of Rubisco/M58/CcaA (0.25, 0.25 and 0.125 μM, respectively), all four proteins were recovered in the pellet fraction following sedimentation, with ~50% of M35 remaining in the supernatant (Extended Data Fig. 9b). Fluorescence microscopy using three fluorophores to label either Rubisco, M35 and M58, or Rubisco, M58 and CcaA showed that all four proteins efficiently colocalized (Fig. 6b,c). When M58 was replaced with CCTRIM35, CcaA no longer phase separated and was diffusely distributed (Fig. 6d). The average Feret’s diameter of the condensates varied from ~1.0 to ~2.5 μm depending on total protein concentration (Supplementary Table 2). Droplet fusion of the four-protein condensate occurred at only a very slow rate, indicating low fluidity (Fig. 6e and Supplementary Video 5). M35 mobility by FRAP was somewhat reduced in the condensate of the four proteins compared to the interaction of M35 and Rubisco, while M58 was immobile (Fig. 6f). Notably, the Rubisco enzyme was fully functional in carbon fixation within the condensates (Fig. 6g).

In summary, the scaffolding proteins M58 and M35, differing in binding affinity and dynamics, ensure the efficient sequestration of Rubisco and CcaA for copackaging into carboxysomes. M35 is the only component in the four-protein pre-carboxysome condensate that shows detectable mobility.

Discussion

β-Carboxysome biogenesis involves the sequestration of Rubisco together with the carbonic anhydrase CcaA, followed by shell formation55. Our biochemical and structural analysis elucidated how the scaffolding protein CcmM functions as a central organizer in recruiting Rubisco and CcaA into the pre-carboxysome core. CcmM orchestrates multiple, interwoven coassembly reactions via its γCAL and SSUL domains, resulting in the formation of an essentially immobile protein mesh. Once captured under reducing conditions, constituent carboxysome components cannot escape, facilitating efficient encapsulation. The low dynamics and slow fusion rate of the condensates may be relevant in limiting pre-carboxysome size before shell formation, because aberrantly large carboxysomes are less efficient in the CO2-concentrating mechanism16,56.

The ~21-kDa β-helical γCAL domain of M58 (CcmM) is remarkably versatile and participates in network formation through multiple cooperative interactions (Fig. 6h). As seen in the crystal structure, each protomer of the trimeric γCAL can bind the 17-residue C2 peptide of one protomer of a CcaA tetramer via specific hydrophobic and charge interactions. However, γCAL-C2 interactions alone are inefficient in mediation of M58–CcaA condensate formation, which requires additional charge interactions of SSUL modules with γCAL at a site distinct from the binding pocket of the C2 peptide of CcaA (Fig. 4g). The multivalency of the network is enhanced further by the ability of γCAL trimers to form head-to-head dimers (Fig. 6h). Similar interactions underlying condensate formation have been reported for other systems to result in relatively low dynamic assemblies40,50,57.

Recruitment of Rubisco for carboxysome biogenesis is solely mediated by the SSUL modules of M58 and M35, which bind in a groove at the interface between antiparallel RbcL dimers16. M58 has a substantially higher affinity for Rubisco than M35, due to the presence of nine SSUL modules per M58 trimer compared to only three in M35. The flexible linkers between SSUL modules apparently do not contribute directly to the interaction, but may play a role in balancing the entropic penalty of SSUL binding. The density, and thus avidity, of SSUL modules would be further increased by the γCAL-mediated head-to-head association of M58 trimers. What then is the role of M35, which is present in excess over M58 (ref. 26), and why are both proteins essential for carboxysome biogenesis31? We suggest that the differential redox regulation of M35 and M58 is important in converting the immobile pre-carboxysome condensate, required for initial capture of Rubisco and CcaA, into a more dynamic state in the oxidizing interior of the carboxysome. This redox regulation is mediated by disulfide bond formation in SSUL modules, which is critical for carboxysome biogenesis and CCM function in vivo16. Oxidation favors the interaction of SSUL modules with γCAL domains, thereby enhancing M58 homodemixing. Indeed, under oxidizing conditions, preformed M58ox condensates were maintained within more dilute and enlarged M58ox–Rubisco droplets (Extended Data Fig. 10). The restructuring of the pre-carboxysome condensate following oxidation may promote the formation of channels around the Rubisco lattice that can be navigated by other carboxysomal proteins and metabolites. This would also facilitate the metabolic repair of Rubisco by Rubisco activase, which possesses SSUL modules for recruitment into the pre-carboxysome matrix19.

In summary, our findings suggest the following model for pre-carboxysome formation in β-cyanobacteria: in the reducing cytosol, M58 cooperates with M35 to efficiently concentrate Rubisco and CcaA into an immobile matrix for subsequent encapsulation. Redox regulation of SSUL modules in the oxidizing carboxysome then favors homodemixing of M58 and renders the interaction of M35 with Rubisco more dynamic. This transition is required for CCM function.

Methods

Strains

Escherichia coli DH5α (ThermoFisher) cells were used for the amplification of plasmid DNA. Positive clones were selected and cultivated in lysogeny broth (LB) medium at 37 °C for 8 h. E. coli BL21 (DE3) (Agilent) was used for recombinant protein expression (see below).

The cyanobacterium S. elongatus PCC 7942 (Se7942) (Institut Pasteur Paris) was used to obtain genomic DNA of ccmM and ccaA. Se7942 was cultured in BG-11 medium at 30 °C and 50 r.p.m. under continuous light.

Plasmids

The oligos used for amplification and generation of plasmids are listed in Supplementary Table 3.

Genomic DNA

Se7942 was grown to high density and cells pelleted by centrifugation at 10,000g for 10 min. The cell pellet was resuspended in 100 ml of buffer (50 mM Tris-HCl pH 8.0/50 mM NaCl) and cells lysed by five cycles of heating (3 min at 95 °C) and snap-freezing in liquid nitrogen. The lysate was centrifuged (20,000g for 10 min) and 1 μl of supernatant was used as template in PCR reactions. The full-length ccmM gene was amplified using oligo nos. 1/2 and the ccaA gene using oligo nos. 67/68 (Supplementary Table 3).

Plasmids

The pHUE vector for His6-ubiquitin (H6Ub) fusion proteins58,59 was used to generate the plasmids used in this study. Plasmids were assembled by PCR and using the Gibson assembly cloning kit (NEB). The plasmids used in this study are listed in Supplementary Table 4 and are available upon request from the corresponding author.

pHUE-SeM58 was generated by amplification of the full-length ccmM gene from genomic DNA (Se7942) and subsequent cloning into the pHUE vector. The shorter constructs pHUE-SeγCAL-2S(1–429), pHUE-SeγCAL-1S(1–313), pHUE-SeγCAL(1–198) and pHUE-SeγCAL(1–181) were prepared by PCR from pHUE-SeM58 and cloned into the pHUE vector. Point mutations in pHUE-SeM58 were introduced by QuikChange mutagenesis (Agilent) to generate the following constructs: pHUE-SeM58-E17K; pHUE-SeM58-D21K; pHUE-SeM58-D35K; pHUE-SeM58-R37D; pHUE-SeM58-R43D; pHUE-SeM58-K62D; pHUE-SeM58-E76K; pHUE-SeM58-R79D; pHUE-SeM58-R95D; pHUE-SeM58-D112K; pHUE-SeM58-R126D; pHUE-SeM58-R164D; pHUE-SeM58-D172K; pHUE-SeM58-E246K; pHUE-SeM58-D249K; pHUE-SeM58-R251D; pHUE-SeM58-R252D; pHUE-SeM58-E286K; pHUE-SeM58-D294K; pHUE-SeM58-R298D; pHUE-SeM58-E303K; pHUE-SeM58-R367D; and pHUE-SeM58-R481D.

pHUE-SeM58-C4S (C261S/C279S/C377S/C395S) was generated by replacing the M35 fragment of pHUE-SeM58 with M35-C4S from the plasmid pHUE-Syn6301_ccmM_M35_C261S/C279S/C377S/C395S16, by PCR and subsequent Gibson assembly.

To generate the construct pHUE-CCTRIM35, the trimeric coiled-coil sequence GEIAAIKQEIAAIKKEIAAIKQEIAAIKQGS49 was inserted into the plasmid pHUE-Syn6301_ccmM_M35 (ref. 16) between the C terminus of ubiquitin and the N terminus of M35, by PCR and subsequent Gibson assembly.

pHUE-SeCcaA was generated by amplification of the ccaA gene from genomic DNA (Se7942) and subsequent cloning into the pHUE vector. pHUE-SeCcaAΔC2 and pHUE-C217 were generated by cloning the first 257 residues (1–257) or the last 17 residues (256–272) of SeCcaA, respectively, from pHUE-SeCcaA into the pHUE vector by PCR and subsequent Gibson assembly. Point mutations in pHUE-SeCcaA were introduced by QuikChange mutagenesis (Agilent) to generate the constructs pHUE-SeCcaA(W257A) and pHUE-SeCcaA(R265D).

pHUE-EGFP was generated by amplification of EGFP from pEF-gfp (Addgene) with GSGGS at the C terminus and subsequent cloning into pHUE. pHUE-EGFPC215 and pHUE-EGFPC217 were generated by replacing residues 1–257 or 1–255 of SeCcaA, respectively, in pHUE-SeCcaA with EGFP-GSGGS by PCR and subsequent Gibson assembly. Point mutations in pHUE-EGFPC217 were introduced by QuikChange mutagenesis (Agilent) to generate the constructs pHUE-EGFPC217(W257A) and pHUE-EGFPC217(R265D).

Protein expression and purification

The proteins SeRubisco52,60, SeRbcL8 (refs. 52,60), SeM35 (ref. 16) and SeRaf1 (ref. 54) were expressed and purified as previously described. Protein concentrations were determined spectrophotometrically at 280 nm.

M58 and mutants

M58 was expressed and purified from E. coli BL21 (DE3) cells harboring the pHUE-SeM58 plasmid. Briefly, cells were grown in LB medium at 37 °C with shaking at 130 r.p.m. to optical density (OD600) 0.4–0.5. Cells were equilibrated to 18 °C (~1 h) and protein expression was induced by the addition of 0.2 mM isopropyl β-D-1-thiogalactopyranoside for 14 h at 120 r.p.m. Cells were harvested and incubated in buffer A (50 mM Tris-HCl pH 8.0/500 mM NaCl/5% glycerol) containing 20 mM imidazole/1 g l–1 lysozyme/2.5 U ml–1 SmDNAse/complete protease inhibitor cocktail (Roche) for 1 h before lysis using EmulsiFlex C5 (Avestin, Inc.). After high-speed centrifugation (40,000g/40 min/4 °C), the supernatant was loaded on to a gravity Ni-NTA metal affinity column (Qiagen), equilibrated and washed with ten column volumes of buffer A/20 mM imidazole. The bound protein was eluted with buffer A/300 mM imidazole. The H6Ub moiety was cleaved using H6-Usp2 overnight at 4 °C59. The cleaved protein was buffer exchanged on a HiPrep 26/10 desalting column (GE) to buffer A. The protein eluate was then applied to a Ni-NTA column for removal of H6-Usp2, the cleaved H6Ub moiety and any uncleaved protein. Flowthrough was concentrated to ~3 ml and applied to a size-exclusion column (HiLoad 16/60 Superdex 200, GE) equilibrated in buffer B (50 mM Tris-HCl pH 8.0/500 mM KCl/10% glycerol). Protein-containing fractions were concentrated by ultrafiltration using Vivaspin MWCO 30000 (GE), aliquoted and flash-frozen in liquid N2. All M58 point mutant proteins were expressed in E. coli BL21 (DE3) cells harboring the respective plasmids and purified as described for wild-type M58.

M58-C4S, γCAL-2S, γCAL-1S and CCTRIM35 were expressed in E. coli BL21 (DE3) cells harboring pHUE-SeM58-C4S(C261S/C279S/C377S/C395S), pHUE-SeγCAL-2S(1–429), pHUE-SeγCAL-1S(1–313) or pHUE-CCTRIM35, respectively. These proteins were expressed and purified as described for M58.

Reduced M58 proteins were generated by the addition of 5 mM DTT to purified proteins before use. To generate oxidized M58 proteins, purified proteins were incubated before use on ice with 2 mM H2O2 for 30 min, followed by buffer exchange on a PD MiniTrap G-10 column (GE) to buffer B to remove unreacted H2O2.

γCAL(1–198) and γCAL(1–181)

γCAL(1–198) and γCAL(1–181) were expressed in E. coli BL21 (DE3) cells harboring pHUE-SeγCAL(1–198) and pHUE-SeγCAL(1–181), respectively, essentially as described for M58 except that buffer A contained 150 mM NaCl and buffer B 150 mM KCl. After size-exclusion chromatography (HiLoad 16/60 Superdex 200, GE), purified proteins were concentrated by ultrafiltration using Vivaspin MWCO 10000 (GE), aliquoted and flash-frozen in liquid N2.

CcaA and CcaAΔC2

CcaA, CcaAΔC2, CcaA(W257A) and CcaA(R265D) were expressed in E. coli BL21 (DE3) cells from the plasmids pHUE-SeCcaA, pHUE-SeCcaAΔC2, pHUE-SeCcaA(W257A) or pHUE-SeCcaA(R265D), respectively, and purified as described for γCAL. Proteins were concentrated by ultrafiltration using Vivaspin MWCO 30000 (GE), aliquoted and flash-frozen in liquid N2.

EGFP, EGFPC215 and EGFPC217

E. coli BL21 (DE3) cells harboring pHUE-EGFP, pHUE-EGFPC215, pHUE-EGFPC217, pHUE-EGFPC217(W257A) or pHUE-EGFPC217(R265D) were used to express EGFP, EGFPC215, EGFPC217, EGFPC217(W257A) and EGFPC217(R265D), respectively, which were purified as described for γCAL. After size-exclusion chromatography (HiLoad 16/60 Superdex 75, GE), the purified proteins were concentrated by ultrafiltration using Vivaspin MWCO 10000 (GE), aliquoted and flash-frozen in liquid N2.

γCAL and C217 peptide for crystallography

For crystallization trials, purified γCAL198 and γCAL181 were buffer exchanged to buffer C (20 mM Tris-HCl pH 8.0/150 mM NaCl) using a PD MiniTrap G-10 column (GE). The C217 peptide (residues 256–272) of SeCcaA was produced by expressing the vector pHUE-C217 containing H6Ub-tagged C217, with purification by Ni-NTA column as described for γCAL. The eluate from the Ni-NTA column was concentrated and applied onto a size-exclusion column (HiLoad 16/60 Superdex 75, GE) equilibrated in buffer C. Protein-containing fractions were collected and cleaved by H6-Usp2 overnight at 4 °C. The cleaved protein was applied to a Ni-NTA column for removal of H6-Usp2, the cleaved H6Ub moiety and any uncleaved protein. C217 peptide in the flowthrough was mixed with γCAL(1–198) or γCAL(1–181) at a molar ratio of 6:1, and incubated for 1 h at 4 °C to generate the complex. The respective complexes were purified by size-exclusion chromatography (HiLoad 16/60 Superdex 75 equilibrated in buffer C). Protein-containing fractions were concentrated by ultrafiltration using Vivaspin MWCO 3000 (GE), aliquoted and flash-frozen in liquid N2. The presence of the C217 peptide in the complex was confirmed by MS.

Turbidity assay

Measurements were performed at 25 °C in buffer (50 mM Tris-HCl pH 8.0, 10 mM Mg(OAc)2) containing different concentrations of KCl and in the presence or absence of 5 mM DTT, as indicated in figure legends. Reactions (100 μl) containing proteins as stated in figure legends were rapidly mixed by vortexing, and absorbance at 340 nm was monitored over time on a Jasco V-560 spectrophotometer set to 25 °C. Generally, proteins from two independent purification batches were analyzed repeatedly. Data were plotted using Origin 2020.
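The text does not state how the apparent affinities were extracted from the turbidity readings. As a minimal sketch only, assuming a simple hyperbolic saturation model (an assumption, not the authors' stated procedure), the endpoint absorbance as a function of titrant concentration could be fitted as follows. The function name `fit_hyperbola`, the grid of candidate values and any input data are hypothetical.

```python
import numpy as np


def fit_hyperbola(conc, signal, kd_grid):
    """Least-squares fit of signal = a_max * conc / (kd + conc).

    kd is scanned over `kd_grid`; for each candidate kd the optimal a_max
    has a closed form because the model is linear in a_max.
    Returns (kd, a_max) of the best candidate.
    """
    conc = np.asarray(conc, dtype=float)
    signal = np.asarray(signal, dtype=float)
    best = None
    for kd in kd_grid:
        basis = conc / (kd + conc)
        a_max = (basis @ signal) / (basis @ basis)
        sse = float(np.sum((signal - a_max * basis) ** 2))
        if best is None or sse < best[2]:
            best = (float(kd), float(a_max), sse)
    return best[0], best[1]
```

Turbidity is not strictly proportional to the bound fraction, so a value obtained this way is best read as an apparent constant for comparing conditions rather than a thermodynamic dissociation constant.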

Rubisco activity assay

Rubisco activity assays were performed essentially as described previously54,61. Reactions (50 μl) were performed at 25 °C in buffer (50 mM Tris-HCl pH 8.0, 100 mM KCl) containing Rubisco (RbcL8S8, 0.5 μM) and M35 (2 μM) or M58 (0.25 μM), or Rubisco and M35 (2 μM)/M58 (0.25 μM)/CcaA (0.25 μM), and were incubated for 10 min in the presence or absence of 5 mM DTT to allow condensate formation. Rubisco-active sites were activated by the addition of 20 μl of premix (50 mM Tris-HCl pH 8.0, 100 mM KCl, 150 mM MgCl2, 250 mM NaHCO3, 4.5 mM NaH14CO3 (specific activity 56.6 mCi mmol–1)) and reactions incubated for a further 10 min. The carboxylation reaction was initiated by the addition of 30 μl of ribulose-1,5-bisphosphate (10 mM) and stopped by the addition of 20 μl of formic acid (18 M) after 5 min. The amount of carbon fixed was quantified using a HITACHI AccuFLEX LSC-8000 scintillation counter. The activity of RbcL8S8 is set to 100% and that of RbcL8 was measured as control. Proteins from the same purification batches were analyzed repeatedly. Data were plotted using Origin 2020.
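The stepwise additions in the activity assay dilute each component, and the following short calculation sketches the resulting concentrations during carboxylation. It assumes that volumes are additive and that the concentrations quoted in the text refer to the 50-μl reaction, the 20-μl premix and the 30-μl ribulose-1,5-bisphosphate stock, respectively. The helper name `diluted` is illustrative.

```python
# Volumes (µl) taken from the assay description; additivity is assumed.
REACTION_UL = 50.0
PREMIX_UL = 20.0
RUBP_UL = 30.0


def diluted(stock_conc, added_ul, total_ul):
    """Concentration after `added_ul` of a stock is brought to `total_ul`."""
    return stock_conc * added_ul / total_ul


# Total volume once the carboxylation reaction has been started.
final_ul = REACTION_UL + PREMIX_UL + RUBP_UL

final = {
    "Rubisco_uM": diluted(0.5, REACTION_UL, final_ul),
    "MgCl2_mM": diluted(150.0, PREMIX_UL, final_ul),
    "NaHCO3_mM": diluted(250.0, PREMIX_UL, final_ul),
    "NaH14CO3_mM": diluted(4.5, PREMIX_UL, final_ul),
    "RuBP_mM": diluted(10.0, RUBP_UL, final_ul),
}
```

Under these assumptions the carboxylation step runs in 100 μl; the formic acid added to stop the reaction is not included in this volume.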

Condensate formation analysis by fluorescence microscopy

Proteins to be analyzed for phase separation by microscopy were fluorescently labeled at the N terminus. Rubisco holoenzyme was labeled with Alexa Fluor 647 NHS ester (RbcLSAF6, ThermoFisher) according to the manufacturer’s instructions (~4.6 dye molecules bound per Rubisco holoenzyme). M58ox, CCTRIM35ox and γCAL(1–198) were labeled with the fluorophore Alexa Fluor 532 NHS ester (ThermoFisher) (M58AF5; CCTRIM35AF5; γCALAF5: ~2.0, ~2.1 and ~1.4 dye molecules bound per M58ox trimer, CCTRIM35ox trimer and γCAL(1–198) trimer, respectively), while M35ox, CcaA and CcaAΔC2 were labeled with the fluorophore Alexa Fluor 405 NHS ester (ThermoFisher) (M35AF4; CcaAAF4; CcaAΔC2AF4: ~0.8, ~1.2 and ~3.0 dye molecules bound per M35ox, CcaA tetramer and CcaAΔC2 tetramer, respectively). Labeled protein was mixed with unlabeled protein at a ratio of 1:10. Reactions (20 μl) were assembled in 50 mM Tris pH 8.0/10 mM Mg(OAc)2 with 50 or 100 mM KCl, in the presence or absence of 5 mM DTT, with proteins at the concentrations stated in figure legends. After incubation for 5 min at 25 °C, reactions were transferred to an uncoated chambered coverslip (μ-Slide angiogenesis, Ibidi) followed by incubation for a further 5 min before analysis. For the analysis of droplet fusion, reactions (20 μl) contained 10% labeled M58red/AF5 or M35red/AF5, with unlabeled Rubisco and other proteins as indicated in the figure legends. After preparing each reaction in a low-binding microcentrifuge tube with protein concentrations as stated in the figure legends, the reaction was transferred to an uncoated chambered coverslip (μ-Slide angiogenesis; Ibidi) without further incubation and videos were recorded in a single focal plane at 5-s time intervals for 20 min. Samples were illuminated with a Lumencor SPECTRA X Light Engine at 398, 558 and 640 nm for fluorescence imaging. Images were recorded by focusing on the bottom of the plate using a Leica Thunder Widefield 2 microscope with a Leica DFC9000 GTC camera and a HC PL APO ×63/1.47 numerical aperture oil objective, using Leica Application Suite X software. Generally, proteins from two independent purification batches were analyzed repeatedly.

The software Fiji62 was used for analysis of size distribution of droplets. In brief, after preprocessing of images with background subtraction and Gaussian blur, the MaxEntropy method was applied to determine the threshold of segmentation.
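For readers unfamiliar with the MaxEntropy option, the following is a minimal NumPy sketch of the Kapur–Sahoo–Wong maximum-entropy criterion on which that Fiji method is based. It is not the Fiji code itself: the function name, the number of histogram bins and the convention of returning the upper edge of the selected bin are choices made here for illustration.

```python
import numpy as np


def max_entropy_threshold(image, bins=256):
    """Maximum-entropy threshold of an intensity image.

    The histogram is split at every bin; the split that maximizes the summed
    Shannon entropies of the normalized background and foreground
    distributions is selected. Returns the upper edge of that bin, so pixels
    with values above the result count as foreground.
    """
    hist, edges = np.histogram(image, bins=bins)
    p = hist.astype(float) / hist.sum()
    cdf = np.cumsum(p)

    best_t, best_h = 0, -np.inf
    for t in range(bins - 1):
        w_bg, w_fg = cdf[t], 1.0 - cdf[t]
        if w_bg <= 0.0 or w_fg <= 0.0:
            continue  # one class would be empty
        bg = p[: t + 1] / w_bg
        fg = p[t + 1:] / w_fg
        bg, fg = bg[bg > 0], fg[fg > 0]  # 0 * log(0) is treated as 0
        h = -np.sum(bg * np.log(bg)) - np.sum(fg * np.log(fg))
        if h > best_h:
            best_h, best_t = h, t
    return edges[best_t + 1]
```

Applied to a background-subtracted and blurred image, `image > max_entropy_threshold(image)` would give a binary droplet mask from which per-droplet size measurements can be taken.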

Fluorescence recovery following photobleaching

FRAP experiments were carried out with a Leica TCS SP8 AOBS confocal laser scanning microscope (HCX PL APO 63×/1.2 water objective, PMT detector). Rubisco holoenzyme was labeled with Alexa Fluor 532 NHS ester (RubiscoAF5, ThermoFisher) (~3.6 dye molecules bound per Rubisco holoenzyme), and other proteins were labeled as described above. Reactions (20 μl) were assembled in buffer D (50 mM Tris pH 8.0/100 mM KCl/10 mM Mg(OAc)2), in the presence or absence of 5 mM DTT, with proteins at the concentrations stated in figure legends. After incubation for 5 min at 25 °C, reactions were transferred to an uncoated chambered coverslip (μ-Slide angiogenesis, Ibidi) followed by incubation for a further 15 min before analysis. Images were recorded in a single focal plane at 5-s time intervals before and for 10 min after photobleaching. Bleaching was performed with a bleach point model using either a 405-nm diode laser at 2% intensity or a 532-nm argon laser at 100% intensity in one repeat, with a dwell time of 100 ms. The software Fiji was used for image analysis62. Proteins from the same purification batches were analyzed repeatedly.
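The figure legends state that prebleach fluorescence is set to 1. A minimal sketch of that normalization, together with a simple estimate of the recovered (mobile) fraction, is given below. It assumes a single intensity trace per bleached region and applies no background or acquisition-bleaching correction; the function names and the plateau window are illustrative and not taken from the text.

```python
import numpy as np


def normalize_frap(intensity, n_prebleach):
    """Scale a FRAP trace so that the mean prebleach intensity equals 1."""
    intensity = np.asarray(intensity, dtype=float)
    return intensity / intensity[:n_prebleach].mean()


def mobile_fraction(normalized, n_prebleach, n_plateau=5):
    """Fraction of the bleached signal recovered by the end of the trace.

    f0 is the first postbleach frame; the plateau is estimated as the mean of
    the last `n_plateau` frames.
    """
    f0 = normalized[n_prebleach]
    f_end = normalized[-n_plateau:].mean()
    return (f_end - f0) / (1.0 - f0)
```

A value near 0 corresponds to the immobile behavior described for M58 and Rubisco, whereas larger values indicate exchange with the surrounding phase.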

Size-exclusion chromatography coupled to multi-angle light scattering (SEC–MALS)

Purified proteins (2 mg ml–1) were analyzed using static and dynamic light scattering by autoinjection of the sample onto a SEC column (5 μm, 4.6 × 300 mm column; Wyatt Technology, no. WTC-030N5) at a flow rate of 0.20 or 0.25 ml min–1 in buffer (50 mM Tris-HCl pH 8.0/150 mM KCl or 50 mM Tris-HCl pH 8.0/500 mM KCl/5 mM DTT) at 25 °C. The column was in line with the following detectors: a variable ultraviolet absorbance detector set at 280 nm (Agilent 1100 series), a DAWN EOS MALS detector (Wyatt Technology, 690-nm laser) and an Optilab rEX refractive index detector (Wyatt Technology, 690-nm laser)63. Molecular masses were calculated using ASTRA 5 software (Wyatt Technology) with the dn/dc value set to 0.185 ml g–1. Bovine serum albumin (ThermoFisher) was used as the calibration standard. The graphs shown in Extended Data Fig. 1b,c were generated using SigmaPlot 14.

ITC

ITC measurements were carried out on an ITC200 calorimeter (Microcal, GE) at 20 °C. After dialysis into buffer D, EGFPC217 (365 μM) was loaded into the syringe and titrated into the sample cell containing γCAL(1–198) (14 μM). The reference cell contained buffer D. For each titration point, 10 μl of EGFPC217 was injected at time intervals of 3 min. Titration data were analyzed using the software Origin 2020 and fitted with a one-site binding model. Proteins from the same purification batches were analyzed twice.
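The equilibrium underlying a one-site model can be sketched with the standard quadratic solution for 1:1 binding. The snippet below is illustrative only: it is not the fitting routine used in Origin, the function name is hypothetical, and any dissociation constant supplied is an input chosen by the user rather than a value reported here.

```python
import math


def bound_complex(site_total, ligand_total, kd):
    """Equilibrium concentration of a 1:1 complex.

    All arguments share one concentration unit (for example µM). Solves
    kd = (site_total - c) * (ligand_total - c) / c for the physical root c.
    """
    s = site_total + ligand_total + kd
    return (s - math.sqrt(s * s - 4.0 * site_total * ligand_total)) / 2.0
```

In a one-site analysis, the heat released per injection is taken to be proportional to the increase in complex between successive injections; the proportionality constant is the binding enthalpy.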

Crystallization and data collection

Crystals of γCAL(1–181) and γCAL(1–181)-C217 were grown by the sitting-drop vapor diffusion method at 20 °C. Drops containing 0.6 μl of a 1:1 mixture of 10 mg ml–1 protein in buffer C and precipitant (0.1 M HEPES pH 7.5/25% PEG-3350) were equilibrated against 100 μl of precipitant. For cryomounting, crystals were transferred into cryo-buffer (0.1 M HEPES pH 7.5/25% PEG-3350/10% glycerol) and subsequently cryocooled by dipping into liquid N2.

Crystallographic data collection, structure solution and refinement

Diffraction data for γCAL(1–181) crystals were collected at beamline ID23-2 of the European Synchrotron Radiation Facility (ESRF) in Grenoble, France, using MXCuBE3 and a wavelength of 0.87313 Å. Data for γCAL(1–181)-C217 crystals were collected at beamline X06SA of the Swiss Synchrotron Light Source (SLS) in Villigen, Switzerland, using the SSX suite and a wavelength of 0.99989 Å. Crystals were maintained at 100 K, and data were processed with autoPROC (Global Phasing)64 using XDS65, POINTLESS66 and AIMLESS67.

The structure of γCAL(1–181) was solved to 1.67-Å resolution by molecular replacement using the program MOLREP68 with the γCAL(1–209) domain of T. elongatus BP-1 (PDB: 3KWC) as a search template. The asymmetric unit contained one γCAL(1–181) protomer with residues 1–15 disordered. The model was edited manually using Coot69 as implemented in the CCP4i graphical user interface70. REFMAC5 was used for model refinement71. The model of γCAL(1–181) contains 127 ordered water molecules, a presumably ordered Cl ion and a Ni2+ atom from the Ni-NTA metal affinity column. The bound Ni2+ atom was identified by X-ray fluorescence scanning.

The structure of γCAL(1–181)-C217 was solved to 1.63-Å resolution by molecular replacement using the γCAL(1–181) model and refined as described above. The asymmetric unit contained one γCAL(1–181) protomer with residues 1–15 disordered, one bound C217 peptide with residues 256–270 resolved (Extended Data Fig. 3g), one presumably ordered Cl ion, one Ni2+ ion and 103 ordered water molecules.

Structure analysis

The quality of the structural models was analyzed with the program Molprobity72. The final models of γCAL(1–181) and γCAL(1–181)-C217 exhibited reasonable stereochemistry with 98.2 and 98.3% of residues, respectively, in the favored regions of the Ramachandran plot and no residues in outlier regions. Coordinates were aligned with Lsqkab and Lsqman73. Molecular interfaces were analyzed with PISA48 and figures were created with PyMol (http://www.pymol.org/).

Cryoelectron microscopy and reconstruction

Sample preparation and data collection

All cryogrids were prepared with a Vitrobot Mark 4 (FEI). A sample volume of 3 μl was applied to a glow-discharged grid (Quantifoil R2/1 300-mesh) at 25 °C and 90% humidity, then semiautomatically blotted and plunge-frozen into liquid ethane.

For the analysis of M58 head-to-head complexes, 6 μM M58ox was incubated in buffer (50 mM Tris-HCl pH 8.0, 10 mM Mg(OAc)2, 50 mM KCl) at 25 °C for 10 min and cryogrids were prepared as described above. Eight cryogrids were initially screened on a Glacios transmission electron microscope (ThermoFisher) equipped with a K2 Summit direct electron detector (Gatan), operated at 200 kV. Selected grids were transferred to a Titan Krios 300-kV TEM (FEI) equipped with GIF Quantum Energy Filters (Gatan) and a K3 direct detector (Gatan). A total of 1,836 videos were automatically collected by SerialEM74 using a pixel size of 0.4114 Å. The total exposure time of 1.2 s was divided into 20 frames with an accumulated dose of 60 electrons Å–2 and a defocus range of –0.65 to –2.15 μm.

The complexes M58red–Rubisco and M58red–RbcL8 were prepared by mixing Rubisco (6 μM) and M58red (8 μM), or RbcL8 (6.25 μM) and M58red (16.7 μM), respectively, in buffer (50 mM Tris-HCl pH 8.0, 10 mM Mg(OAc)2, 50 mM KCl, 5 mM DTT) for 10 min at 25 °C, and cryogrids were prepared as described above. The cryogrids were screened on a Glacios transmission electron microscope (ThermoFisher) equipped with a K2 Summit direct electron detector (Gatan), operated at 200 kV. The selected grid was kept on the stage and used directly for data collection with the K2 Summit detector. Exposure times of 12 s were divided into 40 frames with an accumulated dose of 45 electrons Å–2. In total, 976 and 1,027 videos were automatically collected for M58red–Rubisco and M58red–RbcL8, respectively, by SerialEM74 with a pixel size of 1.885 Å and a defocus range of –0.7 to –4.5 μm.
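
The exposure settings of the two data collections above can be turned into a dose per frame and a dose rate with simple arithmetic. The sketch below does this for the Titan Krios/K3 and Glacios/K2 sessions; the class and variable names are illustrative, and the numbers are taken directly from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exposure:
    total_dose: float   # accumulated dose, electrons per square angstrom
    exposure_s: float   # total exposure time, seconds
    n_frames: int       # number of frames per video

    @property
    def dose_per_frame(self) -> float:
        """Electrons per square angstrom delivered in each frame."""
        return self.total_dose / self.n_frames

    @property
    def dose_rate(self) -> float:
        """Electrons per square angstrom per second."""
        return self.total_dose / self.exposure_s


# Values as quoted for the two sessions
krios_k3 = Exposure(total_dose=60.0, exposure_s=1.2, n_frames=20)
glacios_k2 = Exposure(total_dose=45.0, exposure_s=12.0, n_frames=40)

for label, exp in (("Krios/K3", krios_k3), ("Glacios/K2", glacios_k2)):
    print(f"{label}: {exp.dose_per_frame:.3f} e/A^2 per frame, "
          f"{exp.dose_rate:.2f} e/A^2 per second")
```

The Krios session delivers 3 electrons Å–2 per frame, whereas the Glacios session delivers 1.125 electrons Å–2 per frame over a ten times longer exposure.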

Image processing

M58

On-the-fly processing during data collection was performed with MotionCor2 (ref. 75) and CTFFIND-4.1 (ref. 76), as implemented in Focus software77. Only micrographs with good signal quality and with an estimated maximum resolution better than 5 Å were kept for further data processing with RELION 3.1 (ref. 78) (Extended Data Fig. 6). A total of 349,391 particles were autopicked by Gautomatch (https://www2.mrc-lmb.cam.ac.uk/research/locally-developed-software/zhang-software/) and extracted at a pixel size of 1.65 Å (fourfold binned). The first round of 2D classification was performed to exclude ice contaminants and classes with no structural features. Because the side-view classes suggested a head-to-head stack of two γCAL domains, the crystallographic model of the γCAL dimer-of-trimers (PDB: 7O4Z) was converted to an EM density map in mrc format, low-pass filtered to 15 Å and used as a reference for 3D classification and refinement. The selected particles were subjected to one round of refinement, and new particles were extracted with refined, recentered coordinates at a pixel size of 0.82 Å. Three-dimensional classification resulted in one major class containing 128,330 particles (Extended Data Fig. 6c), which were subjected to contrast transfer function (CTF) refinement and polishing to generate the final map at 3.57-Å resolution, determined by gold-standard Fourier shell correlation (FSC) with a cutoff at 0.143 (Extended Data Fig. 6d). The particle distribution plot suggested a lack of side-view information (Extended Data Fig. 6b) and, as a result, the side view of the reconstruction was stretched. Based on the information from the well-resolved end view (Extended Data Fig. 6e), we docked the crystallographic model of γCAL trimers into the EM density map using Chimera79. The EM density map was deposited with EMDB under the accession code EMD-12730.
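
To make the FSC criterion mentioned above concrete, the following is a minimal NumPy sketch of a Fourier shell correlation between two half-maps and of reading off the resolution at the 0.143 threshold. It is not the RELION implementation: the half-maps are synthetic (a Gaussian blob plus independent noise), no mask or interpolation is applied, and the function names, box size and noise level are assumptions chosen for illustration only.

```python
import numpy as np


def fsc_curve(half1, half2):
    """Fourier shell correlation of two cubic maps; entry k is shell k."""
    if half1.shape != half2.shape or len(set(half1.shape)) != 1:
        raise ValueError("expected two cubic maps of identical shape")
    n = half1.shape[0]
    f1 = np.fft.fftshift(np.fft.fftn(half1))
    f2 = np.fft.fftshift(np.fft.fftn(half2))

    # Shell index of each Fourier voxel = rounded distance from the origin.
    grid = np.indices(half1.shape) - n // 2
    shell = np.rint(np.sqrt((grid ** 2).sum(axis=0))).astype(int).ravel()
    n_shells = n // 2
    keep = shell < n_shells

    def shell_sum(values):
        return np.bincount(shell[keep], weights=values.ravel()[keep],
                           minlength=n_shells)

    cross = shell_sum((f1 * np.conj(f2)).real)
    power = shell_sum(np.abs(f1) ** 2) * shell_sum(np.abs(f2) ** 2)
    return cross / np.sqrt(power)


def resolution_at_threshold(fsc, pixel_size_A, box, threshold=0.143):
    """Resolution (angstrom) of the first shell whose FSC is below threshold."""
    below = np.nonzero(fsc < threshold)[0]
    if below.size == 0:
        return 2.0 * pixel_size_A  # curve never drops: limited by Nyquist
    k = max(int(below[0]), 1)
    return box * pixel_size_A / k


# Synthetic example: same smooth signal in both halves, independent noise.
rng = np.random.default_rng(0)
box = 32
z, y, x = np.indices((box, box, box)) - box // 2
blob = np.exp(-(x ** 2 + y ** 2 + z ** 2) / (2 * 3.0 ** 2))
half_a = blob + 0.05 * rng.normal(size=blob.shape)
half_b = blob + 0.05 * rng.normal(size=blob.shape)

curve = fsc_curve(half_a, half_b)
print(resolution_at_threshold(curve, pixel_size_A=0.82, box=box))
```

In this toy case the correlation is close to 1 in the low-frequency shells, where the shared signal dominates, and falls towards 0 where only the independent noise remains. The reported resolution is the spacing of the first shell below the threshold.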

M58red–Rubisco

The raw videos of the dataset for the M58red–Rubisco complex were first processed with MotionCor2 (ref. 75) with dose-weighting. The CTF parameters for each micrograph were estimated with CTFFIND-4.1 (ref. 76). A total of 620,012 particles were picked by Gautomatch (https://www2.mrc-lmb.cam.ac.uk/research/locally-developed-software/zhang-software/) (Extended Data Fig. 7e–h). Particles were first extracted at a pixel size of 7.54 Å (fourfold binned). One round of 2D classification resulted in 507,604 clean particles, with ice contaminants and classes with no structural features excluded (Extended Data Fig. 7f,g). These particles were refined with a low-resolution reference converted from the crystal structure coordinates of the Rubisco holoenzyme (PDB: 1RBL), and extracted at a pixel size of 3.77 Å. A single round of 3D classification with D4 symmetry resulted in four classes with no major differences. Thus, particles from all four classes were subjected to further analysis. These particles were again extracted with full resolution at a pixel size of 1.885 Å. We next followed the previously published symmetry-expansion procedure16: particles were first aligned with D4 symmetry to account for multiple SSUL modules bound per Rubisco. Each asymmetric unit was then processed as an individual particle, using the symmetry-expansion command relion_particle_symmetry_expand, and particle subtraction was performed with a mask covering two RbcL, two RbcS and two SSUL. A focused classification with a SSUL mask resulted in one class of particles with detailed SSUL features. A total of 698,820 particles from this class were selected and subjected to final local refinement, and postprocessing improved the resolution of the EM density map to ~4 Å, as determined by the gold-standard FSC curve at a 0.143 cutoff (Extended Data Fig. 7g,h). Two EM density maps were deposited with EMDB under the accession code EMD-12731, one without sharpening applied and the other sharpened with DeepEMhancer (https://doi.org/10.1101/2020.06.12.148296).
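
The symmetry-expansion step can be pictured as replicating every particle once per operator of the point group. The sketch below builds the eight rotations of D4 with NumPy and counts the resulting subparticles. The axis convention (fourfold along z, twofold along x) and the multiplication order in `expand` are assumptions made for illustration and may differ from what relion_particle_symmetry_expand does internally. The count also assumes that all 507,604 particles from 2D classification entered the expansion.

```python
import numpy as np


def rot_z(deg):
    """Rotation matrix about the z axis by the given angle in degrees."""
    t = np.radians(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Twofold rotation about x (assumed orientation of the dihedral axis).
ROT_X_180 = np.diag([1.0, -1.0, -1.0])


def d4_operators():
    """Eight rotations of D4: four about z, plus each combined with the twofold."""
    fourfold = [rot_z(90 * k) for k in range(4)]
    return fourfold + [ROT_X_180 @ r for r in fourfold]


def expand(orientations, operators):
    """One copy of every particle orientation per symmetry operator."""
    return [r @ op for r in orientations for op in operators]


ops = d4_operators()
n_particles = 507_604            # clean particles after 2D classification
n_expanded = n_particles * len(ops)
print(len(ops), n_expanded)
```

Under these assumptions the expansion would yield 4,060,832 subparticles, of which the 698,820 retained after focused classification would correspond to roughly 17%.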

M58red–RbcL8

The raw videos of the dataset for the M58red–RbcL8 complex were first processed with MotionCor2 (ref. 75) with dose-weighting. The CTF parameters for each micrograph were estimated with CTFFIND-4.1 (ref. 76). In total, 258,285 particles were picked by Gautomatch (http://www.mrc-lmb.cam.ac.uk/kzhang/Gautomatch). Particles were first extracted at a pixel size of 7.54 Å (fourfold binned). One round of 2D classification excluded ice contaminants and classes with no structural features, resulting in 136,505 clean particles (Extended Data Fig. 8d–f). These particles were refined with a low-resolution reference converted from the RbcL8 crystal structure coordinates (PDB: 1RBL with RbcS subunits deleted), and extracted at a pixel size of 3.77 Å. A single round of 3D classification identified a major class with detailed RbcL8 features (92,899 particles) (Extended Data Fig. 8f). These particles were again extracted with full resolution at a pixel size of 1.885 Å. We next followed the same symmetry-expansion procedure used for image processing of M58red–Rubisco (see above). Focused classification with a SSUL mask resulted in a single class (193,877 particles) with detailed SSUL features (Extended Data Fig. 8f). This class was subjected to final local refinement yielding a map at ~8-Å resolution without postprocessing (Extended Data Fig. 8f). The resolution was determined by a gold-standard FSC curve at 0.143 cutoff (Extended Data Fig. 8g). To exclude the bias due to focused classification on SSUL, we performed another round of focused classification with a mask covering one RbcS subunit. Classification resulted in one dominant class containing no EM density in the region where RbcS is bound. The previously published model of RbcL2-RbcS2-SSUL (PDB: 6HBC)16 was fitted into the experimental EM density map using Chimera79. Two maps were deposited with EMDB under accession code EMD-12732, one without sharpening applied and the other sharpened with DeepEMhancer (https://doi.org/10.1101/2020.06.12.148296).
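
The pixel sizes quoted for the successive extraction steps of the two Glacios datasets follow from binning the 1.885 Å detector pixel, and each binned pixel size sets a Nyquist limit of twice its value. The short sketch below tabulates both. The helper names are illustrative, and the binning factors of 4, 2 and 1 are inferred from the quoted pixel sizes rather than stated individually in the text.

```python
def binned_pixel_size(raw_pixel_A: float, binning: int) -> float:
    """Pixel size in angstrom after binning by an integer factor."""
    return raw_pixel_A * binning


def nyquist_limit_A(pixel_A: float) -> float:
    """Finest resolvable spacing in angstrom: two pixels per period."""
    return 2.0 * pixel_A


RAW_PIXEL_A = 1.885  # unbinned pixel size of the Glacios/K2 datasets

for binning in (4, 2, 1):
    pixel = binned_pixel_size(RAW_PIXEL_A, binning)
    print(f"bin {binning}: pixel {pixel:.3f} A, Nyquist {nyquist_limit_A(pixel):.2f} A")
```

At full resolution the Nyquist limit is 3.77 Å, so the ~4 Å M58red–Rubisco map sits close to the sampling limit of these data, whereas the ~8 Å M58red–RbcL8 map is well within it.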

Sequence alignment

Conservation of protein sequences was analyzed using the ConSurf web server80. Sequences homologous to SeSSUL (219–311) or SeCcmM (1–539) were searched against the UniProt database using HMMER81 as the homolog search algorithm, with three iterations and an E-value cutoff of 0.0001. Multiple sequence alignments containing 500 SSUL or 150 γCAL homologous sequences were built using MAFFT82 and submitted to the WebLogo server83 to create the sequence logos.
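
For readers unfamiliar with sequence logos, the height of each column reflects its information content: the maximum entropy of the alphabet minus the observed Shannon entropy of the column. The sketch below computes this for a toy protein alignment. The four sequences are hypothetical, gaps are simply ignored, and the small-sample correction that logo software typically applies is omitted, so the values are illustrative and not a reproduction of the WebLogo output.

```python
import math
from collections import Counter


def column_information_bits(column: str, alphabet_size: int = 20) -> float:
    """Information content of one alignment column in bits (gaps ignored)."""
    residues = [c for c in column if c != "-"]
    total = len(residues)
    if total == 0:
        return 0.0  # an all-gap column carries no information
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def logo_heights(alignment):
    """Total stack height for every column of an equal-length alignment."""
    length = len(alignment[0])
    columns = ["".join(seq[i] for seq in alignment) for i in range(length)]
    return [column_information_bits(col) for col in columns]


# Hypothetical toy alignment, not taken from the SSUL or gammaCAL data.
toy_alignment = ["MKAL", "MKGL", "MRAL", "MRGV"]

for position, bits in enumerate(logo_heights(toy_alignment), start=1):
    print(f"column {position}: {bits:.2f} bits")
```

A fully conserved column reaches the maximum of log2(20) bits, while a column split evenly between two residues loses exactly one bit.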

Statistics

All relevant biochemical experiments were replicated two or three times. No statistical methods were used to predetermine sample size, but our sample sizes are similar to those reported in previous publications16,19. For cryo-EM, data were screened on eight independently prepared samples.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.