Main

The irreplaceable roles of rare-earth (RE) elements in ubiquitous modern technologies ranging from permanent magnets to light-emitting diodes and phosphors have renewed interest in one of the grand challenges of separation science—efficient separation of lanthanides1. The separation of these 15 elements is complicated by the similar physicochemical properties of their predominating +III ions, with ionic radii decreasing only 0.19 Å between LaIII and LuIII (ref. 7), which also leads to these metals co-occurring in RE-bearing minerals. Conventional hydrometallurgical liquid–liquid extraction methods for RE production utilize organic solvents such as kerosene and toxic phosphonate extractants and require dozens or even hundreds of stages to achieve high-purity individual RE oxides3,8. The inefficiency and large environmental impact of RE separations9 have stimulated research efforts into alternative ligands with larger separation factors between adjacent REs10,11,12,13,14, and greener process designs to achieve RE separation in fewer stages15 and using all-aqueous chemistry6,16,17,18,19,20.

The discovery of the founding member of the LanM family of lanthanide-binding proteins demonstrated that nature has evolved macromolecules surpassing the selectivity of synthetic f-element chelators4. The prototypal LanM, from M. extorquens AM1 (Mex-LanM), is a small (12-kDa), monomeric protein that undergoes a selective conformational response to picomolar concentrations of lanthanides4,18 and actinides21,22,23,24, has facilitated understanding of lanthanide uptake in methylotrophs25, and has served as a technology platform for f-element detection26, recovery18,27 and separation6. Unusually among RE chelators, Mex-LanM favours the larger and more abundant light REs (LREs), especially LaIII–SmIII, over heavy REs (HREs)4. Our recent demonstration that even single substitutions to the metal-binding motifs of Mex-LanM can improve actinide/lanthanide separations23 spurred us to investigate whether orthologues of Mex-LanM might possess distinct, and potentially useful, metal selectivity trends.

Herein, we report that the LanM from Hansschlegelia quercus (Hans-LanM), a methylotrophic bacterium isolated from English oak buds28, exhibits enhanced RE separation capacity relative to Mex-LanM. Whereas Mex-LanM is always monomeric, Hans-LanM exists in a monomer/dimer equilibrium, the position of which depends on the specific RE bound. Three X-ray crystal structures of LanMs and structure-guided mutagenesis explain Hans-LanM’s RE-dependent oligomeric state and its greater separation capacity than that of Mex-LanM. Finally, we leverage these findings to achieve single-stage Hans-LanM-based separation of the critical neodymium/dysprosium pair. These results illustrate how intermolecular interactions—common in proteins but rare in small molecules—may be exploited to improve RE separations.

Hans-LanM’s distinct selectivity profile

We have proposed4 several hallmarks of a LanM. First, LanMs possess four EF-hand motifs. EF hands comprise 12-residue, carboxylate-rich metal-binding loops flanked by α-helices, which traditionally respond to CaII binding;29 in Mex-LanM, however, EF hands 1–3 bind lanthanide(III) ions with low-picomolar affinity and 108-fold selectivity over CaII, resulting in a large, lanthanide-selective disorder-to-order conformational transition4. EF4 binds with only micromolar affinity. Second, adjacent EF hands in LanMs are separated by 12–13 residues—rather than the typical ≈25 residues in CaII-responsive EF-hand proteins—resulting in an unusual three-helix bundle architecture with the metal-binding sites on the periphery5. Third, at least one EF hand contains proline at the second position (in Mex-LanM, all four EF hands feature P2 residues). We searched sequence databases using the first two criteria and a sequence length of <200 residues, identifying 696 putative LanMs. These sequences were visualized using a sequence similarity network30 to identify LanM sequences that cluster separately from Mex-LanM. Notably, at a 65% identity threshold, a small cluster of sequences that is remote from the main cluster of 642 sequences is formed (Fig. 1a). This exclusive cluster (the Hans cluster), includes bacteria from several genera, including Hansschlegelia and Xanthobacter (Extended Data Fig. 1), all of which are facultative methylotrophs31.

Fig. 1: Hans-LanM diverges from Mex-LanM in sequence and RE versus RE selectivity.
figure 1

a, Sequence similarity network of core LanM sequences indicates that Hans-LanM forms a distinct cluster. The sequence similarity network includes 696 LanM sequences connected with 48,647 edges, thresholded at a BLAST E value of 1 × 10−5 and 65% sequence identity. The black box encloses nodes clustered with Hans-LanM. The LanM sequence associated with Mex (downtriangle) and four within Hansschlegelia (uptriangle) are enlarged compared to other nodes (circles). Colours of the nodes represent the family from which the sequences originate. b, Comparison of the sequences of the four EF hands of Mex- and Hans-LanMs. Residues canonically involved in metal binding in EF hands are in blue; Pro residues are in purple. c, Circular dichroism spectra from a representative titration of Hans-LanM with LaIII, showing the metal-associated conformational response increasing helicity; apoprotein is bold black, LaIII-saturated protein is bold red. d, Circular dichroism titration of Hans-LanM with LaIII, NdIII and DyIII (pH 5.0). Each point represents the mean ± s.d. from three independent experiments. e, Comparison of Kd,app values (pH 5.0) for Mex-LanM (black18) and Hans-LanM (red), plotted versus ionic radius7. Mean ± s.e.m. from three independent experiments.

Source Data

Hans-LanM features low (33%) sequence identity with Mex-LanM (Supplementary Fig. 1) and divergent EF-hand motifs, particularly at the first, second and ninth positions (Fig. 1b), which are important positions in Mex-LanM23,26 and other EF-hand proteins29. Therefore, Hans-LanM presented an opportunity to determine features essential for selective lanthanide recognition in LanMs.

Hans-LanM was expressed in Escherichia coli as a 110 amino acid protein (Supplementary Fig. 1). LaIII and NdIII were selected as representative LREs and DyIII was selected as a representative HRE for complexation studies. The protein binds about three equivalents of LaIII and NdIII, and slightly less DyIII, by inductively coupled plasma mass spectrometry (Supplementary Table 1), as does Mex-LanM4. Also like Mex-LanM4, Hans-LanM exhibits little helical content in the absence of metal, as judged by the circular dichroism signal at 222 nm (Fig. 1c). Unexpectedly, only two equivalents of LaIII or DyIII were sufficient to cause Hans-LanM’s complete conformational change (Supplementary Fig. 2), indicating that the third binding equivalent is weak and does not increase helicity.

The apparent dissociation constants (Kd,app) determined by circular dichroism spectroscopy4 reflect the RE versus RE, and RE versus non-RE, selectivities of Mex-LanM under competitive RE recovery conditions6,18. Therefore, similar determinations of Kd,app with free metal concentrations controlled by a competitive chelator4,32 were applied to Hans-LanM; the results (Fig. 1d and Supplementary Table 2) diverged from those for Mex-LanM. Binding of LaIII and NdIII to Hans-LanM increases molar ellipticity at 222 nm by 2.3-fold, the full conformational change evident in stoichiometric titrations. The conformational change is cooperative (Hill coefficients, n, of 2; Supplementary Table 2), and the Kd,app values are similar, 68 and 91 pM, respectively. By contrast, even though DyIII induces the same overall response as LaIII in stoichiometric titrations (Supplementary Fig. 2), in the chelator-buffered DyIII titrations Hans-LanM exhibits a lesser conformational response (1.8-fold increase). This difference indicates that at least one of the DyIII-binding sites is very weakly responsive (Kd,app > 0.3 µM, the highest concentration accessible in the chelator-buffered titrations). The main response to DyIII occurs at 2.6 nM, >30-fold higher than with the LREs, and with little or no cooperativity (n = 1.3). By contrast, Mex-LanM shows only a modest preference for LREs (about fivefold; Fig. 1e; ref. 4), and all lanthanides and YIII induce similar conformational changes and cooperativity18. Hans-LanM responds to calcium(II) weakly (Kd,app = 60 µM), with the same lack of cooperativity (n = 1.0) and partial conformational change evident with DyIII (Extended Data Fig. 2). Therefore, Hans-LanM discriminates more strongly between LREs and HREs than does Mex-LanM, with the HRE complexes exhibiting lower affinity, lesser cooperativity and a lesser primary conformational change.

LRE-selective dimerization

The distinct behaviours of the LRE– and HRE–Hans-LanM complexes suggested mechanism(s) of LRE versus HRE selectivity not present in Mex-LanM. As Mex-LanM is a monomer in complex with LREs and HREs alike4,5, we considered that LREs and HREs might induce different oligomeric states in Hans-LanM. In the presence of three equivalents of LaIII, Hans-LanM elutes from a size-exclusion chromatography (SEC) column not at the expected molecular weight (MW) of 11.9 kDa but instead at 27.8 kDa, suggestive of a dimer (Supplementary Figs. 3 and 4a). Starting gradually after NdIII but sharply at GdIII, the apparent MW decreases towards that expected for a monomer (Fig. 2a, Supplementary Fig. 4 and Supplementary Table 3). Notably, lanthanides heavier than GdIII do not seem to support growth of RE-utilizing bacteria33,34,35.

Fig. 2: A dimerization equilibrium sensitive to LRE versus HRE or non-RE coordination.
figure 2

a, Apparent molecular weight of Hans-LanM complexes with REs as determined by analytical SEC (red lines) or SEC–MALS (black dashed line). See Supplementary Table 1 for conditions. Each individual data point is the result of a single experiment. b, The LaIII-bound Hans-LanM dimer as determined by X-ray crystallography. LaIII ions are shown as green spheres and NaI ions are shown as grey spheres. c, Detailed view of the dimer interface near EF3 of chain A (blue cartoon). Arg100 from chain C (light blue cartoon) anchors a hydrogen-bonding network involving Asp93 of chain A and two EF3 LaIII ligands (Glu91 and Asp85). These interactions constitute the sole polar contacts at the dimer interface, providing a means to control the radius of the lanthanide-binding site at EF3. d, Schematic of the interactions at the dimer interface. Red dashed lines indicate hydrogen-bonding interactions and grey dashed lines indicate hydrophobic contacts. e, DENSS projections of electron density from small-angle X-ray scattering datasets for LaIII-bound (left) and DyIII-bound (right) Hans-LanM, overlaid with a PyMOL-generated ribbon diagram of the dimeric LaIIIHans-LanM crystal structure.

To provide further support for preferential dimerization in the presence of physiologically relevant LREs, RE complexes of Hans-LanM were analysed using multi-angle light scattering (MALS; Fig. 2a and Supplementary Fig. 5). The LaIII, NdIII and GdIII complexes have MWs of 22–25 kDa, indicative of dimers, but MWs decrease starting with TbIII and continue to DyIII and HoIII, at about 15 kDa (Extended Data Table 1), in agreement with the SEC data. CaII-bound Hans-LanM also indicated a MW of 14.7 kDa. The HRE–, CaII– and apo Hans-LanM complexes are still one-third larger than expected for a monomer, however, suggesting that these forms exist in a rapid equilibrium with ≈2:1 monomer/dimer ratio under these conditions. We next determined the Kd for dimerization (Kdimer) of apo, LaIII-bound and DyIII-bound Hans-LanM by isothermal titration calorimetry (Extended Data Table 2 and Supplementary Figs. 68). The apoprotein and DyIII-bound protein weakly dimerize, with Kdimer values of 117 µM and 60 µM, respectively, consistent with the ratios of monomer and dimer reflected in the SEC and MALS traces. In the presence of LaIII, however, the dimer was too tight to be able to observe monomerization by isothermal titration calorimetry, which indicates that Kdimer <0.4 µM (Supplementary Fig. 8). Thus, LaIII favours Hans-LanM’s dimerization by >100-fold over DyIII.

A 1.8-Å-resolution X-ray crystal structure of Hans-LanM in complex with LaIII confirms LRE-induced dimerization (Extended Data Fig. 3 and Supplementary Table 4). Two LanM monomers interact head-to-tail (Fig. 2b), burying about 600 Å2 of surface area through hydrophobic and polar contacts (Fig. 2c,d). These interactions occur largely between side chains contributed by the core helices α1 (between EF1 and EF2) and α2 (between EF3 and EF4; Supplementary Fig. 9). Residues at the dimer interface make direct contact with only one of the four metal-binding sites, EF3; three residues of EF3 in each monomer form a hydrogen-bonding network with Arg100 of the other monomer (Fig. 2c), suggesting that occupancy and coordination geometry at this site may control oligomeric state.

Hans-LanM and its complexes with three equivalents of LaIII, NdIII and DyIII were also analysed by small-angle X-ray scattering (Supplementary Figs. 10 and 11). The calculated solvent envelopes36 from the small-angle X-ray scattering data fit well to the crystallographic Hans-LanM dimer for LaIIIHans-LanM, adequately for NdIIIHans-LanM, but poorly for DyIIIHans-LanM (Fig. 2e and Supplementary Figs. 1214). The weaker dimerization of DyIIIHans-LanM is also supported by quantitative metrics, such as the Porod volume (Supplementary Figs. 15 and 16 and Supplementary Tables 5 and 6). Together, the biochemical and structural results indicate that Hans-LanM’s dimerization equilibrium depends strongly on the particular RE bound.

Structural basis for dimerization

The structure of LaIIIHans-LanM also provides one of the first detailed views of the coordination environments in a LanM, and indeed any natural biomolecule tasked with reversible lanthanide recognition. The previous NMR structure of Mex-LanM5 revealed the protein’s unusual fold, but it could not provide molecular-level details about the metal-binding sites. To understand the basis for LRE versus HRE discrimination, we also determined a 1.4-Å-resolution structure of DyIIIHans-LanM. Finally, we report a 1.01-Å-resolution structure of NdIIIMex-LanM, which rationalizes Mex-LanM’s shallower RE selectivity trend4.

In LaIIIHans-LanM, EF1–3 are occupied by LaIII ions (Extended Data Fig. 3b–e). EF4 is structurally distinct, does not exhibit anomalous difference density consistent with LaIII and was modelled with NaI (Supplementary Fig. 17a). Each LaIII-binding site is ten-coordinate, as observed in structures of lanthanide-dependent methanol dehydrogenases33,37 (Supplementary Fig. 18). A monodentate Asn (N1 position), four bidentate Asp or Glu residues (D3, D5, E9 and E12) and a backbone carbonyl (T7 or S7) complete the first coordination sphere in EF1–3 (Fig. 3a). Exogenous solvent ligands are not observed (Supplementary Fig. 17b); luminescence studies of EuIIIHans-LanM to determine the number of coordinated solvent molecules (q) yielded q = 0.11, consistent with the absence of solvent ligands in the X-ray structure (Supplementary Fig. 19).

Fig. 3: Hans-LanM uses an extended hydrogen-bonding network to control lanthanide selectivity.
figure 3

a, Zoomed-in views of EF2 (left) and EF3 (right) in LaIIIHans-LanM. LaIII ions are shown as green spheres. Coordination bonds and hydrogen bonds are shown as dashed lines. Residues contributed by chain A are shown in blue and those contributed by chain C (in the case of EF3) are shown in light blue. Inset: overlay of LaIIIHans-LanM (blue and light blue) with DyIIIHans-LanM (grey), showing the carboxylate shift of Glu91 from bidentate (La) to monodentate (Dy). Coordination and hydrogen bonds (dashed lines) are shown only for the Dy case. b, Representative metal-binding site (EF3) in NdIIIMex-LanM. NdIII ion is shown as an aqua sphere. Solvent molecules are shown as red spheres.

The lanthanide-binding sites in Hans-LanM additionally share extensive second-sphere interactions that may further constrain the positions of the ligands and the size of the metal-binding pore (Supplementary Fig. 20). This phenomenon is most obvious in EF3, at which the dimer interface mediates an extended hydrogen-bonding network involving several ligands. Arg100, contributed by the adjacent monomer, projects into the solvent-exposed side of EF3 to contact two carboxylate ligands, Asp85 (D3) and Glu91 (E9), enforcing their bidentate binding modes. Arg100 is also buttressed by Asp93 (EF3 D11), unique to EF3 within Hans-LanM and not observed in Mex-LanM. We tested the importance of this network in Hans-LanM dimerization by making the minimal substitution, R100K. Hans-LanM(R100K) had nearly identical Kd,app values and response to NdIII and DyIII as wild-type Hans-LanM, but the Kd,app for LaIII was twofold weaker (Supplementary Fig. 21 and Supplementary Table 7). SEC–MALS analysis indicated MWs of 10–13 kDa for apo, LaIII– and DyIIIHans-LanM(R100K) (Supplementary Fig. 22 and Supplementary Table 8), indicative of increased monomerization, especially for the LaIII complex, and suggesting that weaker dimerization may be responsible for the lower LaIII affinity. All four residues comprising the Arg100–EF3 network are completely conserved in the Hans cluster (Supplementary Fig. 23), suggesting that these interactions may contribute to dimerization in these LanMs.

The structure of DyIIIHans-LanM confirms the importance of second-sphere control of ligand positioning (Extended Data Fig. 4, Supplementary Figs. 2426 and Supplementary Tables 9 and 10). The overall structure of DyIIIHans-LanM is largely superimposable with that of LaIIIHans-LanM, and the coordination spheres of the DyIII ions in EF1–3 are similar to those in LaIIIHans-LanM (Fig. 3a, inset), with the notable exception of E9 (for example, Glu91 in EF3). This residue shifts from bidentate with LaIII to monodentate with the smaller DyIII ions, yielding a nine-coordinate distorted capped square antiprismatic geometry; the lower coordination number with a HRE ion is consistent with the case of other RE complexes38,39. In EF3, this carboxylate shift lengthens the distance between Arg100 and the proximal Oε of Glu91 from 2.9 Å (in LaIIIHans-LanM) to 3.2 Å (Supplementary Fig. 27). The rearrangement of this second-sphere hydrogen-bonding network suggests a structural basis for RE-dependent differences in Kdimer values.

The metal-binding sites of Mex-LanM differ substantially from those of Hans-LanM. In Mex-LanM, all four EF hands are occupied by nine-coordinate (EF1–3) or ten-coordinate (EF4) NdIII ions, each including two solvent ligands, not present in Hans-LanM (Fig. 3b and Supplementary Fig. 28). The observation of the two solvent molecules per metal site and the hydrogen bond to the D9 residue validates recent spectroscopic studies21,23,26. The difference in coordination number between EF1–3 and EF4 is due to the D3 carboxylate being monodentate in EF1–3 but bidentate in EF4. Although the NdIII sites of Mex-LanM share the nine- and ten-coordination observed in DyIII– and LaIIIHans-LanM, they structurally resemble the seven-coordinate CaII-binding sites of calmodulin (Supplementary Fig. 18). The increased coordination numbers in Mex-LanM relative to calmodulin result from bidentate coordination of D5 and an additional solvent ligand. These similarities suggest that much of LanM’s unique 108-fold selectivity for REs over CaII results from subtle differences in second-coordination-sphere and other more distal interactions. Finally, the exclusively protein-derived first coordination sphere in Hans-LanM, particularly due to coordination by E9, yields more extended hydrogen-bonding networks (Supplementary Figs. 20 and 29) and probably enhances control over the radius of the binding site. Thus, the structures rationalize the extraordinary RE versus non-RE selectivity of Mex-LanM and Hans-LanM while also accounting for their differences in LRE versus HRE selectivity.

Single-stage NdIII/DyIII separation

The differences in stability and structure between Hans-LanM’s LRE versus HRE complexes suggested that Hans-LanM (wild type and/or R100K) would outperform Mex-LanM in RE/RE separations. We focused on separating the RE pair of NdIII and DyIII used in permanent magnets. We first assayed the stabilities of the wild-type Hans-LanM and Hans-LanM(R100K) RE complexes against citrate, previously used as a desorbent with Mex-LanM6. RE–Hans-LanM complexes are generally less stable against citrate than those of Mex-LanM, as expected on the basis of lower affinity (Fig. 1e), but the difference in stability between the NdIIIHans-LanM and DyIIIHans-LanM complexes—expressed as the ratio of citrate concentration required for 50% desorption of each metal ([citrate]1/2), as reported by the fluorescence of Hans-LanM’s two Trp residues (Supplementary Fig. 30)—is twofold greater than for Mex-LanM complexes (Fig. 4a, Supplementary Table 11 and Extended Data Fig. 5). Furthermore, the R100K substitution significantly destabilizes Hans-LanM’s LaIII complex against citrate, whereas it only slightly affects the NdIII complex and does not affect the DyIII complex. This result confirms that dimerization selectively stabilizes Hans-LanM’s LRE complexes (and especially the LaIII complex), a factor abrogated by the R100K substitution. Using malonate, a weaker chelator than citrate, DyIII can be readily desorbed from both Hans-LanM and R100K with 10–100 mM chelator without significant NdIII desorption, suggesting conditions for NdIII/DyIII separation (Fig. 4b).

Fig. 4: Leveraging Hans-LanM to separate Nd/Dy in a single-stage process.
figure 4

a, Hans-LanM and the R100K variant exhibit greater differences in Nd versus Dy complex stability than Mex-LanM against desorption by citrate. Mean ± s.e.m. for three independent trials. **Significant difference between [citrate]1/2 for LaIII between Hans-LanM and Hans-LanM(R100K) (20 µM protein) shows the impact of dimerization of LaIII complex stability (P < 0.01, analysis of variance with Bonferroni post-test). Mex-LanM Nd and Dy data from ref. 6. b, Spectrofluorometric titration of Hans-LanM and R100K variant (λex = 280 nm, λem = 333 nm) at pH 5.0, depicting the malonate-induced desorption of a 2:1 metal–protein complex. Mean ± s.e.m. for three independent trials, except those with R100K, which were single trials of each condition. c, Comparison of distribution factors (pH 5.0, about 0.33 mM each RE, LaIII–DyIII) for immobilized Hans-LanM, Hans-LanM(R100K) and Mex-LanM. Each point represents mean ± s.d. for three independent trials. d, Separation of a 95:5 mixture of NdIII/DyIII using immobilized Hans-LanM(R100K) and a desorption scheme of three stepped concentrations of malonate followed by pH 1.5 HCl. One bed volume was 0.7 ml.

Source Data

Although a twofold modulation of RE versus RE selectivity by dimerization may seem small, such differences provide opportunity to decrease the number of separation stages, increasing efficiency of a separation process3,12. Therefore, Hans-LanM and the R100K variant were immobilized through a carboxy-terminal Cys residue on maleimide-functionalized agarose beads, as described previously6, and tested for NdIII/DyIII separation. Immobilized Hans-LanM bound about one equivalent of RE, unlike in solution and compared with two equivalents for Mex-LanM6 and Hans-LanM(R100K) (Supplementary Fig. 31). Hans-LanM and R100K exhibited similar separation ability in the La–Gd range—although R100K exhibits greater separation ability in the Gd–Dy range—as determined by the on-column distribution ratios (D) of a mixed RE solution at equilibrium (Fig. 4c, Extended Data Table 3 and Supplementary Tables 1214). These Nd/Dy separation factors are nearly double (Hans-LanM) and triple (Hans-LanM(R100K)) that of Mex-LanM (Extended Data Table 3). Immobilized Hans-LanM was loaded to 90% of breakthrough capacity with a model electronic waste mixture of 5% dysprosium and 95% neodymium and, guided by Fig. 4b, eluted with a short, stepwise malonate gradient, followed by complete desorption using pH 1.5 HCl. In a single purification stage, Dy was upgraded from 5% to 83% purity and Nd was recovered at 99.8% purity (both >98% yield; Extended Data Fig. 6). This significantly outperformed the comparable Mex-LanM-based process, which achieved only 50% purity in a first separation stage and required a second stage to obtain >98% purity6. The immobilized R100K variant performed even better, achieving baseline separation of DyIII and NdIII to >98% purity and >99% yield in a single stage (Fig. 4d). The R100K variant’s better performance was unexpected and may point to the unlikelihood of functional dimers on the column at this immobilization density (see the caption of Extended Data Fig. 6 for a discussion). Thus, despite substantially improved performance versus Mex-LanM enabled by characterization of Hans-LanM’s mechanism of dimerization, fully exploiting the dimerization phenomenon on-column may involve, for example, tethering of two monomers on a single polypeptide chain, which is under investigation.

Conclusion

Biochemical and structural characterization of Hans-LanM’s mechanism of metal-sensitive dimerization provides a new, allosteric mechanism for LRE versus HRE selectivity in biology, extending concepts in dimer-dependent metal recognition recently emerging from synthetic lanthanide complexes11 and engineered transition metal-binding proteins40 and showing that these principles are hard-wired into nature. Our work also shows that dimerization strength, and thus metal selectivity, can be rationally modulated. Hans-LanM evolved LRE-selective dimerization at physiological protein concentrations closer to those in our biochemical assays (10–20 µM) rather than those on the column (about 3 mM). Therefore, leveraging dimerization in a separation process would be assisted by shifting dimerization sensitivity to the higher concentration regime, such as by tuning hydrophobic interactions at the dimerization interface. Furthermore, our studies establish that LanMs with as low as 33% identity are easily predicted yet have useful differences in metal selectivity; further mining of this diversity may reveal yet additional mechanisms for tuning RE separations. Finally, the solvent-excluded coordination spheres of Hans-LanM should outperform Mex-LanM in RE/actinide separation23, luminescence-based sensing21,26 and stabilization of hydrolysis-prone ions. Continued characterization of the coordination and supramolecular principles of biological f-element recognition will inspire design of ligands with higher RE versus RE selectivities and their implementation in new RE separation processes.

Methods

General considerations

See the Supplementary Methods for details.

Bioinformatics methods

Protein and genome sequence data

The sequence of LanM from M. extorquens AM1 was used as a query to conduct PSI-BLAST searches against the National Center for Biotechnology Information non-redundant protein sequence (nr) and metagenomic protein (env_nr) databases until convergence41. The resulting 3,047 protein sequences were then manually curated for those that are less than 200 residues long, have at least one pair of EF hands separated by less than 14 residues, and have 4 EF hands. Signal peptides of LanM sequences were predicted using SignalP (v6.0)42, and then removed before further analysis of the sequences.

Construction of sequence similarity networks

The Enzyme Function Initiative-Enzyme Similarity Tool was used to calculate the similarities between all peptide sequence pairs with an E-value threshold of 1 × 10−5 (ref. 30). The resulting sequence similarity network of 696 nodes and 241,853 edges was then constructed and explored using the organic layout through Cytoscape (v3.9.1)43 and visualized in R (v4.1.0)44. The edge percentage identity threshold was gradually increased from 40% to 90% to yield distinct clusters.

Multiple sequence alignment and phylogenetic analysis

LanM sequences were aligned using MUSCLE (v5.1)45 with default parameters. The model used for phylogeny construction was selected using ModelFinder in IQ-TREE (v2.2.0.3)46,47 with --mset set to beast2. Bayesian phylogeny was generated on the basis of these results using BEAST (v2.6.7)48. The resulting phylogeny was evaluated using 107 generations and discarding a burn-in of 25%, and then visualized using ggtree (v3.2.1)49.

Expression and purification of Hans-LanM and its R100K variant

The gene encoding Hans-LanM, codon optimized for expression in E. coli without its native 23-residue signal peptide (see Supplementary Table 15), was obtained from Twist Bioscience and inserted into pET-29b(+) using the restriction sites NdeI/XhoI. Hans-LanM was overexpressed on a 2-l scale and purified using the established protocol for Mex-LanM50, with one modification: after the final SEC step, the protein was concentrated to 5 ml and dialysed against 5 g Chelex 100 in 500 ml of 30 mM HEPES, 100 mM KCl, 5% glycerol, pH 8.4, to remove CaII and trace metal contaminants. This procedure resulted in approximately 15 ml of 550 μM protein, which was not concentrated further. The final yield was 45 mg of protein per litre of culture. Protein concentrations were calculated using an extinction coefficient of 11,000 M−1 cm−1, based on the ExPASy ProtParam tool51. Hans-LanM(R100K) was purified using the same procedure, yielding 30 mg of protein per litre of culture.

Circular dichroism spectroscopy

Circular dichroism spectra of Hans-LanM were collected as described previously32, at 15 µM (monomer concentration) in Chelex 100-treated buffer A (20 mM acetate, 100 mM KCl, pH 5.0), unless otherwise indicated. Buffered metal solutions were prepared as described previously4,23,25,32. Additional details are available in the Supplementary Information.

Preparation of protein samples for SEC–MALS and small-angle X-ray scattering (SAXS)

Samples of wild-type Hans-LanM were prepared by adding 3.0 equivalents of metal slowly (0.5 equivalent at a time followed by mixing) to 1.0 ml of concentrated stock of Hans-LanM (550 μM). At these protein concentrations, slight precipitation was observed for LRE samples (for example, LaIII) whereas significant precipitation was observed for HRE samples (for example, DyIII). Samples were centrifuged at 12,000g for 2 min to remove precipitate and then purified using gel filtration chromatography (HiLoad 10/300 Superdex 75 pg, 1-ml loop, 0.8 ml min−1) in buffer B (30 mM MOPS, 100 mM KCl, 5% glycerol, pH 7.0). Hans-LanM-containing peaks (ranging from 12.0 to 15.0 ml elution volume) were collected to avoid the high-MW aggregate peaks, yielding 2.0 ml of metalated Hans-LanM ranging between 114 μM and 128 μM (1.37–1.53 mg ml−1).

Samples of Hans-LanM(R100K) do not form high-MW species or precipitate on metal addition. To prepare samples of this protein, a 500 μM protein solution was diluted to 250 μM (3 mg ml−1) in buffer B containing 0.75 mM of a specific RECl3, yielding a final solution of 3 mg ml−1 protein, with a 1:3 metal ratio, which was analysed directly by SEC–MALS.

For calcium conditions, proteins were diluted to 250 μM (3 mg ml−1), 5 mM CaCl2 was added, and the samples were incubated at room temperature for 1 h. The buffer used for SEC–MALS was the same as above, except that it also contained 5 mM CaCl2.

In-line SEC and MALS

SEC–MALS experiments were conducted using an Agilent 1260 Infinity II HPLC system equipped with an autosampler and fraction collector, and the Wyatt SEC hydrophilic column had 5-µm silica beads, a pore size of 100 Å and dimensions of 7.8 × 300 mm. Wyatt Technology DAWN MALS and Wyatt Optilab refractive index detectors were used for analysing the molar mass of peaks that eluted from the column. The SEC–MALS system was equilibrated for 5 h with buffer B. The system was calibrated with bovine serum albumin (monomer MW: 66 kDa) in the same buffer and normalization and alignment of the MALS and refractive index detectors were carried out. A volume of 15 µl of each sample was injected at a flow rate of 0.8 ml min−1 with a chromatogram run time of 25 min. Data were analysed using the ASTRA software (Wyatt). When small-angle X-ray scattering (SAXS) analysis was desired, a second run was carried out with 150 µl protein (about 4 mg ml−1) injected, and 200-µl fractions of the main peak were collected. BioSAXS data were subsequently collected in triplicate.

Isothermal titration calorimetry

The dissociation constants for the dimers of apo, LaIII-bound and DyIII-bound Hans-LanM were determined by dilutive additions of a concentrated protein stock, followed using isothermal titration calorimetry on a TA Instruments Low-volume Auto Affinity isothermal titration calorimeter. The syringe contained 300 μM protein (apo or 2 equivalents of DyIII bound) or 150 µM or 540 µM (2 equivalents of LaIII bound), and the cell contained 185 μl of a matched buffer (30 mM MOPS, 100 mM KCl, pH 7.0). Titrations were carried out at 30 °C. Titrations consisted of a first 0.2-μl injection followed by 17 × 2-μl injections, unless otherwise noted, with stirring at 125 r.p.m. and 180 s equilibration time between injections. The data were fitted using NanoAnalyze using the Dimer Dissociation model, yielding the dimer dissociation constant (Kdimer), enthalpy of dissociation (ΔH) and entropy of dissociation (ΔS). All parameters are shown in Extended Data Table 2.

Kdimer is defined as the dissociation constant for the equilibrium D\(\rightleftharpoons \) 2M, such that Kdimer = [M]2/[D], in which [D] is the concentration of the dimer and [M] is the concentration of the monomer, and the total protein concentration [P] (as measured using the extinction coefficient for the monomer) is given by [P] = [M] + 2[D]. Therefore, Kdimer = 2[M]2/([P] − [M]) or

$$2{[M]}^{2}+{K}_{{\rm{dimer}}}[M]-{K}_{{\rm{dimer}}}[P]=0$$
(1)

This equation can be used to estimate monomer and dimer concentrations during SEC–MALS experiments, using Kdimer values calculated from isothermal titration calorimetry experiments and [P] from the SEC–MALS trace. This equation can also be used to estimate the maximum possible Kdimer for LaIII-bound protein, given the SEC–MALS data.

SAXS

SAXS data were collected on RE-complexed Hans-LanM, at protein concentrations given in Supplementary Table 5 using equipment and under conditions described in the Supplementary Methods.

The forward scattering I(0) and the radius of gyration (Rg) are listed in Supplementary Table 5 and were calculated using the Guinier approximation, which assumes that at very small angles (q < 1.3/Rg) the intensity is approximated as I(q) =  I(0)exp[−1/3(qRg)2]. In the LaIII-, NdIII- and DyIII-bound conditions, this agrees with the calculated size of 17.9 Å for the crystallographic dimer. The molecular mass was estimated using a comparison with SAXS data of a bovine serum albumin standard. The data files were analysed for Guinier Rg, maximum particle dimension (Dmax), Guinier fits, Kratky plots and pair-distance distribution function using the ATSAS software52. GNOM, within ATSAS, was used to calculate the pair-distance distribution function P(r), from which Rg and Dmaxwere determined. Solvent envelopes were computed using DENSS36. The theoretical scattering profiles of the constructed models were calculated and fitted to experimental scattering data using CRYSOL53. OLIGOMER54 was used to estimate the monomer and dimer fractions.

Preparation of protein samples for crystallography

To Hans-LanM (2 ml, 1.16 mM, buffer B), 3.0 equivalents of LaCl3 or DyCl3 were added slowly, 0.5 equivalents at a time with mixing, to minimize precipitation. Precipitate was removed by centrifugation at 12,000g for 2 min. Any soluble aggregates were removed and the protein was exchanged into buffer lacking glycerol (buffer C: 30 mM MOPS, 50 mM KCl, pH 7.0) by gel filtration chromatography (HiLoad 16/600 Superdex 75 pg, 1-ml loop, 0.75 ml min−1). The peak in the 70–85 ml range was pooled, and the fractions were concentrated to about 500 μl with a final concentration of about 1.3 mM.

Mex-LanM was purified as described previously50 and was exchanged into buffer C before crystallization. The protein was loaded with 3.5 equivalents of NdIII (NdCl3).

General crystallographic methods

Diffraction datasets were collected at the Life Sciences Collaborative Access Team ID-G beamline and processed with the HKL2000 package55. In all structures, phase information was obtained with phenix.autosol56,57 through the single-wavelength anomalous diffraction method, in which lanthanide ions identified with HySS58 were used as the anomalous scatterers. Initial models were generated with phenix.autobuild59 with subsequent rounds of manual modification and refinement in Coot60 and phenix.refine61. In the final stages of model refinement, anisotropic displacement parameters and occupancies were refined for all lanthanide sites62. Model validation was carried out with the Molprobity server63. Figures were prepared using the PyMOL molecular graphics software package (Schrödinger, LLC).

La-bound Hans-LanM structure determination

Crystals were obtained by using the sitting drop vapour diffusion method, in which 1 μl of protein solution (15 mg ml−1) was mixed with 1 μl 10 mM tri-sodium citrate, pH 7.0, and 27% (w/v) PEG 6000 in a 24-well plate from Hampton Research (catalogue number HR1-002) at room temperature. Thin plate-shaped crystals appeared in 3 days. Crystals suitable for data collection were mounted on rayon loops, soaked briefly in a cryoprotectant solution consisting of the well solution supplemented with 10% ethylene glycol, and flash-frozen in liquid N2.

LaIII-loaded Hans-LanM crystallized in the P21 space group (β = 90.024°) with four monomers in the asymmetric unit. The initial figure of merit and Bayesian correlation coefficient were 0.563 and 0.56, respectively64. The final model consists of residues 24–133 in each chain, 12 LaIII ions (3 per chain in the first, second and third EF hands), 4 NaI ions65 (1 per chain in the fourth EF hand), 273 water molecules and 2 molecules of citrate. Of the residues modelled, 100% are in allowed or preferred regions as indicated by Ramachandran statistical analysis.

Dy-bound Hans-LanM structure determination

Crystals were obtained by using the sitting drop vapour diffusion method, in which 1 μl of protein solution (15 mg ml−1) was mixed with 1 μl of 250 μM tri-sodium citrate, pH 7.0, and 27% (w/v) PEG 6000 in a 24-well plate from Hampton Research at room temperature. Thin plate-shaped crystals appeared within 1 month. Crystals suitable for data collection were mounted on rayon loops, soaked briefly in a cryoprotectant solution consisting of the well solution supplemented with perfluoropolyether cryo oil from Hampton Research (catalogue number HR2-814) and flash-frozen in liquid N2.

DyIII-loaded Hans-LanM crystallized in the P21 space group (β = 93.567°) with four monomers in the asymmetric unit. The initial figure of merit and Bayesian correlation coefficient were 0.748 and 0.58, respectively64. The final model consists of residues 24–133 in each chain (except for chain D, for which residues 34–38 cannot be modelled), 14 DyIII ions (4 in chains A and D, 3 in the second, third and fourth EF hands of chains B and C) and 656 water molecules. Of the residues modelled, 100% are in allowed or preferred regions as indicated by Ramachandran statistical analysis.

Collection of anomalous datasets is described in the Supplementary Methods.

Nd-bound Mex-LanM structure determination

Crystals were obtained by using the sitting drop vapor diffusion method, in which 1 μl of protein solution (35 mg ml−1) was mixed with 1 μl of 0.1 M ammonium sulfate, 0.1 M Tris pH 7.5, and 20% (w/v) PEG 1500 in a 24-well plate from Hampton Research at room temperature. Thin plate-shaped crystals appeared within 6 months. Crystals suitable for data collection were mounted on rayon loops, soaked briefly in a cryoprotectant solution consisting of the well solution supplemented with perfluoropolyether cryo oil from Hampton Research and flash-frozen in liquid N2.

NdIII-loaded Mex-LanM crystallized in the P212121 space group with one monomer in the asymmetric unit. The initial figure of merit and Bayesian correlation coefficient were 0.799 and 0.56, respectively64. The final model consists of residues 29–133, 4 NdIII ions and 171 water molecules. Of the residues modelled, 100% are in allowed or preferred regions as indicated by Ramachandran statistical analysis.

Fluorescence spectroscopy

All fluorescence data were collected using a Fluorolog-QM fluorometer in configuration 75-21-C (Horiba Scientific) equipped with a double monochromator on the excitation arm and single monochromator on the emission arm. A 75-W xenon lamp was used as the light source for steady-state measurements and a pulsed xenon lamp was used for time-resolved measurements. Ten-millimetre quartz spectrofluorometry cuvettes (Starna Cells, 18F-Q-10-GL14-S) were used to collect data at 90° relative to the excitation path.

Fluorescence lifetime measurements were carried out using established methods26,66. In short, a solution of Hans-LanM with 2 equivalents of EuIII added, totalling 4.5 ml, was prepared in 100% H2O matrix (buffer: 25 mM HEPES, 75 mM KCl, pH 7.0). Half of this initial protein mixture (2.25 ml) was retained for future use and the remainder was exchanged to D2O through lyophilization to remove H2O and resuspension in 99.9% D2O two times. The resulting protein solutions (in 100% H2O and about 99% D2O) were mixed in varying ratios to produce D2O contents of 0%, 25%, 50% and 75%. The protein concentration was 20 µM. For each sample, the luminescence decay time constant (τ) was measured (λex = 394 nm, λem = 615 nm) with 5,000 shots over a time span of 2,500 μs. τ was determined using the FelixFL Powerfit-10 software (Horiba Scientific) using a single exponential fit. 1/τ was plotted against percentage composition of D2O, and the slope of the resulting line (m) was determined. The q value was determined using the following equation from ref. 67:

$$q=1.11[{\tau }_{{\rm{H}}2{\rm{O}}}^{-1}-{\tau }_{{\rm{D}}2{\rm{O}}}^{-1}-0.31+0.45{n}_{{\rm{OH}}}+0.99{n}_{{\rm{NH}}}+0.075{n}_{{\rm{O}}\mbox{--}{\rm{CNH}}}]$$
(2)

in which τ−1H2O and τ−1D2O are the inverses of the time constants in 100% H2O and D2O, respectively (the latter extrapolated using the equation of the fitted line), in ms–1; and nOH = 0, nNH = 0, and nO–CNH = 1 (resulting from the metal-coordinated Asn residues), on the basis of the Hans-LanM crystal structures. This equation simplifies to:

$$q=1.11[-m-0.31+0.075]$$
(3)

For fluorescence competition experiments, a solution of 20 μM Hans-LanM or the R100K variant was prepared in buffer A (pH 5.0) with two equivalents of metal (40 μM). Fluorescence emission spectra were collected with settings: λex = 278 nm, λem 300–420 nm, integration time = 0.5 s, step size = 1 nm. Titrations were carried out through addition of at least 0.6 μl of titrant (from concentrated stock solutions of 10 mM–1 M citrate or malonate, pH 5.0). Spectra were corrected for dilution. Each experiment was carried out in triplicate.

Purification of Cys-containing variants

Hans-LanM(R100K)-Cys was expressed and purified as described for Mex-LanM-Cys (ref. 6), with a final yield of 50 mg of protein per litre of culture. For Hans-LanM-Cys, the protein was purified by incorporating the same modifications from above, minus the dialysis step, to our previously described Mex-LanM-Cys purification, except that the SEC step was run using a reducing buffer (30 mM MOPS, 100 mM KCl, 5 mM TCEP, pH 7.0) with 5 mM EDTA, and frozen under liquid N2 before immobilization.

Maleimide functionalization of agarose beads

The maleimide functionalization of amine-functionalized agarose beads was described previously6. See the Supplementary Information for complete details.

Immobilization of Hans-LanM and the R100K variant

Hans-LanM(R100K) immobilization was carried out using a thiol-maleimide conjugation reaction as described previously6. In the case of Hans-LanM, a final protein concentration of about 0.4 mM (8 ml) was combined with 1 ml of maleimide–microbeads and the conjugation reaction was carried out for 16 h at room temperature. Unconjugated Hans-LanM was removed by washing with coupling buffer, and the Hans-LanM microbeads were stored in coupling buffer for subsequent tests. To quantify Hans-LanM immobilization yield, Pierce BCA Protein Assay (ThermoFisher Scientific) was used to determine the LanM concentration in the reaction solution before and after the conjugation reaction as previously described.

Batch experiment to determine separation factors

LanM-immobilized microbeads were washed with deionized water. Feed solution (5 ml, equimolar REs La–Dy, 3 mM total, pH 5.0) was added to 1 ml microbeads and incubated for 2 h. The liquid at equilibrium was collected and RE concentrations were determined by inductively coupled plasma mass spectrometry as [M]ad. Then 4 ml of 0.1 M HCl was used to desorb REs from the microbeads and concentrations were measured by inductively coupled plasma mass spectrometry as [M]de.

The RE distribution factor (D) between the LanM phase and the solution phase was calculated as:

$$D=\frac{{[M]}_{{\rm{LanM}}}}{{[M]}_{{\rm{Liquid}}}}$$
(4)

in which [M]LanM and [M]Liquid are the molar concentrations of each metal ion in the LanM phase and the solution phase at equilibrium, respectively. To account for the free liquid that absorbed on the agarose microbeads, the following correction was applied: [M]Liquid = [M]ad; [M]LanM = (4 × [M]de – [M]ad)/4.

The separation factor is defined as:

$${\rm{SF}}=\frac{{D}_{{\rm{RE1}}}}{{D}_{{\rm{RE2}}}}$$
(5)

in which DRE1 and DRE2 are the distribution factors of RE1 and RE2, respectively.

Breakthrough column experiments

Columns were filled and run, and metal concentrations analysed, as described in our previous work6; details are available in the Supplementary Methods.

For the RE pair separation experiments, the metal ion purity and yield are defined as:

$${{\rm{Purity}}}_{{\rm{RE}}1}=\frac{{C}_{{\rm{RE}}1}}{{C}_{{\rm{RE}}1}+{C}_{{\rm{RE}}2}}$$
(6)
$${{\rm{Y}}{\rm{i}}{\rm{e}}{\rm{l}}{\rm{d}}}_{{\rm{R}}{\rm{E}}1}=\frac{{\rm{R}}{\rm{E}}1\,{\rm{r}}{\rm{e}}{\rm{c}}{\rm{o}}{\rm{v}}{\rm{e}}{\rm{r}}{\rm{e}}{\rm{d}}}{{\rm{T}}{\rm{o}}{\rm{t}}{\rm{a}}{\rm{l}}\,{\rm{R}}{\rm{E}}1\,{\rm{l}}{\rm{o}}{\rm{a}}{\rm{d}}{\rm{e}}{\rm{d}}}$$
(7)

in which CRE1 and CRE2 are the molar concentrations of RE1 and RE2, respectively.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.