Sulfated glycan recognition by carbohydrate sulfatases of the human gut microbiota

An Author Correction to this article was published on 05 August 2022

This article has been updated

Abstract

Sulfated glycans are ubiquitous nutrient sources for microbial communities that have coevolved with eukaryotic hosts. Bacteria metabolize sulfated glycans by deploying carbohydrate sulfatases that remove sulfate esters. Despite the biological importance of sulfatases, the mechanisms underlying their ability to recognize their glycan substrate remain poorly understood. Here, we use structural biology to determine how sulfatases from the human gut microbiota recognize sulfated glycans. We reveal seven new carbohydrate sulfatase structures spanning four S1 sulfatase subfamilies. Structures of S1_16 and S1_46 represent novel structures of these subfamilies. Structures of S1_11 and S1_15 demonstrate how non-conserved regions of the protein drive specificity toward related but distinct glycan targets. Collectively, these data reveal that carbohydrate sulfatases are highly selective for the glycan component of their substrate. These data provide new approaches for probing sulfated glycan metabolism while revealing the roles carbohydrate sulfatases play in host glycan catabolism.

Fig. 1: Schematic representation of sulfated host carbohydrates found in the colon.
Fig. 2: S1 carbohydrate sulfatases share a conserved α/β/α-fold, S site and catalytic apparatus.
Fig. 3: S1_46 members from the HGM require recognition of the N-acetyl group for activity.
Fig. 4: Aromatic stacking analysis at the 0 subsite in S1_16 members from the HGM.
Fig. 5: Specificity of S1_15 subfamily members for 6S-Gal and 6S-GalNAc is determined by the openness of the S and 0 subsites.
Fig. 6: Variations in the non-conserved region of S1_11 members, dictated by PUL context, drive recognition of N-sulfate or N-acetyl.

Data availability

The crystal structure datasets generated have been deposited in the Protein Data Bank (PDB) under the following accession numbers: 7OZ8, 7OZ9, 7OZA, 7OZE, 7OZC, 7P26 and 7P24. Source data are provided with this paper.
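As a minimal sketch of how the deposited structures might be retrieved, the snippet below turns the accession codes listed above into download links. The URL template is an assumption (the public RCSB file-download pattern), not something stated in the article, and the helper names `is_valid_pdb_id` and `build_urls` are hypothetical. No network access is performed.

```python
# PDB accession codes exactly as listed in the Data availability statement.
PDB_IDS = ["7OZ8", "7OZ9", "7OZA", "7OZE", "7OZC", "7P26", "7P24"]

# Assumption: RCSB file-download URL pattern; it is not given in the article.
RCSB_TEMPLATE = "https://files.rcsb.org/download/{pdb_id}.cif"


def is_valid_pdb_id(pdb_id: str) -> bool:
    """Classic PDB IDs have 4 characters: a digit followed by 3 alphanumerics."""
    return len(pdb_id) == 4 and pdb_id[0].isdigit() and pdb_id.isalnum()


def build_urls(pdb_ids):
    """Map each accession code to an mmCIF download URL (strings only)."""
    urls = {}
    for pdb_id in pdb_ids:
        if not is_valid_pdb_id(pdb_id):
            raise ValueError(f"Not a valid 4-character PDB ID: {pdb_id!r}")
        urls[pdb_id] = RCSB_TEMPLATE.format(pdb_id=pdb_id.upper())
    return urls


if __name__ == "__main__":
    for pdb_id, url in build_urls(PDB_IDS).items():
        print(pdb_id, url)
```

The format check is deliberately simple and only guards against obvious transcription errors in the accession list.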

Code availability

No new code was developed or compiled in this study.

Acknowledgements

This project has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement number 748336 and Wilhelm och Martina Lundgrens Vetenskapsfond (2020.3597) awarded to A.S.L. Additional funding sources include the European Research Council (ERC; 694181), the Knut and Alice Wallenberg Foundation (2017.0028) and the Swedish Research Council (2017-00958), awarded to G.C.H.; the National Institutes of Health (R01 DK118024 and DK125445 and U01AI095473), awarded to E.C.M. and G.C.H.; and the Academy of Medical Sciences/Wellcome Trust through the Springboard Grant (SBF005\1065 163470), awarded to A.C. We acknowledge access to the SOLEIL and Diamond light sources via both the University of Liverpool and Newcastle University BAGs (proposals mx21970 and mx18598, respectively). We thank the staff of Diamond and SOLEIL and members of Liverpool’s Molecular Biophysics group for assistance with data collection. We are also grateful to E. Corre for help with bioinformatics analyses (ABIMS platform, Station Biologique de Roscoff, France).

Author information

Contributions

A.S.L. and A.C. conceived and designed experiments. A.C., A.S.L. and E.C.M. wrote the draft manuscript. A.S.L. and A.C. cloned, expressed and purified sulfatases and performed the enzymatic assays. A.C., D.P.B. and J.A.L. performed and analyzed kinetic and binding experiments. E.A.Y. and J.A.L. performed labeling and NMR experiments. A.C. and A.B. performed structural biology experiments. M.C. and T.B. performed sulfatase phylogenetic analyses. J.C. and N.G.K. performed glycan analyses. G.S.A.W. performed light scattering experiments and size determination. A.C., A.S.L., G.C.H. and E.C.M. supervised and provided funding for the project. All authors read and approved the manuscript.

Corresponding authors

Correspondence to Ana S. Luis or Alan Cartmell.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Chemical Biology thanks Jan-Hendrick Hehemann, Nicolas Terrapon and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Electron density maps of extracted ligands.

The 2mFobs-Fc maps are shown contoured at 1σ for all substrates and products co-crystallised with their respective sulfatase.

Extended Data Fig. 2 Biophysical, specificity and phylogenetic analysis of BT1918^3S-GlcNAc and S1_46.

a, Radial version of the phylogenetic tree of representative sulfatases from subfamily S1_46. The tree comprises a total of 564 sequences, of which 250 are Firmicutes; 156 are Bacteroidetes; 54 are Actinobacteria; 25 are Proteobacteria; 20 are Lentisphaerae. For clarity all labels and sequence accession codes have been omitted. The annotations next to the colour code reveal the presence or absence of conserved residues crucial for substrate recognition by BT1918^3S-GlcNAc (acc-code Q8A6G6) in the following order: Y94, N174, R327 and Y408. These residues are invariant in HGM Bacteroidetes, whilst in Firmicutes from the HGM, the Y408 equivalent is not conserved. The residues are coloured as follows: black means an equivalent residue is present; a grey and bold letter at any position means that the corresponding residue is replaced by that amino acid; a grey, bold and italic letter at any position means that the equivalent position is replaced by any type of amino acid; a bold grey letter followed by one-letter codes in parentheses indicates that the equivalent position can be substituted by any of those amino acids; the dash at the Y408-equivalent position indicates that no equivalent amino acid can be deduced from the multiple alignment. Branches of the same colour have the corresponding pattern in common. The red filled circle designates the sequence of the S1_46 sulfatase from B. thetaiotaomicron (see Supplementary Fig. 2 for full tree). b, Thin layer chromatography analysis of BT1918^3S,6S-GlcNAc versus 3S-glucosamine, 3S,6S-glucosamine, and 3S,6S-N-acetylglucosamine, and QSI_2516^3S,6S-GlcNAc versus 3S-glucosamine, 3S,6S-glucosamine, and 3S,6S-N-acetylglucosamine. All assays were performed for 48 h at 37 °C with 6 mM substrate and 5 μM (BT1918^3S-GlcNAc) or 100 μM (QSI_2516^3S,6S-GlcNAc) enzyme in 3 mM HEPES pH 7.0, 45 mM NaCl and 5 mM CaCl2.

Extended Data Fig. 3 Activity and stability analysis of S1_16 sulfatases and their mutant variants.

a, DSF analysis of the effects of galactose and N-acetylgalactosamine on thermostability, with a positive shift indicative of substrate binding. b, Normalised DSF melt curves of BT3057^4S-Gal/GalNAc and BT3796^4S-Gal/GalNAc. c, DSF melt curves of purified monomer and dimer species (left), monomer species in the presence of galactose and N-acetylgalactosamine (middle), or dimer species in the presence of galactose and N-acetylgalactosamine (right). d, Thin layer chromatography (TLC) analysis of wild-type (WT) and mutant S1_16 sulfatases. Asterisks are placed above lanes where sulfatase activity is observed. e, High pressure anion exchange chromatography (HPAEC) of WT and mutants. A grey block highlights the desulfated product. Both TLC and HPAEC reactions utilised 6 mM substrate and 1 μM enzyme, except for W109A variants where 10 μM was used, with 3 mM HEPES, 45 mM NaCl, and 5 mM CaCl2. Reactions were incubated at 37 °C for 48 h. Control represents the substrate incubated under the same conditions without enzyme. Experiments are technical triplicates and error bars represent SEM.

Extended Data Fig. 4 Radial phylogenetic tree of representative sulfatases from subfamily S1_16.

The tree comprises a total of 1368 sequences of which 854 are Bacteroidetes; 211 are Planctomycetes; 107 are Kiritimatiellaeota; 64 are Verrucomicrobia; 53 are Lentisphaerae. For clarity all labels and sequence accession codes have been omitted. The annotations next to the colour code reveal the presence or absence of conservation of the critical residues in substrate recognition by BT3057^4S-Gal/GalNAc (acc-code Q8A397) in the following order: W109, H182, and H423. Sequences coded by orange branches contain an additional W332 present in BT3796^4S-Gal/GalNAc (acc-code Q8A171) but absent in other sequences. For simplification the residue numbers have been omitted, except for W332. The residues are coloured as follows: black means an equivalent amino acid is present; a grey and bold letter at any position means that the corresponding residue is replaced by that amino acid; a grey and italic letter at any position means that the equivalent position is replaced by any type of amino acid; a bold grey letter followed by one-letter codes in parentheses indicates that the equivalent position can be substituted by any of those amino acids; the dash at the H-equivalent position indicates that no equivalent amino acid can be deduced from the multiple alignment. When two patterns are indicated separated by a comma (that is W - H, W H -) both have been given the same colour code. Branches having the same colour have the corresponding pattern in common. Red filled circles designate sequences of S1_16 sulfatases from B. thetaiotaomicron (see Supplementary Fig. 4 for full tree).

Extended Data Fig. 5 Analysis of the activity and stability of BT1624^6S-Gal/GalNAc and its mutant variants.

a, Thin layer chromatography (TLC) analysis of wild-type (WT) BT1624^6S-Gal/GalNAc and its mutants. Asterisks are placed above lanes where activity is observed. b, High pressure anion exchange chromatography (HPAEC) of WT BT1624^6S-Gal/GalNAc and its mutants. The desulfated product is highlighted by a grey box. All TLC (a) and HPAEC (b) reactions utilised 6 mM substrate and 5 μM enzyme, with 3 mM HEPES, 45 mM NaCl, and 5 mM CaCl2. Reactions were incubated for 48 h at 37 °C. Control represents the substrate incubated under the same conditions without enzyme. c, DSF analysis showing relative thermostability of BT1624^6S-Gal/GalNAc mutant proteins with respect to the WT enzyme. d, DSF analysis of the effects of alanine scanning on the ability of BT1624^6S-Gal/GalNAc to bind galactose, with the Tm of the protein shown above the bar. The experiments were performed using 5 μM of protein and 324 mM of galactose in 100 mM BTP pH 7.0 and 150 mM NaCl. Experiments are technical triplicates and error bars represent SEM.

Extended Data Fig. 6 Radial phylogenetic tree of S1_15 showing the conservation of the galacto-recognition triad and N-acetyl-D-galactosamine specificity features.

The tree comprises a total of 1906 sequences of which 1424 are Bacteroidetes; 172 are Planctomycetes; 119 are Kiritimatiellaeota; 57 are Proteobacteria; 53 are Verrucomicrobia. The annotations next to the colour code concern the presence or absence of conservation of the indicated BT1624^6S-Gal/GalNAc (acc-code Q8A7A1) residues, in this order: I100, D170, R171, H220, K461 and A462. These residues are crucial in substrate recognition and D170, R171, and H220 represent the galacto-recognition triad within the S1_15 subfamily. For simplification the residue numbers have been omitted. For example, an I in black means an equivalent isoleucine is present; a grey and bold letter at any position means that the corresponding residue is replaced by that amino acid; a grey and italic letter at any position means that the equivalent position can be replaced by any type of amino acid; a bold grey letter followed by one-letter codes in parentheses indicates that the equivalent position is substituted by any of those amino acids. Branches having the same colour have the corresponding pattern in common. For clarity all labels and sequence accession codes have been omitted. Red filled circles designate sequences of S1_15 sulfatases from B. thetaiotaomicron (see Supplementary Fig. 6 for full tree).

Extended Data Fig. 7 Conservation of the N-acetyl-D-galactosamine specificity features (Y463/W464) in S1_15 enzymes within PULs targeting chondroitin sulfate.

Schematic representation of PULs targeting chondroitin sulfate aligned by orthologues of BT3333^6S-GalNAc. A light green background highlights orthologues with Y463/W464, a dark green background highlights orthologues with F463/W464, a light blue background highlights orthologues with H463/W464, and a purple background highlights orthologues with Q463/W464. The numbering used corresponds to the sequence of BT3333^6S-GalNAc. A red background highlights the presence of GH88 and S1_27 (an endo 4S-chondroitin sulfatase), which is encoded by a discrete genetic region not always physically localised next to the core PUL. A black background highlights a core block observed in CS PULs containing BT3333^6S-GalNAc orthologues. HP (protein of unknown function), S1 (sulfatase S1 with the respective subfamily number superscript), GHXX (glycoside hydrolase with X representing the family number), PL (polysaccharide lyase), DUF (domain of unknown function), HTCS (hybrid two-component system), SusC (starch utilization system C-like), SusD (starch utilization system D-like), SGBP (surface glycan binding protein).

Extended Data Fig. 8 Analysis of the activity and stability of BT3177^6S-GlcNAc and mutant variants.

a, High pressure anion exchange chromatography (HPAEC) of wild-type (WT) BT3177^6S-GlcNAc and substituted variants. The product is highlighted by a grey box. b, Thin layer chromatography (TLC) analysis of WT BT3177^6S-GlcNAc and its mutants. Asterisks are placed above lanes where activity is observed. Both HPAEC (a) and TLC (b) reactions utilised 6 mM substrate and 5 μM enzyme, with 3 mM HEPES, 45 mM NaCl, and 5 mM CaCl2 over a 48 h period at 37 °C. Control represents the substrate incubated under the same conditions without enzyme. c, DSF analysis showing relative thermostability of mutant proteins of BT3177^6S-GlcNAc in comparison to the WT enzyme. Experiments are technical triplicates and error bars represent SEM.

Extended Data Fig. 9 Radial phylogenetic tree of S1_11 showing the conservation of the substrate recognition triad and N-sulfate specificity features.

The tree comprises a total of 2178 sequences of which 1190 are Bacteroidetes; 233 are Verrucomicrobia; 184 are Planctomycetes; 143 are Ascomycota (fungi); 100 are Actinobacteria. The annotations next to the colour code concern the presence or absence of conservation of the indicated residues, in this order: R290, W273, D385, R387 and H471. These residues are required for substrate recognition by BT4656^6S-GlcNAc/GlcNS (acc-code Q89YS5). D385, R387, and H471 represent the recognition triad, whilst the presence of W or R at positions 273 and 290, respectively, represents the N-sulfate specificity features. Residue numbers have been omitted for simplicity. For example, an R in black means an equivalent arginine is present; a grey and bold letter at this position means that the corresponding residue is replaced by that amino acid; the grey and italic R at this position means that the R-equivalent position is replaced by any type of amino acid; a bold grey R followed by one-letter codes in parentheses indicates that the R-equivalent position can be substituted by any of those amino acids; the dash at the R-equivalent position indicates that no equivalent amino acid can be deduced from the multiple alignment. Branches having the same colour have the corresponding pattern in common. Red filled diamonds designate sequences of S1_11 sulfatases from B. thetaiotaomicron. All sequences in the specific branch that contains BT4656^6S-GlcNAc/GlcNS are found within a conserved heparan sulfate PUL. For clarity, all labels and sequence accession codes have been omitted (see Supplementary Fig. 7 for full tree).

Extended Data Fig. 10 Conservation of the N-sulfate targeting features, W273/R290, in S1_11 enzymes within PULs targeting heparan sulfate.

PULs targeting heparan sulfate (HS) aligned by orthologues of BT4656^6S-GlcNAc/GlcNS. Orthologues of BT4656^6S-GlcNAc/GlcNS with W273/R290 and W273/Q290 are highlighted with a green and blue background, respectively. A black background highlights a core block observed in HS PULs containing BT4656^6S-GlcNAc/GlcNS orthologues. HP (protein of unknown function), S1 (sulfatase S1 with the respective subfamily number superscript), GHXX (glycoside hydrolase with X representing the family number), PL (polysaccharide lyase), DUF (domain of unknown function), HTCS (hybrid two-component system), SusC (starch utilization system C-like), SusD (starch utilization system D-like), SGBP (surface glycan binding protein), MFS (major facilitator superfamily), ROK (repressor, ORF, kinase superfamily).

Source data

Source Data Fig. 3

Raw and processed kinetic data.

Source Data Fig. 4

Raw and processed kinetic data.

Source Data Fig. 5

Raw and processed DSF data.

Source Data Fig. 6

Raw and processed DSF data.

Source Data Extended Data Fig. 2

Unmodified TLC gels.

Source Data Extended Data Fig. 3

Unmodified TLC gels and DSF data.

Source Data Extended Data Fig. 5

Unmodified TLC gels and DSF data.

Source Data Extended Data Fig. 8

Unmodified TLC gels and DSF data.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Luis, A.S., Baslé, A., Byrne, D.P. et al. Sulfated glycan recognition by carbohydrate sulfatases of the human gut microbiota. Nat Chem Biol 18, 841–849 (2022). https://doi.org/10.1038/s41589-022-01039-x

  • DOI: https://doi.org/10.1038/s41589-022-01039-x
