A phylogenetic and proteomic reconstruction of eukaryotic chromatin evolution

Grau-Bové, Xavier; Navarrete, Cristina; Chiva, Cristina; Pribasnig, Thomas; Antó, Meritxell; Torruella, Guifré; Galindo, Luis Javier; Lang, Bernd Franz; Moreira, David; López-Garcia, Purificación; Ruiz-Trillo, Iñaki; Schleper, Christa; Sabidó, Eduard; Sebé-Pedrós, Arnau

doi:10.1038/s41559-022-01771-6

Article
Published: 09 June 2022

A phylogenetic and proteomic reconstruction of eukaryotic chromatin evolution

Nature Ecology & Evolution volume 6, pages 1007–1023 (2022)Cite this article

6221 Accesses
18 Citations
242 Altmetric
Metrics details

Subjects

Abstract

Histones and associated chromatin proteins have essential functions in eukaryotic genome organization and regulation. Despite this fundamental role in eukaryotic cell biology, we lack a phylogenetically comprehensive understanding of chromatin evolution. Here, we combine comparative proteomics and genomics analysis of chromatin in eukaryotes and archaea. Proteomics uncovers the existence of histone post-translational modifications in archaea. However, archaeal histone modifications are scarce, in contrast with the highly conserved and abundant marks we identify across eukaryotes. Phylogenetic analysis reveals that chromatin-associated catalytic functions (for example, methyltransferases) have pre-eukaryotic origins, whereas histone mark readers and chaperones are eukaryotic innovations. We show that further chromatin evolution is characterized by expansion of readers, including capture by transposable elements and viruses. Overall, our study infers detailed evolutionary history of eukaryotic chromatin: from its archaeal roots, through the emergence of nucleosome-based regulation in the eukaryotic ancestor, to the diversification of chromatin regulators and their hijacking by genomic parasites.

Access through your institution

Buy or subscribe

This is a preview of subscription content, access via your institution

Access options

Access through your institution

Buy this article

Purchase on Springer Link
Instant access to full article PDF

Buy now

Prices may be subject to local taxes which are calculated during checkout

**Fig. 1: Diversity of post-translational modifications in eukaryotic canonical and variant histones.**

**Fig. 2: Archaeal histone diversity and post-translational modifications.**

**Fig. 3: Taxonomic distribution of chromatin-associated gene classes.**

**Fig. 4: Origin and evolution of chromatin-associated gene families.**

**Fig. 5: Evolution of chromatin readers and capture of chromatin proteins by TEs and viruses.**

**Fig. 6: Chromatin evolution and eukaryogenesis.**

The variation and evolution of complete human centromeres

Article Open access 03 April 2024

Glennis A. Logsdon, Allison N. Rozanski, … Evan E. Eichler

Evolution of tissue-specific expression of ancestral genes across vertebrates and insects

Article 15 April 2024

Federica Mantica, Luis P. Iñiguez, … Manuel Irimia

Single-cell multiplex chromatin and RNA interactions in ageing human brain

Article Open access 27 March 2024

Xingzhao Wen, Zhifei Luo, … Sheng Zhong

Data availability

The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD031991.

Code availability

Code for reproducing the analysis is available in our laboratory Github repository (https://github.com/sebepedroslab/chromatin-evolution-analysis).

References

Struhl, K. Fundamentally different logic of gene regulation in eukaryotes and prokaryotes. Cell 98, 1–4 (1999).
Article CAS PubMed Google Scholar
Kornberg, R. D. & Lorch, Y. Primary role of the nucleosome. Mol. Cell 79, 371–375 (2020).
Article CAS PubMed Google Scholar
Jenuwein, T. & Allis, C. D. Translating the histone code. Science 293, 1074–1080 (2001).
Article CAS PubMed Google Scholar
Berger, S. L. The complex language of chromatin regulation during transcription. Nature 447, 407–412 (2007).
Article CAS PubMed Google Scholar
Banaszynski, L. A., Allis, C. D. & Lewis, P. W. Histone variants in metazoan development. Dev. Cell 19, 662–674 (2010).
Article CAS PubMed PubMed Central Google Scholar
Allis, C. D. & Jenuwein, T. The molecular hallmarks of epigenetic control. Nat. Rev. Genet. 17, 487–500 (2016).
Article CAS PubMed Google Scholar
Sultana, T. et al. The landscape of L1 retrotransposons in the human genome is shaped by pre-insertion sequence biases and post-insertion selection. Mol. Cell 74, 555–570 (2019).
Article CAS PubMed Google Scholar
Gangadharan, S., Mularoni, L., Fain-Thornton, J., Wheelan, S. J. & Craig, N. L. DNA transposon Hermes inserts into DNA in nucleosome-free regions in vivo. Proc. Natl Acad. Sci. USA 107, 21966–21972 (2010).
Article CAS PubMed PubMed Central Google Scholar
Shinn, P. et al. HIV-1 integration in the human genome favors active genes and local hotspots. Cell 110, 521–529 (2002).
Article PubMed Google Scholar
Goodier, J. L. Restricting retrotransposons: a review. Mob. DNA 7, 16 (2016).
Article PubMed PubMed Central Google Scholar
Molaro, A. & Malik, H. S. Hide and seek: how chromatin-based pathways silence retroelements in the mammalian germline. Curr. Opin. Genet. Dev. 37, 51–58 (2016).
Article CAS PubMed PubMed Central Google Scholar
Malik, H. S. & Henikoff, S. Phylogenomics of the nucleosome. Nat. Struct. Biol. 10, 882–891 (2003).
Article CAS PubMed Google Scholar
Talbert, P. B. & Henikoff, S. Histone variants—ancient wrap artists of the epigenome. Nat. Rev. Mol. Cell Biol. 11, 264–275 (2010).
Article CAS PubMed Google Scholar
Soboleva, T. A., Nekrasov, M., Ryan, D. P. & Tremethick, D. J. Histone variants at the transcription start-site. Trends Genet. 30, 199–209 (2014).
Article CAS PubMed Google Scholar
Zink, L.-M. & Hake, S. B. Histone variants: nuclear function and disease. Curr. Opin. Genet. Dev. 37, 82–89 (2016).
Article CAS PubMed Google Scholar
Weber, C. M. & Henikoff, S. Histone variants: dynamic punctuation in transcription. Genes Dev. 28, 672–682 (2014).
Article CAS PubMed PubMed Central Google Scholar
Borg, M., Jiang, D. & Berger, F. Histone variants take center stage in shaping the epigenome. Curr. Opin. Plant Biol. 61, 101991 (2021).
Article CAS PubMed Google Scholar
Zentner, G. E. & Henikoff, S. Regulation of nucleosome dynamics by histone modifications. Nat. Struct. Mol. Biol. 20, 259–266 (2013).
Article CAS PubMed Google Scholar
Campos, E. I. & Reinberg, D. Histones: annotating chromatin. Annu. Rev. Genet. 43, 559–599 (2009).
Article CAS PubMed Google Scholar
Strahl, B. D. & Allis, C. D. The language of covalent histone modifications. Nature 403, 41–45 (2000).
Article CAS PubMed Google Scholar
Bannister, A. J. & Kouzarides, T. Regulation of chromatin by histone modifications. Cell Res. 21, 381–395 (2011).
Article CAS PubMed PubMed Central Google Scholar
Talbert, P. B. & Henikoff, S. The yin and yang of histone marks in transcription. Annu. Rev. Genom. Hum. Genet. 22, 147–170 (2021).
Article Google Scholar
Taverna, S. D., Li, H., Ruthenburg, A. J., Allis, C. D. & Patel, D. J. How chromatin-binding modules interpret histone modifications: lessons from professional pocket pickers. Nat. Struct. Mol. Biol. 14, 1025–1040 (2007).
Article CAS PubMed PubMed Central Google Scholar
Musselman, C. A., Lalonde, M.-E., Côté, J. & Kutateladze, T. G. Perceiving the epigenetic landscape through histone readers. Nat. Struct. Mol. Biol. 19, 1218–1227 (2012).
Article CAS PubMed PubMed Central Google Scholar
Gurard-Levin, Z. A., Quivy, J.-P. & Almouzni, G. Histone chaperones: assisting histone traffic and nucleosome dynamics. Annu. Rev. Biochem. 83, 487–517 (2014).
Article CAS PubMed Google Scholar
Burgess, R. J. & Zhang, Z. Histone chaperones in nucleosome assembly and human disease. Nat. Struct. Mol. Biol. 20, 14–22 (2013).
Article CAS PubMed PubMed Central Google Scholar
Koster, M. J. E., Snel, B. & Timmers, H. T. M. Genesis of chromatin and transcription dynamics in the origin of species. Cell 161, 724–736 (2015).
Article CAS PubMed Google Scholar
Hargreaves, D. C. & Crabtree, G. R. ATP-dependent chromatin remodeling: genetics, genomics and mechanisms. Cell Res. 21, 396–420 (2011).
Article CAS PubMed PubMed Central Google Scholar
Gornik, S. G. et al. Loss of nucleosomal DNA condensation coincides with appearance of a novel nuclear protein in dinoflagellates. Curr. Biol. 22, 2303–2312 (2012).
Article CAS PubMed Google Scholar
Mattiroli, F. et al. Structure of histone-based chromatin in Archaea. Science 357, 609–612 (2017).
Article CAS PubMed PubMed Central Google Scholar
Warnecke, T., Becker, E. A., Facciotti, M. T., Nislow, C. & Lehner, B. Conserved substitution patterns around nucleosome footprints in eukaryotes and Archaea derive from frequent nucleosome repositioning through evolution. PLoS Comput. Biol. 9, e1003373 (2013).
Article PubMed PubMed Central Google Scholar
Ammar, R. et al. Chromatin is an ancient innovation conserved between Archaea and Eukarya. eLife 1, e00078 (2012).
Article PubMed PubMed Central Google Scholar
Rojec, M., Hocher, A., Merkenschlager, M. & Warnecke, T. Chromatinization of Escherichia coli with archaeal histones.eLife 8, e49038 (2019).
Article CAS PubMed PubMed Central Google Scholar
Forbes, A. J. et al. Targeted analysis and discovery of posttranslational modifications in proteins from methanogenic archaea by top-down MS. Proc. Natl Acad. Sci. USA 101, 2678–2683 (2004).
Article CAS PubMed PubMed Central Google Scholar
Weidenbach, K. et al. Deletion of the archaeal histone in Methanosarcina mazei Gö1 results in reduced growth and genomic transcription. Mol. Microbiol. 67, 662–671 (2008).
Article CAS PubMed Google Scholar
Talbert, P. B., Meers, M. P. & Henikoff, S. Old cogs, new tricks: the evolution of gene expression in a chromatin context. Nat. Rev. Genet. 20, 283–297 (2019).
Article CAS PubMed Google Scholar
de Mendoza, A. & Sebe-Pedros, A. Origin and evolution of eukaryotic transcription factors. Curr. Opin. Genet. Dev. 59, 25–32 (2019).
Article Google Scholar
Schwaiger, M. et al. Evolutionary conservation of the eumetazoan gene regulatory landscape. Genome Res. 24, 639–650 (2014).
Article CAS PubMed PubMed Central Google Scholar
Sebé-Pedrós, A. et al. Early metazoan cell type diversity and the evolution of multicellular gene regulation. Nat. Ecol. Evol. 2, 1176–1188 (2018).
Article PubMed PubMed Central Google Scholar
Connolly, L. R., Smith, K. M. & Freitag, M. The Fusarium graminearum histone H3 K27 methyltransferase KMT6 regulates development and expression of secondary metabolite gene clusters. PLoS Genet. 9, e1003916 (2013).
Article PubMed PubMed Central Google Scholar
Jamieson, K., Rountree, M. R., Lewis, Z. A., Stajich, J. E. & Selker, E. U. Regional control of histone H3 lysine 27 methylation in Neurospora. Proc. Natl Acad. Sci. USA 110, 6027–6032 (2013).
Article CAS PubMed PubMed Central Google Scholar
Sebé-Pedrós, A. et al. The dynamic regulatory genome of Capsaspora and the origin of animal multicellularity. Cell 165, 1224–1237 (2016).
Article PubMed PubMed Central Google Scholar
Bourdareau, S. et al. Histone modifications during the life cycle of the brown alga Ectocarpus. Genome Biol. 22, 12 (2021).
Article CAS PubMed PubMed Central Google Scholar
Wang, S. Y. et al. Role of epigenetics in unicellular to multicellular transition in Dictyostelium. Genome Biol. 22, 134 (2021).
Article PubMed PubMed Central Google Scholar
Taverna, S. D., Coyne, R. S. & Allis, C. D. Methylation of histone H3 at lysine 9 targets programmed DNA elimination in Tetrahymena. Cell 110, 701–711 (2002).
Article CAS PubMed Google Scholar
Garcia, B. A. et al. Organismal differences in post-translational modifications in histones H3 and H4. J. Biol. Chem. 282, 7641–7655 (2007).
Article CAS PubMed Google Scholar
Drinnenberg, I. A. et al. EvoChromo: towards a synthesis of chromatin biology and evolution. Development 146, dev178962 (2019).
Article CAS PubMed PubMed Central Google Scholar
Draizen, E. J. et al. HistoneDB 2.0: a histone database with variants—an integrated resource to explore histones and their variants. Database 2016, baw014 (2016).
Article PubMed PubMed Central Google Scholar
Maile, T. M. et al. Mass spectrometric quantification of histone post-translational modifications by a hybrid chemical labeling method. Mol. Cell. Proteom. 14, 1148–1158 (2015).
Article CAS Google Scholar
Li, B., Carey, M. & Workman, J. L. The role of chromatin during transcription. Cell 128, 707–719 (2007).
Article CAS PubMed Google Scholar
Rajagopal, N. et al. Distinct and predictive histone lysine acetylation patterns at promoters, enhancers, and gene bodies. Genes Genomes Genet. 4, 2051–2063 (2014).
Google Scholar
Koonin, E. V. & Yutin, N. The dispersed archaeal eukaryome and the complex archaeal ancestor of eukaryotes. Cold Spring Harb. Perspect. Biol. 6, a016188–a016188 (2014).
Article PubMed PubMed Central Google Scholar
Sandman, K. & Reeve, J. N. Archaeal histones and the origin of the histone fold. Curr. Opin. Microbiol. 9, 520–525 (2006).
Article CAS PubMed Google Scholar
Pereira, S. L., Grayling, R. A., Lurz, R. & Reeve, J. N. Archaeal nucleosomes. Proc. Natl Acad. Sci. USA 94, 12633–12637 (1997).
Article CAS PubMed PubMed Central Google Scholar
Henneman, B., van Emmerik, C., van Ingen, H. & Dame, R. T. Structure and function of archaeal histones. PLoS Genet. 14, e1007582 (2018).
Article PubMed PubMed Central Google Scholar
Imachi, H. et al. Isolation of an archaeon at the prokaryote-eukaryote interface. Nature 577, 519–525 (2020).
Article CAS PubMed PubMed Central Google Scholar
Spang, A. et al. Complex archaea that bridge the gap between prokaryotes and eukaryotes. Nature 521, 173–179 (2015).
Article CAS PubMed PubMed Central Google Scholar
Zaremba-Niedzwiedzka, K. et al. Asgard archaea illuminate the origin of eukaryotic cellular complexity. Nature 541, 353–358 (2017).
Article CAS PubMed Google Scholar
Da Cunha, V., Gaia, M., Nasir, A. & Forterre, P. Asgard archaea do not close the debate about the universal tree of life topology. PLoS Genet. 14, e1007215 (2018).
Article PubMed PubMed Central Google Scholar
Alva, V. & Lupas, A. N. Histones predate the split between bacteria and archaea. Bioinformatics 35, 2349–2353 (2019).
Article CAS PubMed Google Scholar
Allis, C. D. et al. New nomenclature for chromatin-modifying enzymes. Cell 131, 633–636 (2007).
Article CAS PubMed Google Scholar
Wu, F. et al. Unique mobile elements and scalable gene flow at the prokaryote–eukaryote boundary revealed by circularized Asgard archaea genomes. Nat. Microbiol. 7, 200–212 (2022).
Article CAS PubMed PubMed Central Google Scholar
Schuettengruber, B., Bourbon, H.-M., Di Croce, L. & Cavalli, G. Genome regulation by polycomb and trithorax: 70 years and counting. Cell 171, 34–57 (2017).
Article CAS PubMed Google Scholar
Dion, M. F., Altschuler, S. J., Wu, L. F. & Rando, O. J. Genomic characterization reveals a simple histone H4 acetylation code. Proc. Natl Acad. Sci. USA 102, 5501–5506 (2005).
Article CAS PubMed PubMed Central Google Scholar
de Jong, J. et al. Chromatin landscapes of retroviral and transposon integration profiles. PLoS Genet. 10, e1004250 (2014).
Article PubMed PubMed Central Google Scholar
Sultana, T., Zamborlini, A., Cristofari, G. & Lesage, P. Integration site selection by retroviruses and transposable elements in eukaryotes. Nat. Rev. Genet. 18, 292–308 (2017).
Article CAS PubMed Google Scholar
Gao, X., Hou, Y., Ebina, H., Levin, H. L. & Voytas, D. F. Chromodomains direct integration of retrotransposons to heterochromatin. Genome Res. 18, 359–369 (2008).
Article CAS PubMed PubMed Central Google Scholar
Cosby, R. L. et al. Recurrent evolution of vertebrate transcription factors by transposase capture. Science 371, eabc6405 (2021).
Article CAS PubMed PubMed Central Google Scholar
Cordaux, R., Udit, S., Batzer, M. A. & Feschotte, C. Birth of a chimeric primate gene by capture of the transposase gene from a mobile element. Proc. Natl Acad. Sci. USA 103, 8101–8106 (2006).
Article CAS PubMed PubMed Central Google Scholar
Fiedler, M. et al. Decoding of methylated histone H3 tail by the Pygo-BCL9 Wnt signaling complex. Mol. Cell 30, 507–518 (2008).
Article CAS PubMed PubMed Central Google Scholar
Erives, A. J. Phylogenetic analysis of the core histone doublet and DNA topo II genes of Marseilleviridae: evidence of proto-eukaryotic provenance. Epigenetics Chromatin 10, 55 (2017).
Article PubMed PubMed Central Google Scholar
Liu, Y. et al. Virus-encoded histone doublets are essential and form nucleosome-like structures. Cell 184, 4237–4250 (2021).
Article CAS PubMed PubMed Central Google Scholar
Valencia-Sánchez, M. I. et al. The structure of a virus-encoded nucleosome. Nat. Struct. Mol. Biol. 28, 413–417 (2021).
Article PubMed PubMed Central Google Scholar
Iyer, L. M., Balaji, S., Koonin, E. V. & Aravind, L. Evolutionary genomics of nucleo-cytoplasmic large DNA viruses. Virus Res. 117, 156–184 (2006).
Article CAS PubMed Google Scholar
Nagamine, T. Apoptotic arms races in insect-baculovirus coevolution. Physiol. Entomol. 47, 1–10 (2021).
Article Google Scholar
Starrett, G. J. et al. Adintoviruses: a proposed animal-tropic family of midsize eukaryotic linear dsDNA (MELD) viruses. Virus Evol. 7, veaa055 (2021).
Article PubMed PubMed Central Google Scholar
Hocher, A. et al. Growth temperature is the principal driver of chromatinization in archaea. Preprint at bioRxiv https://doi.org/10.1101/2021.07.08.451601 (2021).
Alpha-Bazin, B. et al. Lysine-specific acetylated proteome from the archaeon Thermococcus gammatolerans reveals the presence of acetylated histones. J. Proteom. 232, 104044 (2021).
Article CAS Google Scholar
Eme, L., Spang, A., Lombard, J., Stairs, C. W. & Ettema, T. J. G. Archaea and the origin of eukaryotes. Nat. Rev. Microbiol. 15, 711–723 (2017).
Article CAS PubMed Google Scholar
Akıl, C. & Robinson, R. C. Genomes of Asgard archaea encode profilins that regulate actin. Nature 562, 439–443 (2018).
Article PubMed Google Scholar
Koonin, E. V. The origin and early evolution of eukaryotes in the light of phylogenomics. Genome Biol 11, 209 (2010).
Article PubMed PubMed Central Google Scholar
Sebé-Pedrós, A., Grau-Bové, X., Richards, T. A. & Ruiz-Trillo, I. Evolution and classification of myosins, a paneukaryotic whole-genome approach. Genome Biol. Evol. 6, 290–305 (2014).
Article PubMed PubMed Central Google Scholar
Richards, T. A. & Cavalier-Smith, T. Myosin domain evolution and the primary divergence of eukaryotes. Nature 436, 1113–1118 (2005).
Article CAS PubMed Google Scholar
Wickstead, B., Gull, K. & Richards, T. Patterns of kinesin evolution reveal a complex ancestral eukaryote with a multifunctional cytoskeleton. BMC Evol. Biol. 10, 110 (2010).
Article PubMed PubMed Central Google Scholar
Dacks, J. B. & Field, M. C. Evolution of the eukaryotic membrane-trafficking system: origin, tempo and mode. J. Cell Sci. 120, 2977–2985 (2007).
Article CAS PubMed Google Scholar
Collins, L. & Penny, D. Complex spliceosomal organization ancestral to extant eukaryotes. Mol. Biol. Evol. 22, 1053–1066 (2005).
Article CAS PubMed Google Scholar
Grau-Bové, X., Sebé-Pedrós, A. & Ruiz-Trillo, I. The eukaryotic ancestor had a complex ubiquitin signaling system of archaeal origin. Mol. Biol. Evol. 32, 726–739 (2015).
Article PubMed Google Scholar
Kundaje, A. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Article CAS PubMed PubMed Central Google Scholar
Ho, J. W. K. et al. Comparative analysis of metazoan chromatin organization. Nature 512, 449–452 (2014).
Article CAS PubMed PubMed Central Google Scholar
Montgomery, S. A. et al. Chromatin organization in early land plants reveals an ancestral association between H3K27me3, transposons, and constitutive heterochromatin. Curr. Biol. 30, 573–588 (2020).
Article CAS PubMed PubMed Central Google Scholar
Frapporti, A. et al. The Polycomb protein Ezl1 mediates H3K9 and H3K27 methylation to repress transposable elements in Paramecium. Nat. Commun. 10, 2710 (2019).
Article PubMed PubMed Central Google Scholar
Lennartsson, A. & Ekwall, K. Histone modification patterns and epigenetic codes. Biochim. Biophys. Acta 1790, 863–868 (2009).
Article CAS PubMed Google Scholar
Peterson, C. L. & Laniel, M.-A. Histones and histone modifications. Curr. Biol. 14, R546–R551 (2004).
Article CAS PubMed Google Scholar
Rando, O. J. Combinatorial complexity in chromatin structure and function: revisiting the histone code. Curr. Opin. Genet. Dev. 22, 148–155 (2012).
Article CAS PubMed PubMed Central Google Scholar
de Mendoza, A., Pflueger, J. & Lister, R. Capture of a functionally active methyl-CpG binding domain by an arthropod retrotransposon family. Genome Res. 29, 1277–1286 (2019).
Article PubMed PubMed Central Google Scholar
De Mendoza, A. et al. Recurrent acquisition of cytosine methyltransferases into eukaryotic retrotransposons. Nat. Commun. 9, 1341 (2018).
Article PubMed PubMed Central Google Scholar
Ji, X. et al. Chromatin proteomic profiling reveals novel proteins associated with histone-marked genomic regions. Proc. Natl Acad. Sci. USA 112, 3841–3846 (2015).
Article CAS PubMed PubMed Central Google Scholar
Wierer, M. & Mann, M. Proteomics to study DNA-bound and chromatin-associated gene regulatory complexes. Hum. Mol. Genet. 25, R106–R114 (2016).
Article CAS PubMed PubMed Central Google Scholar
Villaseñor, R. et al. ChromID identifies the protein interactome at chromatin marks. Nat. Biotechnol. 38, 728–736 (2020).
Article PubMed PubMed Central Google Scholar
Stieglmeier, M. et al. Nitrososphaera viennensis gen. nov., sp. nov., an aerobic and mesophilic, ammonia-oxidizing archaeon from soil and a member of the archaeal phylum Thaumarchaeota. Int. J. Syst. Evol. Microbiol. 64, 2738–2752 (2014).
Article CAS PubMed PubMed Central Google Scholar
Tirichine, L. et al. Histone extraction protocol from the two model diatoms Phaeodactylum tricornutum and Thalassiosira pseudonana. Mar. Genomics 13, 21–25 (2014).
Article PubMed Google Scholar
Shechter, D., Dormann, H. L., Allis, C. D. & Hake, S. B. Extraction, purification and analysis of histones. Nat. Protoc. 2, 1445–1457 (2007).
Article CAS PubMed Google Scholar
Perkins, D. N., Pappin, D. J. C., Creasy, D. M. & Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551–3567 (1999).
Article CAS PubMed Google Scholar
Taus, T. et al. Universal and confident phosphorylation site localization using phosphoRS. J. Proteome Res. 10, 5354–5362 (2011).
Article CAS PubMed Google Scholar
Vizcaíno, J. A. et al. 2016 update of the PRIDE database and its related tools. Nucleic Acids Res. 44, D447–D456 (2016).
Article PubMed Google Scholar
Hagberg, A. A., Schult, D. A. & Swart, P. J. Exploring network structure, dynamics, and function using NetworkX. In Proc. 7th Python in Science Conference (eds Varoquaux, G. et al.) 11–15 (Python in Science Conference, 2008).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
Article CAS PubMed PubMed Central Google Scholar
Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
Article CAS PubMed Google Scholar
Veluchamy, A. et al. An integrative analysis of post-translational histone modifications in the marine diatom Phaeodactylum tricornutum. Genome Biol. 16, 102 (2015).
Article PubMed PubMed Central Google Scholar
Ren, Q. & Gorovsky, M. A. Histone H2A.Z acetylation modulates an essential charge patch. Mol. Cell 7, 1329–1335 (2001).
Article CAS PubMed Google Scholar
Allis, C. D. et al. hv1 is an evolutionarily conserved H2A variant that is preferentially associated with active genes. J. Biol. Chem. 261, 1941–1948 (1986).
Article CAS PubMed Google Scholar
Fusauchi, Y. & Iwai, K. Tetrahymena histone H2A. Acetylation in the N-terminal sequence and phosphorylation in the C-terminal sequence. J. Biochem. 95, 147–154 (1984).
Article CAS PubMed Google Scholar
Xiong, L., Adhvaryu, K. K., Selker, E. U. & Wang, Y. Mapping of lysine methylation and acetylation in core histones of Neurospora crassa. Biochemistry 49, 5236–5243 (2010).
Article CAS PubMed Google Scholar
Zhang, K., Sridhar, V. V., Zhu, J., Kapoor, A. & Zhu, J. K. Distinctive core histone post-translational modification patterns in Arabidopsis thaliana. PLoS ONE 2, e1210 (2007).
Article PubMed PubMed Central Google Scholar
Johnson, L. et al. Mass spectrometry analysis of Arabidopsis histone H3 reveals distinct combinations of post-translational modifications. Nucleic Acids Res. 32, 6511–6518 (2004).
Article CAS PubMed PubMed Central Google Scholar
Bergmüller, E., Gehrig, P. M. & Gruissem, W. Characterization of post-translational modifications of histone H2B-variants isolated from Arabidopsis thaliana. J. Proteome Res. 6, 3655–3668 (2007).
Article PubMed Google Scholar
Beck, H. C. et al. Quantitative proteomic analysis of post-translational modifications of human histones. Mol. Cell. Proteom. 5, 1314–1325 (2006).
Article CAS Google Scholar
Goudarzi, A. et al. Dynamic competing histone H4 K5K8 acetylation and butyrylation are hallmarks of highly active gene promoters. Mol. Cell 62, 169–180 (2016).
Article CAS PubMed PubMed Central Google Scholar
Hake, S. B. et al. Expression patterns and post-translational modifications associated with mammalian histone H3 variants. J. Biol. Chem. 281, 559–568 (2006).
Article CAS PubMed Google Scholar
Tan, M. et al. Identification of 67 histone marks and histone lysine crotonylation as a new type of histone modification. Cell 146, 1016–1028 (2011).
Article CAS PubMed PubMed Central Google Scholar
Moniruzzaman, M., Martinez-Gutierrez, C. A., Weinheimer, A. R. & Aylward, F. O. Dynamic genome evolution and complex virocell metabolism of globally-distributed giant viruses. Nat. Commun. 11, 2020 (1710).
Google Scholar
Punta, M. et al. The Pfam protein families database. Nucleic Acids Res. 40, D290–D301 (2012).
Article CAS PubMed Google Scholar
Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).
Article CAS PubMed PubMed Central Google Scholar
Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
Article CAS PubMed Google Scholar
Enright, A. J., Van Dongen, S. & Ouzounis, C. A. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30, 1575–1584 (2002).
Article CAS PubMed PubMed Central Google Scholar
Steenwyk, J. L., Buida, T. J., Li, Y., Shen, X.-X. & Rokas, A. ClipKIT: a multiple sequence alignment trimming software for accurate phylogenomic inference. PLoS Biol. 18, e3001007 (2020).
Article CAS PubMed PubMed Central Google Scholar
Minh, B. Q., Nguyen, M. A. T. & von Haeseler, A. Ultrafast approximation for phylogenetic bootstrap. Mol. Biol. Evol. 30, 1188–1195 (2013).
Article CAS PubMed PubMed Central Google Scholar
Grau-Bové, X. & Sebé-Pedrós, A. Orthology clusters from gene trees with Possvm. Mol. Biol. Evol. 38, 5204–5208 (2021).
Article PubMed PubMed Central Google Scholar
Huerta-Cepas, J., Dopazo, H., Dopazo, J. & Gabaldón, T. The human phylome. Genome Biol. 8, R109 (2007).
Article PubMed PubMed Central Google Scholar
Huerta-Cepas, J., Serra, F. & Bork, P. ETE 3: reconstruction, analysis, and visualization of phylogenomic data. Mol. Biol. Evol. 33, 1635–1638 (2016).
Article CAS PubMed PubMed Central Google Scholar
Csűrös, M. & Miklós, I. in Research in Computational Molecular Biology (eds Apostolico, A. et al.) 206–220 (Springer, 2006).
Csurös, M. Count: evolutionary analysis of phylogenetic profiles with parsimony and likelihood. Bioinformatics 26, 1910–1912 (2010).
Article PubMed Google Scholar
Gansner, E. R. & North, S. C. An open graph visualization system and its applications to software engineering. Softw. Pract. Exp. 30, 1203–1233 (2000).
Article Google Scholar
Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
Article CAS PubMed PubMed Central Google Scholar
Jombart, T., Balloux, F. & Dray, S. adephylo: new tools for investigating the phylogenetic signal in biological traits. Bioinformatics 26, 1907–1909 (2010).
Article CAS PubMed Google Scholar
Wells, J. N. & Feschotte, C. A field guide to eukaryotic transposable elements. Annu. Rev. Genet. 54, 539–561 (2020).
Article CAS PubMed PubMed Central Google Scholar
Storer, J., Hubley, R., Rosen, J., Wheeler, T. J. & Smit, A. F. The Dfam community resource of transposable element families, sequence models, and genome annotations. Mob. DNA 12, 2 (2021).
Article CAS PubMed PubMed Central Google Scholar
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinf. 10, 421 (2009).
Article Google Scholar
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Article CAS PubMed PubMed Central Google Scholar
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Article CAS PubMed PubMed Central Google Scholar
Anisimova, M., Gil, M., Dufayard, J.-F., Dessimoz, C. & Gascuel, O. Survey of branch support methods demonstrates accuracy, power, and robustness of fast likelihood-based approximation schemes. Syst. Biol. 60, 685–699 (2011).
Article PubMed PubMed Central Google Scholar
Bodenhofer, U., Bonatesta, E., Horejš-Kainrath, C. & Hochreiter, S. msa: an R package for multiple sequence alignment. Bioinformatics 31, 3997–3999 (2015).
CAS PubMed Google Scholar

Download references

Acknowledgements

We thank A. de Mendoza for critical input on the analysis of TE fusions. We also thank J. Casacuberta for P. patens samples, H. J. G. Meijer for P. infestans samples, M. Adamska for S. ciliatum samples and A. Simpson for access to the G. okellyi culture (made possible by his funding from NSERC, Canada). Research in the A.S.-P. group was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 Research and Innovation Programme (grant agreement no. 851647) and the Spanish Ministry of Science and Innovation (PGC2018-098210-A-I00). We also acknowledge support of the Spanish Ministry of Science and Innovation to the EMBL partnership, the Centro de Excelencia Severo Ochoa and the CERCA Programme (Generalitat de Catalunya). C.N. is supported by an FPI PhD fellowship from the Spanish Ministry of Economy, Industry and Competitiveness (MEIC). X.G.-B. is supported by a Juan de la Cierva fellowship (FJC2018-036282-I) from MEIC. I.R.-T. was supported by a European Research Council (grant no. 616960). B.F.L. was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC; RGPIN-2017-05411) and by the ‘Fonds de Recherche Nature et Technologie’, Quebec. P.L.-G. and D.M. were supported by a Moore and Simons foundations grant (GBMF9739) and by European Research Council advanced grants (322669, 787904). Research in the C.S. group was supported by the ERC through project TACKLE (advanced grant no. 695192).

Author information

Authors and Affiliations

Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
Xavier Grau-Bové, Cristina Navarrete, Eduard Sabidó & Arnau Sebé-Pedrós
Universitat Pompeu Fabra (UPF), Barcelona, Spain
Xavier Grau-Bové, Cristina Navarrete, Cristina Chiva, Eduard Sabidó & Arnau Sebé-Pedrós
Department of Functional and Evolutionary Ecology, Archaea Biology Unit, University of Vienna, Vienna, Austria
Thomas Pribasnig & Christa Schleper
Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Barcelona, Spain
Meritxell Antó & Iñaki Ruiz-Trillo
Unité d’Ecologie Systématique et Evolution, CNRS, Université Paris-Saclay, AgroParisTech, Orsay, France
Guifré Torruella, Luis Javier Galindo, David Moreira & Purificación López-Garcia
Robert Cedergren Centre in Bioinformatics and Genomics, Department of Biochemistry, Université de Montréal, Montréal, Quebec, Canada
Bernd Franz Lang
Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
Iñaki Ruiz-Trillo

Authors

Xavier Grau-Bové
View author publications
You can also search for this author in PubMed Google Scholar
Cristina Navarrete
View author publications
You can also search for this author in PubMed Google Scholar
Cristina Chiva
View author publications
You can also search for this author in PubMed Google Scholar
Thomas Pribasnig
View author publications
You can also search for this author in PubMed Google Scholar
Meritxell Antó
View author publications
You can also search for this author in PubMed Google Scholar
Guifré Torruella
View author publications
You can also search for this author in PubMed Google Scholar
Luis Javier Galindo
View author publications
You can also search for this author in PubMed Google Scholar
Bernd Franz Lang
View author publications
You can also search for this author in PubMed Google Scholar
David Moreira
View author publications
You can also search for this author in PubMed Google Scholar
Purificación López-Garcia
View author publications
You can also search for this author in PubMed Google Scholar
Iñaki Ruiz-Trillo
View author publications
You can also search for this author in PubMed Google Scholar
Christa Schleper
View author publications
You can also search for this author in PubMed Google Scholar
Eduard Sabidó
View author publications
You can also search for this author in PubMed Google Scholar
Arnau Sebé-Pedrós
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

A.S.-P. conceived the project. X.G.-B., C.C., I.R.-T., C.S., E.S. and A.S.-P. designed experiments and analytical strategies. C.N., T.P., M.A. and A.S.-P. performed experiments. X.G.-B., C.C. and A.S.-P. analysed the data. T.P., G.T., L.J.G., D.M., P.L.-G. and B.F.L. provided biological samples/cultures and genomic data. All authors contributed to data interpretation. X.G.-B. and A.S.-P. wrote the manuscript with input from all authors.

Corresponding author

Correspondence to Arnau Sebé-Pedrós.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Ecology & Evolution thanks Paul Talbert, Michael Borg and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Histone classification and evolution.

a, Primary and secondary alignments of histone-fold containing proteins classified as canonical H2A, H2B, H3 and H4, based on identity to reference sequences in HistoneDB. Pie plots represent the number of alignments to HistoneDB-annotated sequences, for the entire dataset (prokaryotic, eukary-otic and viral sequences, large pie plots in the inset) and the eukaryotic subset (smaller plots in the inset). For those proteins that align to more than one canonical histone or major variant (macroH2A, H2A.Z or cenH3), the scatter plots represent the relative identity between the primary (horizontal axis) and secondary alignment(s) (vertical axis). b, Aggregated counts of histone gene pairs, classified ac-cording to histone type and orientation. c, Presence of histone variants (left) and number of collinear pairs of histone-encoding genes (right) per species, classified according to their histone types and rela-tive orientation (head-to-head, hh; head-to-tail, ht; and tail-to-tail, tt). Source data available in Supple-mentary Data 2. Histone variant classification is based on the highest-scoring HMM profile from His-toneDB. Asterisks colors in the macroH2A column indicate species where histone-less Macro do-mains orthologous to the macroH2A genes are found (see panel d). Lighter colors in the variant classi-fication indicate ambiguously classified histones (i.e. cases in which the highest-scoring HMM profile exhibited a low bitscore, defined as a probability below 0.05 in the profile-wise distribution function of scaled bitscores; or cases in which the first-to-second ratio between high scoring profiles was below 1.01). d, Alignments of putatively conserved histone N-tails in archaea. Conserved amino-acids are color-coded according to chemical properties. Dots next to species names are color-coded according to taxonomy (same as Fig. 2c). e, Phylogenetic analysis of the Macro motif of macroH2A histones across eukaryotes, highlighting the macroH2A ortholog group (green), and, within this group, Macro-containing genes lacking histone domains (orange), and their protein domain architectures.

Extended Data Fig. 2 Histone post-translational modifications.

a, Proteomics detection coverage (% of amino acids), number of hPTMs and number of hPTMs per covered position, for the best-covered histone in each species in our proteomics survey. b, Number of samples in which each histone-matching peptide with post-translational modifications (peptide spectral matches defined by Proteome Discoverer) has been identified, per species. For each species, we report the percentage of modified peptides found in more than one replicate. c, Number of samples in which histone-matching modified peptide has been identified, across all the samples from this study. The tree pie charts represent these distributions for all hPTMs, acetylations, and methylations. d, Evidence of hPTM conservation in the major histone variants H2A.Z and macroH2A (conserved positions only), as well as any position in the linker histones H1.

Extended Data Fig. 3 Gene family counts.

a-c, Number of taxa within each lineage that contain chromatin-associated genes, for archaeal, bacterial (per phyla) or viral (per family) genomes. Numbers indicate the exact number of taxa. d, Number of genes encoding core domains that define chromatin-associated gene families per eukaryotic genome/transcriptome. Numbers indicate exact number of proteins.

Extended Data Fig. 4 Evolutionary reconstruction and domain architecture conservation.

a, Species tree of eukaryotes used in the ancestral reconstruction analysis, with branch lengths calibrated to the gain/loss rates of Pfam domains (see Methods). Available in Supplementary Table 1. b, Conservation of archetypical protein domain architectures across orthogroups, in acetylases, deacetylases, methyltransferases, demethylases, remodellers and chaperones. In each heatmap, we indicate the fraction of genes within an orthogroup (rows) that contain a specific protein domain (columns). Domains in bold are catalytic (black) or reader (purple) functions. At the right of each heatmap, we summarize the presence/absence profile of each orthogroup across eukaryotic lineages (as listed in Fig. 1a).

Extended Data Fig. 5 Evolution of the hPTM reader toolkit.

a, Pie plot representing the number of genes classified as part of the catalytic (acetylases, deacetylases, methyltransferases, demethylases, remodellers or chaperones) or reader families or as both. The barplot at the right shows the most common reader domains in genes classified with both reader and catalytic functions. b, Pie plot representing the number of reader domain-encoding genes classified according to whether they contain one type of reader domain (for example, PHD) or more than one (for example, PHD + PWWP). The barplot at the right shows the most common combinations of reader domains among genes with multiple reader domains. c, Summary of gene family gains per reader family, with example cases highlighted in selected nodes. Node size is proportional to number of gains at 90% probability.

Extended Data Fig. 6 Transposon–chromatin gene fusions.

a, Number of candidate fusion genes classified by the level of gene model validation evidence, based on contiguity of the gene model over the genome assembly (that is lack of poly-N stretches in the genomic region between the TE- and chromatin-associated domains), evidence of expression, and evidence of contiguous expression (see inset at the right). b, Summary of candidate gene fusions within each chromatin-associated gene family, divided by gene family. For each gene, we indicate their similarity to known TE families, presence of TE-associated domains, the evidence of gene model validity, and information on their gene structure (whether they are monoexonic or are located in clusters with other fusion genes). Source data available in Supplementary Table 6. c, Number of species with at least one valid fusion, divided by gene family. d, Mapping positions of RNA-seq reads supporting candidate gene-transposon fusions (selected examples from Fig. 5e). For each fusion, we show reads spanning the region along the spliced transcript that fully covers the transposon-associated domains (highlighted in green), the chromatin-associated domains, and the inter-domain region. Uninterrupted stretches of mapped positions between domains indicate the validity of a domain co-occurrence. For clarity purposes, reads mapping entirely within a single domain have been excluded from this visualization.

Extended Data Fig. 7 Chromatin proteins in viruses.

a-c, Selected gene trees highlighting examples of eukaryotic- and prokaryotic-like viral homologues. d, Number of viral genes of each chromatin-associated gene family, classified according to their closest neighbours from cellular clades in gene tree analyses based on phylogenetic affinity scores (see Methods). Within each gene family, viral sequences are classified according to their PFAM domain architecture – the most common architecture being single domain in most gene families except for remodellers and BIR readers. e, Id. but classifying viral genes according to their phylogenetic affinity to eukaryotic orthology groups. Source data available in Supplementary Table 6.

Supplementary information

Reporting Summary

Supplementary Data 1

Taxon sampling. a, List of eukaryotic species used in the comparative genomic analyses, including species abbreviations, data sources for genome or transcriptome assemblies and annotations and their taxonomic classification. b, List of gene expression datasets (SRA accession numbers) used for gene model validation analyses of candidate fusion genes. c, List of histone post-translational modification proteomics datasets used in this study (PRIDE accession numbers).

Supplementary Data 2

Histone clusters and classification. a, Pairs of collinear histone-encoding genes, including their genomic coordinates and relative orientation. b, List and sequences of archaeal HMfB histones with N-terminal tails (at least ten amino acids before a complete globular domain). c, Classification of histone variants across eukaryotes.

Supplementary Data 3

hPTM conservation. a–g, Table of hPTMs identified in histones of the 26 eukaryotic species used in the comparative proteomics analysis, separated by histone type (canonical and major variants: H2A, H2B, H3, H4, macroH2A, H2A.Z and H1). Each entry corresponds to a modified peptide, for which we specify modification coordinates along the peptide and relative to the consensus histone sequence (if available). We also indicate whether each peptide can be uniquely mapped to a conserved or non-conserved region in a canonical histone or to specific histone variants. These tables also include entries for hPTMs reported in the literature (indicated as a cited source or as a specific UNIPROT entry; see Methods for a list of sources); in these cases, source peptides and associated data may not be available. h, hPTMs in archaea.

Supplementary Data 4

Gene family analysis. a, List of gene classes analysed in the comparative genomics analyses, including the PFAM protein domains used to retrieve homologues and search parameters. b, List of transposon-associated PFAM domains surveyed in the analyses of transposon–chromatin gene fusions.

Supplementary Data 5

Evolution of the chromatin machinery in eukaryotes. a, Summary of gene family evolutionary patterns in eukaryotes (n = 1,713 orthogroups). For each orthogroup, we indicate its gene and functional class, the number of members, species where it is present and major eukaryotic lineages (Amoebozoa, Opisthokonta + Breviatea + Apusozoa, CRuMs, Ancyromonadida, Malawimonadidae, Archaeplastida + Cryptista, SAR + Haptista, Hemimastigophora, Discoba and Metamonada), the probability of presence at the last eukaryotic common ancestor, the phylogenetic affinity of their closest homologues (other eukaryotic orthogroups, bacteria, archaea or viruses) and their average frequency amongst the ten nearest neighbours of its member gene in phylogenetic trees (‘Phylogenetic affinity score’, Methods); as well as its consensus protein domain architecture (present in at least 25% of its members). We also indicate the gene symbols of members from four model species: H. sapiens, D. melanogaster, S. cerevisiae and A. thaliana. b,c, Probability of gain and loss of each gene family at extant and ancestral nodes along the eukaryotic phylogeny. d, Orthogroup assignments per gene.

Supplementary Data 6

Transposon fusions and viral homology. a, List of candidate fusions between chromatin-associated genes and transposons, including the phylogenetic classification of each gene (orthogroup), protein domain architectures and the transcriptomics-level and gene model-level evidence supporting each fusion. b, List of chromatin-associated genes encoded by viral genomes, including their species of origin and a summary of their phylogenetic embedding among cellular species (specifically, which are its closest homologues in cellular genomes and the fraction of phylogenetic nearest neighbours they represent, the closest eukaryotic gene family among those close to eukaryotic genes in the gene trees and the distance to the closest cellular homologue).

Supplementary Data 7

Phylogenetic analyses. Collection of gene trees used to identify orthology groups for the eukaryotic chromatin toolkit. UFBS bootstrap supports rare indicated at each node. An annotated eukaryotic species tree is also included.

Supplementary Data 8

Peptide sequences. Collection of peptide sequences used to build gene trees of the eukaryotic chromatin toolkit.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Grau-Bové, X., Navarrete, C., Chiva, C. et al. A phylogenetic and proteomic reconstruction of eukaryotic chromatin evolution. Nat Ecol Evol 6, 1007–1023 (2022). https://doi.org/10.1038/s41559-022-01771-6

Download citation

Received: 09 November 2021
Accepted: 21 April 2022
Published: 09 June 2022
Issue Date: July 2022
DOI: https://doi.org/10.1038/s41559-022-01771-6

This article is cited by

Uncoupled evolution of the Polycomb system and deep origin of non-canonical PRC1
- Bastiaan de Potter
- Maximilian W. D. Raas
- Berend Snel
Communications Biology (2023)
A developmental role for the chromatin-regulating CoREST complex in the cnidarian Nematostella vectensis
- James M. Gahan
- Lucas Leclère
- Fabian Rentzsch
BMC Biology (2022)