Elsevier

Journal of Genetics and Genomics

Volume 47, Issue 9, 20 September 2020, Pages 513-521

Original research
The human malaria parasite genome is configured into thousands of coexpressed linear regulatory units

https://doi.org/10.1016/j.jgg.2020.08.005

Abstract

The human malaria parasite Plasmodium falciparum thrives in radically different host environments in mosquitoes and humans, with only a limited set of transcription factors. The nature of regulatory elements and their target genes in the P. falciparum genome remains elusive. Here, we found that this eukaryotic parasite makes efficient use of genetic and epigenetic regulation by forming regulatory units (RUs) during blood infections. Genes located in the same RU tend to have the same pattern of expression over time and are associated with open chromatin along regulatory elements. To precisely define and quantify these RUs, a novel hidden Markov model was developed to capture the regulatory structure in a genome-wide fashion by integrating expression and epigenetic evidence. We successfully identified thousands of RUs and cross-validated them with previous findings. We found more genes involved in red blood cell (RBC) invasion located in the same RU as the PfAP2-I (AP2-I) transcription factor, demonstrating that AP2-I is responsible for regulating RBC invasion. Our study provides a regulatory mechanism for a compact eukaryotic genome and offers new insights into the in vivo transcriptional regulation of the P. falciparum intraerythrocytic stage.

Introduction

P. falciparum malaria remains one of the most devastating human parasitic diseases, killing almost half a million people annually (World Malaria Report, 2016). After infection, disease symptoms occur during intraerythrocytic development (IED). A small number of sexual gametocytes are generated during IED. Ingestion of these gametocytes by the mosquito vector is necessary for malaria transmission. With the development of next-generation sequencing, a number of studies (Otto et al., 2010; Bulger and Groudine, 2011; López-Barragán et al., 2011) have demonstrated that the transcription of P. falciparum occurs in tightly regulated cascades, both during IED and during sexual development in the mosquito host (Ay et al., 2015). Exploring the regulatory mechanisms of gene transcription is crucial for understanding the biological processes of Plasmodium because it could reveal potential drug targets for malaria treatment.

Over the past few years, accumulating evidence has shown that the transcriptional regulation mechanisms of P. falciparum are largely dependent on a combination of epigenetics and transcription factors (TFs) (Cui and Miao, 2010; Hoeijmakers et al., 2012; Ay et al., 2015). Nearly 50 different histone post-translational modifications have been identified in P. falciparum (Miao et al., 2006; Trelle et al., 2009; Treeck et al., 2011). Compared with multicellular eukaryotes, a large proportion of the P. falciparum genome is observed to be constitutively acetylated (Miao et al., 2006; Lopez-Rubio et al., 2009). Inhibiting histone acetyltransferase or deacetylase activity disrupts the regulation of gene expression and interferes with parasite growth (Chaal et al., 2010). In contrast to the broadly distributed euchromatin marks, heterochromatin modifications are mostly found in repressive clusters containing virulence genes, such as var, pfmc-2tm, rifin, and stevor (Lopez-Rubio et al., 2007; Jiang et al., 2013). For example, H3K9me3 and heterochromatin protein 1 are found only in subtelomeres and a few other genomic regions, including the majority of var genes (Lopez-Rubio et al., 2007, 2009; Salcedo-Amaya et al., 2009). H3K36me3 is also mainly present along the entire var gene body, playing a critical role in var gene repression (Jiang et al., 2013).

Broadly distributed activating epigenetic marks along the P. falciparum intergenic regions provide sufficient areas for TF binding. For example, a poised or open chromatin region with the H2A.Z histone marker at the gene 5′ and 3′ ends facilitates the access of TFs (Bártfai et al., 2010). In Plasmodium, TFs have been reported to directly regulate parasite development (Kafsack et al., 2014; Sinha et al., 2014). A notable example is PfAP2-G, a member of the ApiAP2 TF family. The expression levels of PfAP2-G correlate strongly with levels of gametocyte formation in both P. falciparum (Kafsack et al., 2014) and P. berghei (Sinha et al., 2014). Despite the importance of TFs in Plasmodium, TF-mediated transcriptional regulation remains understudied. In contrast to the comprehensive histone mark annotation produced by genome-wide Chromatin Immunoprecipitation Sequencing (ChIP-Seq) experiments (Bártfai et al., 2010; Gupta et al., 2013; Karmodiya et al., 2015), genome-wide TF binding annotation is still limited in Plasmodium genomes. To our knowledge, binding sites have been detected for only two TFs, AP2-I (Santos et al., 2017) and AP2-O (Kaneko et al., 2015), both members of the ApiAP2 TF family. Insufficient TF binding annotations have impeded the study of gene regulation networks in Plasmodium. Current understanding of cis-regulatory DNA elements on the whole-genome scale is still mainly based on assigning a gene to its nearest TF-binding motif (Campbell et al., 2010; Kafsack et al., 2014; Wang et al., 2016).

An open or relatively ‘accessible’ chromatin environment is associated with the binding of specific trans-factors. Formaldehyde-assisted isolation of regulatory elements has been used to profile chromatin accessibility in P. falciparum (Ponts et al., 2010). However, these data did not provide sufficient resolution to detect the regulatory elements. The assay for transposase-accessible chromatin using sequencing (ATAC-Seq) provides a high signal-to-noise ratio and has been widely used because of its fast and straightforward experimental protocol (Buenrostro et al., 2013).

The combination of RNA-Seq and ATAC-Seq on eight tightly synchronized stages during P. falciparum IED (Ruiz et al., 2018; Toenhake et al., 2018) provides an ideal data set to profile gene regulatory events. Because there is little evidence for enhancers in the P. falciparum genome, assigning each regulatory element to its nearest downstream gene ought to be sufficient. If head-to-head genes are located on either side of a regulatory element, the element may regulate both genes together. Furthermore, multiple adjacent regulatory elements can have similar activity during the IED and could result in similar expression patterns for multiple genes.
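The assignment rule described in this paragraph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Gene` record, the `target_genes` function, and the toy coordinates are all hypothetical, genes are assumed not to overlap, and only the two genes immediately flanking an element are considered.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene record on a single chromosome."""
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # '+' (transcribed left to right) or '-' (right to left)


def target_genes(elem_start: int, elem_end: int, genes: List[Gene]) -> List[str]:
    """Return the flanking genes whose 5' end faces the intergenic element.

    A '-' strand gene to the left and a '+' strand gene to the right are
    both transcribed away from the element, so the element lies upstream
    of each; when both are present the genes are head-to-head and both
    are returned.
    """
    left = [g for g in genes if g.end <= elem_start]
    right = [g for g in genes if g.start >= elem_end]
    targets = []
    if left:
        nearest_left = max(left, key=lambda g: g.end)
        if nearest_left.strand == '-':
            targets.append(nearest_left.name)
    if right:
        nearest_right = min(right, key=lambda g: g.start)
        if nearest_right.strand == '+':
            targets.append(nearest_right.name)
    return targets


# Toy annotation (invented coordinates, for illustration only).
GENES = [
    Gene("geneA", 100, 200, '-'),
    Gene("geneB", 400, 500, '+'),
    Gene("geneC", 700, 800, '+'),
    Gene("geneD", 1000, 1100, '-'),
]

print(target_genes(250, 300, GENES))  # element between head-to-head genes
print(target_genes(550, 600, GENES))  # element upstream of one gene only
print(target_genes(850, 900, GENES))  # element between convergent genes
```

In this sketch an element lying between two convergent genes receives no target, which is one possible reading of "nearest downstream gene"; a real pipeline would also need to handle overlapping genes and chromosome boundaries.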

In this study, we observed that coregulated genes and regulatory elements in P. falciparum tend to form units of variable sizes throughout the genome. Therefore, to obtain a deeper understanding of how transcriptional regulation is organized, a novel algorithm was developed to detect these units, which we term regulatory units (RUs). This study provides the first method for identifying the target genes of cis-regulatory elements in the P. falciparum genome. It is an important step toward exploring the transcriptional regulation mechanisms of this devastating parasite.
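To give a sense of how a hidden Markov model can segment a linear genome into units, the following Python sketch runs Viterbi decoding over a sequence of correlation values between neighboring features. It is a generic two-state illustration and should not be read as the model developed in this study: the state definitions, the Gaussian emission parameters, the transition probabilities, and the toy observations are all assumptions made for the example.

```python
import math


def log_gauss(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi(obs, log_start, log_trans, emit_params):
    """Most likely state path for `obs` under a Gaussian-emission HMM.

    log_start[s]    : log probability of starting in state s
    log_trans[p][s] : log probability of moving from state p to state s
    emit_params[s]  : (mean, sd) of the emission distribution of state s
    """
    n_states = len(log_start)
    score = [log_start[s] + log_gauss(obs[0], *emit_params[s])
             for s in range(n_states)]
    back = []  # back[t][s] = best previous state for state s at step t + 1
    for x in obs[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_gauss(x, *emit_params[s]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Assumed parameters (illustrative only).
# State 0: neighboring features are unlinked; state 1: they share a unit.
LOG_START = [math.log(0.5), math.log(0.5)]
LOG_TRANS = [[math.log(0.8), math.log(0.2)],
             [math.log(0.2), math.log(0.8)]]
EMIT = [(0.0, 0.4), (0.8, 0.2)]

# Toy correlations between consecutive features along a chromosome.
CORRELATIONS = [0.9, 0.85, 0.8, 0.1, -0.2, 0.9, 0.95]

print(viterbi(CORRELATIONS, LOG_START, LOG_TRANS, EMIT))
```

Runs of state 1 in the decoded path mark stretches of adjacent features that would be grouped into one candidate unit. The R implementation referenced under "Availability of data and materials" is the authoritative source for the model actually used.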

Section snippets

A large proportion of the P. falciparum genome is organized into coordinated units of genes and regulatory elements

Recent studies identified regions of open chromatin in P. falciparum by using ATAC-Seq (Ruiz et al., 2018; Toenhake et al., 2018). More than 4000 open chromatin peaks were identified in P. falciparum intraerythrocytic stages. The positive Pearson correlation between the gene expression level and the ATAC-Seq signals at the neighboring intergenic ATAC-Seq peak indicates that the open chromatin elements in the P. falciparum genome are involved in gene regulation (Ruiz et al., 2018;

Discussion

TFs orchestrate transcriptional regulation in eukaryotic genomes. Some TF binding occurs at clusters of DNA sequences known as ‘enhancers’. Enhancers are cis-regulatory elements and can be located up to 1 Mbp away from the target gene (Williamson et al., 2011). In mammalian genomes, researchers have shown that the bulk of the genome is organized into domains in which genes are coordinately regulated (Dixon et al., 2012; Shen et al., 2012). These coregulated domains are highly correlated

Public data sets analyzed

Whole-genome chromatin accessibility and transcription data for P. falciparum come from recently published RNA-Seq and ATAC-Seq data (Toenhake et al., 2018). Synchronized P. falciparum 3D7 parasites were sequenced at eight consecutive time points (5, 10, 15, 20, 25, 30, 35, 40 hr) during their IED. The AP2-I binding sites were obtained from the peak file of ChIP-Seq data (Santos et al., 2017). The ATAC-Seq, RNA-Seq, and ChIP-Seq data were downloaded from the Gene Expression Omnibus database:

Availability of data and materials

The R code that builds the hidden Markov model used to search for regulatory units, together with a README manual on how to use the code, has been deposited on GitHub and is available at https://github.com/CharleyWang/Regulatory_Module.

CRediT authorship contribution statement

Chengqi Wang and Rays H. Y. Jiang conceived the project. Chengqi Wang developed the method. Chengqi Wang, Justin Gibbons, and Swamy R. Adapa provided analysis. Jenna Oberstaller, Xiangyun Liao, Min Zhang, and John H. Adams provided data sources. Chengqi Wang, Justin Gibbons, and Rays H. Y. Jiang participated in writing.

Acknowledgments

We thank the National Institutes of Health R01AI117017, 5R01AI117017-02, ACS-IRG-14-189-19, NIH-NCI R35CA197731, OPP1023601, NSF 1627352, and USF New Investigator Funding to R.H.Y.J.

References (42)

  • R. Bártfai et al.

    H2A.Z demarcates intergenic regions of the Plasmodium falciparum epigenome that are dynamically marked by H3K9ac and H3K4me3

    PLoS Pathog.

    (2010)
  • Z. Bozdech et al.

    The transcriptome of the intraerythrocytic developmental cycle of Plasmodium falciparum

    PLoS Biol.

    (2003)
  • K.M. Broadbent et al.

    Strand-specific RNA sequencing in Plasmodium falciparum malaria identifies developmentally regulated long non-coding RNA and circular RNA

    BMC Genomics

    (2015)
  • J.D. Buenrostro et al.

    Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position

    Nat. Methods

    (2013)
  • E.M. Bunnik et al.

    Changes in genome organization of parasite-specific gene families during the Plasmodium transmission stages

    Nat. Commun.

    (2018)
  • T.L. Campbell et al.

    Identification and genome-wide prediction of DNA binding specificities for the ApiAP2 family of regulators from the malaria parasite

    PLoS Pathog.

    (2010)
  • B.K. Chaal et al.

    Histone deacetylases play a major role in the transcriptional regulation of the Plasmodium falciparum life cycle

    PLoS Pathog.

    (2010)
  • L. Cui et al.

    Chromatin-mediated epigenetic regulation in the malaria parasite Plasmodium falciparum

    Eukaryot. Cell

    (2010)
  • J.R. Dixon et al.

    Topological domains in mammalian genomes identified by analysis of chromatin interactions

    Nature

    (2012)
  • C. Doerig et al.

    Post-translational protein modifications in malaria parasites

    Nat. Rev. Microbiol.

    (2015)
  • B.J. Foth et al.

    Quantitative protein expression profiling reveals extensive post-transcriptional regulation and post-translational modifications in schizont-stage malaria parasites

    Genome Biol.

    (2008)