Binding specificities of human RNA-binding proteins toward structured and linear RNA sequences

  1. Jussi Taipale1,3,10
  1. Department of Medical Biochemistry and Biophysics, Karolinska Institutet, SE-171 77, Solna, Sweden;
  2. Department of Biochemistry and Molecular Biology, Mayo Clinic Graduate School of Biomedical Sciences, Mayo Clinic College of Medicine and Science, Rochester, Minnesota 55905, USA;
  3. Genome-Scale Biology Program, University of Helsinki, FI-00014, Helsinki, Finland;
  4. Department of Molecular Genetics, University of Toronto, M5S 1A8, Toronto, Canada;
  5. European Molecular Biology Laboratory (EMBL), Hamburg Unit c/o DESY, D-22603 Hamburg, Germany;
  6. Donnelly Centre, University of Toronto, M5S 3E1, Toronto, Canada;
  7. Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, M5S 3G4, Toronto, Canada;
  8. Department of Computer Science, University of Toronto, M5S 2E4, Toronto, Canada;
  9. Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA;
  10. Department of Biochemistry, University of Cambridge, CB2 1QW, Cambridge, United Kingdom
  11. These authors contributed equally to this work.

  • Corresponding author: ajt208{at}cam.ac.uk
  • Abstract

    RNA-binding proteins (RBPs) regulate RNA metabolism at multiple levels by affecting splicing of nascent transcripts, RNA folding, base modification, transport, localization, translation, and stability. Despite their central role in RNA function, the RNA-binding specificities of most RBPs remain unknown or incompletely defined. To address this, we have assembled a genome-scale collection of RBPs and their RNA-binding domains (RBDs) and assessed their specificities using high-throughput RNA-SELEX (HTR-SELEX). Approximately 70% of RBPs for which we obtained a motif bound to short linear sequences, whereas ∼30% preferred structured motifs folding into stem–loops. We also found that many RBPs can bind to multiple, distinctly different motifs. Analysis of the matches of the motifs in human genomic sequences suggested novel roles for many RBPs. We found that three cytoplasmic proteins (ZC3H12A, ZC3H12B, and ZC3H12C) bound to motifs resembling the splice donor sequence, suggesting that these proteins are involved in degradation of cytoplasmic viral and/or unspliced transcripts. Structural analysis revealed that the RNA motif was not bound by the conventional C3H1 RNA-binding domain of ZC3H12B. Instead, the RNA motif was bound by the PilT N terminus (PIN) RNase domain of ZC3H12B, revealing a potential mechanism by which unconventional RBDs containing active sites or molecule-binding pockets could interact with short, structured RNA molecules. Our collection, containing 145 high-resolution binding specificity models for 86 RBPs, is the largest systematic resource for the analysis of human RBPs and will greatly facilitate future analysis of the various biological roles of this important class of proteins.

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.258848.119.

    • Freely available online through the Genome Research Open Access option.

    • Received November 7, 2019.
    • Accepted June 23, 2020.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
