Dynamic transcriptional and chromatin accessibility landscape of medaka embryogenesis

  1. Qiang Tu1,2,3
  1. 1State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China;
  2. 2Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China;
  3. 3University of Chinese Academy of Sciences, Beijing 100049, China;
  4. 4Laboratory of Bioresources, National Institute for Basic Biology, Okazaki 444-8585, Aichi, Japan
  1. 5 These authors contributed equally to this work.

  • Corresponding authors: naruse{at}nibb.ac.jp, qtu{at}genetics.ac.cn
  • Abstract

    Medaka (Oryzias latipes) has become an important vertebrate model widely used in genetics, developmental biology, environmental sciences, and many other fields. A high-quality genome sequence and a variety of genetic tools are available for this model organism. However, existing genome annotation is still rudimentary, as it was mainly based on computational prediction and short-read RNA-seq data. Here we report a dynamic transcriptome landscape of medaka embryogenesis profiled by long-read RNA-seq, short-read RNA-seq, and ATAC-seq. By integrating these data sets, we constructed a much-improved gene model set including about 17,000 novel isoforms and identified 1600 transcription factors, 1100 long noncoding RNAs, and 150,000 potential cis-regulatory elements as well. Time-series data sets provided another dimension of information. With the expression dynamics of genes and accessibility dynamics of cis-regulatory elements, we investigated isoform switching, as well as regulatory logic between accessible elements and genes, during embryogenesis. We built a user-friendly medaka omics data portal to present these data sets. This resource provides the first comprehensive omics data sets of medaka embryogenesis. Ultimately, we term these three assays as the minimum ENCODE toolbox and propose the use of it as the initial and essential profiling genomic assays for model organisms that have limited data available. This work will be of great value for the research community using medaka as the model organism and many others as well.

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.258871.119.

    • Freely available online through the Genome Research Open Access option.

    • Received October 31, 2019.
    • Accepted June 17, 2020.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.

    | Table of Contents
    OPEN ACCESS ARTICLE

    Preprint Server