Current journal: Genome Biology
  • The Tetracentron genome provides insight into the early evolution of eudicots and the formation of vessel elements
    Genome Biol. (IF 10.806) Pub Date : 2020-12-02
    Ping-Li Liu; Xi Zhang; Jian-Feng Mao; Yan-Ming Hong; Ren-Gang Zhang; Yilan E; Shuai Nie; Kaihua Jia; Chen-Kun Jiang; Jian He; Weiwei Shen; Qizouhong He; Wenqing Zheng; Samar Abbas; Pawan Kumar Jewaria; Xuechan Tian; Chang-jun Liu; Xiaomei Jiang; Yafang Yin; Bo Liu; Li Wang; Biao Jin; Yongpeng Ma; Zongbo Qiu; František Baluška; Jozef Šamaj; Xinqiang He; Shihui Niu; Jianbo Xie; Lei Xie; Huimin Xu; Hongzhi

    Tetracentron sinense is an endemic and endangered deciduous tree. It belongs to the Trochodendrales, one of four early diverging lineages of eudicots known for having vesselless secondary wood. Sequencing and resequencing of the T. sinense genome will help us understand eudicot evolution, the genetic basis of tracheary element development, and the genetic diversity of this relict species. Here, we

  • Amplification-free long-read sequencing reveals unforeseen CRISPR-Cas9 off-target activity
    Genome Biol. (IF 10.806) Pub Date : 2020-12-01
    Ida Höijer; Josefin Johansson; Sanna Gudmundsson; Chen-Shan Chin; Ignas Bunikis; Susana Häggqvist; Anastasia Emmanouilidou; Maria Wilbe; Marcel den Hoed; Marie-Louise Bondeson; Lars Feuk; Ulf Gyllensten; Adam Ameur

    One ongoing concern about CRISPR-Cas9 genome editing is that unspecific guide RNA (gRNA) binding may induce off-target mutations. However, accurate prediction of CRISPR-Cas9 off-target activity is challenging. Here, we present SMRT-OTS and Nano-OTS, two novel, amplification-free, long-read sequencing protocols for detection of gRNA-driven digestion of genomic DNA by Cas9 in vitro. The methods are assessed

  • Engineering crops of the future: CRISPR approaches to develop climate-resilient and disease-resistant plants
    Genome Biol. (IF 10.806) Pub Date : 2020-11-30
    Syed Shan-e-Ali Zaidi; Ahmed Mahas; Hervé Vanderschuren; Magdy M. Mahfouz

    To meet increasing global food demand, breeders and scientists aim to improve the yield and quality of major food crops. Plant diseases threaten food security and are expected to increase because of climate change. CRISPR genome-editing technology opens new opportunities to engineer disease resistance traits. With precise genome engineering and transgene-free applications, CRISPR is expected to resolve

  • Human A-to-I RNA editing SNP loci are enriched in GWAS signals for autoimmune diseases and under balancing selection
    Genome Biol. (IF 10.806) Pub Date : 2020-11-30
    Hui Zhang; Qiang Fu; Xinrui Shi; Ziqing Pan; Wenbing Yang; Zichao Huang; Tian Tang; Xionglei He; Rui Zhang

    Adenosine-to-inosine (A-to-I) RNA editing plays important roles in diversifying the transcriptome and preventing MDA5 sensing of endogenous dsRNA as nonself. To date, few studies have investigated the population genomic signatures of A-to-I editing due to the lack of editing sites overlapping with SNPs. In this study, we applied a pipeline to robustly identify SNP editing sites from population transcriptomic

  • DNA polymerase epsilon is required for heterochromatin maintenance in Arabidopsis
    Genome Biol. (IF 10.806) Pub Date : 2020-11-25
    Pierre Bourguet; Leticia López-González; Ángeles Gómez-Zambrano; Thierry Pélissier; Amy Hesketh; Magdalena E. Potok; Marie-Noëlle Pouch-Pélissier; Magali Perez; Olivier Da Ines; David Latrasse; Charles I. White; Steven E. Jacobsen; Moussa Benhamed; Olivier Mathieu

    Chromatin organizes DNA and regulates its transcriptional activity through epigenetic modifications. Heterochromatic regions of the genome are generally transcriptionally silent, while euchromatin is more prone to transcription. During DNA replication, both genetic information and chromatin modifications must be faithfully passed on to daughter strands. There is evidence that DNA polymerases play a

  • From FAANG to fork: application of highly annotated genomes to improve farmed animal production
    Genome Biol. (IF 10.806) Pub Date : 2020-11-24
    Emily L. Clark; Alan L. Archibald; Hans D. Daetwyler; Martien A. M. Groenen; Peter W. Harrison; Ross D. Houston; Christa Kühn; Sigbjørn Lien; Daniel J. Macqueen; James M. Reecy; Diego Robledo; Mick Watson; Christopher K. Tuggle; Elisabetta Giuffra

    The Food and Agriculture Organisation of the United Nations (FAO) reports that by the year 2050 the global human population is likely to reach 9.7 billion, rising to 11.2 billion by 2100 (https://population.un.org/wpp/Publications/Files/Key_Findings_WPP_2015.pdf). This population growth poses several challenges to the global food system, which will need to produce more healthy food using fewer natural

  • The evolution of relapse of adult T cell acute lymphoblastic leukemia
    Genome Biol. (IF 10.806) Pub Date : 2020-11-23
    Inés Sentís; Santiago Gonzalez; Eulalia Genescà; Violeta García-Hernández; Ferran Muiños; Celia Gonzalez; Erika López-Arribillaga; Jessica Gonzalez; Lierni Fernandez-Ibarrondo; Loris Mularoni; Lluís Espinosa; Beatriz Bellosillo; Josep-Maria Ribera; Anna Bigas; Abel Gonzalez-Perez; Nuria Lopez-Bigas

    Adult T cell acute lymphoblastic leukemia (T-ALL) is a rare disease that affects fewer than 10 individuals in one million. It has been less studied than its cognate pediatric malignancy, which is more prevalent. A higher percentage of the adult patients relapse, compared to children. It is thus essential to study the mechanisms of relapse of adult T-ALL cases. We profile whole-genome somatic mutations

  • A pitfall for machine learning methods aiming to predict across cell types
    Genome Biol. (IF 10.806) Pub Date : 2020-11-19
    Jacob Schreiber; Ritambhara Singh; Jeffrey Bilmes; William Stafford Noble

    Machine learning models that predict genomic activity are most useful when they make accurate predictions across cell types. Here, we show that when the training and test sets contain the same genomic loci, the resulting model may falsely appear to perform well by effectively memorizing the average activity associated with each locus across the training cell types. We demonstrate this phenomenon in

  • Deep sequencing reveals a DAP1 regulatory haplotype that potentiates autoimmunity in systemic lupus erythematosus
    Genome Biol. (IF 10.806) Pub Date : 2020-11-19
    Prithvi Raj; Ran Song; Honglin Zhu; Linley Riediger; Dong-Jae Jun; Chaoying Liang; Carlos Arana; Bo Zhang; Yajing Gao; Benjamin E. Wakeland; Igor Dozmorov; Jinchun Zhou; Jennifer A. Kelly; Bernard R. Lauwerys; Joel M. Guthridge; Nancy J. Olsen; Swapan K. Nath; Chandrashekhar Pasare; Nicolai van Oers; Gary Gilkeson; Betty P. Tsao; Patrick M. Gaffney; Peter K. Gregersen; Judith A. James; Xiaoxia Zuo;

    Systemic lupus erythematosus (SLE) is a clinically heterogeneous autoimmune disease characterized by the development of anti-nuclear antibodies. Susceptibility to SLE is multifactorial, with a combination of genetic and environmental risk factors contributing to disease development. Like other polygenic diseases, a significant proportion of estimated SLE heritability is not accounted for by common

  • Stairway Plot 2: demographic history inference with folded SNP frequency spectra
    Genome Biol. (IF 10.806) Pub Date : 2020-11-17
    Xiaoming Liu; Yun-Xin Fu

    Inferring the demographic histories of populations has wide applications in population, ecological, and conservation genomics. We present Stairway Plot 2, a cross-platform program package for this task using SNP frequency spectra. It is based on a nonparametric method with the capability of handling folded SNP frequency spectra (that is, when the ancestral alleles of the SNPs are unknown) of thousands

  • A versatile toolkit for CRISPR-Cas13-based RNA manipulation in Drosophila
    Genome Biol. (IF 10.806) Pub Date : 2020-11-17
    Nhan Huynh; Noah Depner; Raegan Larson; Kirst King-Jones

    Advances in CRISPR technology have immensely improved our ability to manipulate nucleic acids, and the recent discovery of the RNA-targeting endonuclease Cas13 adds even further functionality. Here, we show that Cas13 works efficiently in Drosophila, both ex vivo and in vivo. We test 44 different Cas13 variants to identify enzymes with the best overall performance and show that Cas13 could target endogenous

  • Navigating the crowd: visualizing coordination between genome dynamics, structure, and transcription
    Genome Biol. (IF 10.806) Pub Date : 2020-11-17
    Haitham A. Shaban; Roman Barth; Kerstin Bystricky

    The eukaryotic genome is hierarchically structured yet highly dynamic. Regulating transcription in this environment demands a high level of coordination to permit many proteins to interact with chromatin fiber at appropriate sites in a timely manner. We describe how recent advances in quantitative imaging techniques overcome caveats of sequencing-based methods (Hi-C and related) by enabling direct

  • A DNA methylation state transition model reveals the programmed epigenetic heterogeneity in human pre-implantation embryos
    Genome Biol. (IF 10.806) Pub Date : 2020-11-16
    Chengchen Zhao; Naiqian Zhang; Yalin Zhang; Nuermaimaiti Tuersunjiang; Shaorong Gao; Wenqiang Liu; Yong Zhang

    During mammalian early embryogenesis, expression and epigenetic heterogeneity emerge before the first cell fate determination, but the programs causing such determinate heterogeneity are largely unexplored. Here, we present MethylTransition, a novel DNA methylation state transition model, for characterizing methylation changes during one or a few cell cycles at single-cell resolution. MethylTransition

  • Do malignant cells sleep at night?
    Genome Biol. (IF 10.806) Pub Date : 2020-11-12
    Luis Enrique Cortés-Hernández; Zahra Eslami-S; Antoine M. Dujon; Mathieu Giraudeau; Beata Ujvari; Frédéric Thomas; Catherine Alix-Panabières

    Biological rhythms regulate the biology of most, if not all living creatures, from whole organisms to their constitutive cells, their microbiota, and also parasites. Here, we present the hypothesis that internal and external ecological variations induced by biological cycles also influence or are exploited by cancer cells, especially by circulating tumor cells, the key players in the metastatic cascade

  • Massive gene presence-absence variation shapes an open pan-genome in the Mediterranean mussel
    Genome Biol. (IF 10.806) Pub Date : 2020-11-10
    Marco Gerdol; Rebeca Moreira; Fernando Cruz; Jessica Gómez-Garrido; Anna Vlasova; Umberto Rosani; Paola Venier; Miguel A. Naranjo-Ortiz; Maria Murgarella; Samuele Greco; Pablo Balseiro; André Corvelo; Leonor Frias; Marta Gut; Toni Gabaldón; Alberto Pallavicini; Carlos Canchaya; Beatriz Novoa; Tyler S. Alioto; David Posada; Antonio Figueras

    The Mediterranean mussel Mytilus galloprovincialis is an ecologically and economically relevant edible marine bivalve, highly invasive and resilient to biotic and abiotic stressors causing recurrent massive mortalities in other bivalves. Although these traits have been recently linked with the maintenance of a high genetic variation within natural populations, the factors underlying the evolutionary

  • SVFX: a machine learning framework to quantify the pathogenicity of structural variants
    Genome Biol. (IF 10.806) Pub Date : 2020-11-09
    Sushant Kumar; Arif Harmanci; Jagath Vytheeswaran; Mark B. Gerstein

    There is a lack of approaches for identifying pathogenic genomic structural variants (SVs) although they play a crucial role in many diseases. We present a mechanism-agnostic machine learning-based workflow, called SVFX, to assign pathogenicity scores to somatic and germline SVs. In particular, we generate somatic and germline training models, which include genomic, epigenomic, and conservation-based

  • Pathway information extracted from 25 years of pathway figures
    Genome Biol. (IF 10.806) Pub Date : 2020-11-09
    Kristina Hanspers; Anders Riutta; Martina Summer-Kutmon; Alexander R. Pico

    Thousands of pathway diagrams are published each year as static figures inaccessible to computational queries and analyses. Using a combination of machine learning, optical character recognition, and manual curation, we identified 64,643 pathway figures published between 1995 and 2019 and extracted 1,112,551 instances of human genes, comprising 13,464 unique NCBI genes, participating in a wide variety

  • Fission yeast condensin contributes to interphase chromatin organization and prevents transcription-coupled DNA damage
    Genome Biol. (IF 10.806) Pub Date : 2020-11-05
    Yasutaka Kakui; Christopher Barrington; David J. Barry; Tereza Gerguri; Xiao Fu; Paul A. Bates; Bhavin S. Khatri; Frank Uhlmann

    Structural maintenance of chromosomes (SMC) complexes are central organizers of chromatin architecture throughout the cell cycle. The SMC family member condensin is best known for establishing long-range chromatin interactions in mitosis. These compact chromatin and create mechanically stable chromosomes. How condensin contributes to chromatin organization in interphase is less well understood. Here

  • Multiomics profiling of primary lung cancers and distant metastases reveals immunosuppression as a common characteristic of tumor cells with metastatic plasticity
    Genome Biol. (IF 10.806) Pub Date : 2020-11-04
    Won-Chul Lee; Alexandre Reuben; Xin Hu; Nicholas McGranahan; Runzhe Chen; Ali Jalali; Marcelo V. Negrao; Shawna M. Hubert; Chad Tang; Chia-Chin Wu; Anthony San Lucas; Whijae Roh; Kenichi Suda; Jihye Kim; Aik-Choon Tan; David H. Peng; Wei Lu; Ximing Tang; Chi-Wan Chow; Junya Fujimoto; Carmen Behrens; Neda Kalhor; Kazutaka Fukumura; Marcus Coyle; Rebecca Thornton; Curtis Gumbs; Jun Li; Chang-Jiun Wu;

    Metastasis is the primary cause of cancer mortality accounting for 90% of cancer deaths. Our understanding of the molecular mechanisms driving metastasis is rudimentary. We perform whole exome sequencing (WES), RNA sequencing, methylation microarray, and immunohistochemistry (IHC) on 8 pairs of non-small cell lung cancer (NSCLC) primary tumors and matched distant metastases. Furthermore, we analyze

  • Publisher Correction: Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics
    Genome Biol. (IF 10.806) Pub Date : 2020-11-03
    Kwangbom Choi; Yang Chen; Daniel A. Skelly; Gary A. Churchill

    An amendment to this paper has been published and can be accessed via the original article.

  • Cis-acting lnc-eRNA SEELA directly binds histone H4 to promote histone recognition and leukemia progression
    Genome Biol. (IF 10.806) Pub Date : 2020-11-03
    Ke Fang; Wei Huang; Yu-Meng Sun; Tian-Qi Chen; Zhan-Cheng Zeng; Qian-Qian Yang; Qi Pan; Cai Han; Lin-Yu Sun; Xue-Qun Luo; Wen-Tao Wang; Yue-Qin Chen

    Long noncoding enhancer RNAs (lnc-eRNAs) are a subset of stable eRNAs identified from annotated lncRNAs. They might act as enhancer activity-related therapeutic targets in cancer. However, the underlying mechanism of epigenetic activation and their function in cancer initiation and progression remain largely unknown. We identify a set of lncRNAs as lnc-eRNAs according to the epigenetic signatures of

  • RNA editing in cancer impacts mRNA abundance in immune response pathways
    Genome Biol. (IF 10.806) Pub Date : 2020-10-26
    Tracey W. Chan; Ting Fu; Jae Hoon Bahn; Hyun-Ik Jun; Jae-Hyung Lee; Giovanni Quinones-Valdez; Chonghui Cheng; Xinshu Xiao

    RNA editing generates modifications to the RNA sequences, thereby increasing protein diversity and shaping various layers of gene regulation. Recent studies have revealed global shifts in editing levels across many cancer types, as well as a few specific mechanisms implicating individual sites in tumorigenesis or metastasis. However, most tumor-associated sites, predominantly in noncoding regions,

  • Characterization of an eutherian gene cluster generated after transposon domestication identifies Bex3 as relevant for advanced neurological functions
    Genome Biol. (IF 10.806) Pub Date : 2020-10-26
    Enrique Navas-Pérez; Cristina Vicente-García; Serena Mirra; Demian Burguera; Noèlia Fernàndez-Castillo; José Luis Ferrán; Macarena López-Mayorga; Marta Alaiz-Noya; Irene Suárez-Pereira; Ester Antón-Galindo; Fausto Ulloa; Carlos Herrera-Úbeda; Pol Cuscó; Rafael Falcón-Moya; Antonio Rodríguez-Moreno; Salvatore D’Aniello; Bru Cormand; Gemma Marfany; Eduardo Soriano; Ángel M. Carrión; Jaime J. Carvajal;

    One of the most unusual sources of phylogenetically restricted genes is the molecular domestication of transposable elements into a host genome as functional genes. Although these kinds of events are sometimes at the core of key macroevolutionary changes, their origin and organismal function are generally poorly understood. Here, we identify several previously unreported transposable element domestication

  • High throughput single-cell detection of multiplex CRISPR-edited gene modifications
    Genome Biol. (IF 10.806) Pub Date : 2020-10-20
    Elisa ten Hacken; Kendell Clement; Shuqiang Li; María Hernández-Sánchez; Robert Redd; Shu Wang; David Ruff; Michaela Gruber; Kaitlyn Baranowski; Jose Jacob; James Flynn; Keith W. Jones; Donna Neuberg; Kenneth J. Livak; Luca Pinello; Catherine J. Wu

    CRISPR-Cas9 gene editing has transformed our ability to rapidly interrogate the functional impact of somatic mutations in human cancers. Droplet-based technology enables the analysis of Cas9-introduced gene edits in thousands of single cells. Using this technology, we analyze Ba/F3 cells engineered to express single or multiplexed loss-of-function mutations recurrent in chronic lymphocytic leukemia

  • The design and construction of reference pangenome graphs with minigraph
    Genome Biol. (IF 10.806) Pub Date : 2020-10-16
    Heng Li; Xiaowen Feng; Chong Chu

    The recent advances in sequencing technologies enable the assembly of individual genomes to the quality of the reference genome. How to integrate multiple genomes from the same species and make the integrated representation accessible to biologists remains an open challenge. Here, we propose a graph-based data model and associated formats to represent multiple genomes while preserving the coordinate

  • Multiomics kaleidoscope to visualize cancer hallmarks
    Genome Biol. (IF 10.806) Pub Date : 2020-10-15
    Shengtao Zhou

    Today, we have entered a data-explosive realm, which requires us to have a rational and clear viewpoint to visualize the underlying contour of a spectrum of events, especially life sciences. Cancer biology, as an important branch of life sciences, also experiences an explosion of data and related molecular characterization. Our evolving understanding of cancer hallmarks derives from rapid progress

  • Clonal tracing reveals diverse patterns of response to immune checkpoint blockade
    Genome Biol. (IF 10.806) Pub Date : 2020-10-15
    Shengqing Stan Gu; Xiaoqing Wang; Xihao Hu; Peng Jiang; Ziyi Li; Nicole Traugh; Xia Bu; Qin Tang; Chenfei Wang; Zexian Zeng; Jingxin Fu; Cliff Meyer; Yi Zhang; Paloma Cejas; Klothilda Lim; Jin Wang; Wubing Zhang; Collin Tokheim; Avinash Das Sahu; Xiaofang Xing; Benjamin Kroger; Zhangyi Ouyang; Henry Long; Gordon J. Freeman; Myles Brown; X. Shirley Liu

    Immune checkpoint blockade (ICB) therapy has improved patient survival in a variety of cancers, but only a minority of cancer patients respond. Multiple studies have sought to identify general biomarkers of ICB response, but elucidating the molecular and cellular drivers of resistance for individual tumors remains challenging. We sought to determine whether a tumor with defined genetic background exhibits

  • Multiplex enCas12a screens detect functional buffering among paralogs otherwise masked in monogenic Cas9 knockout screens
    Genome Biol. (IF 10.806) Pub Date : 2020-10-15
    Merve Dede; Megan McLaughlin; Eiru Kim; Traver Hart

    Pooled library CRISPR/Cas9 knockout screening across hundreds of cell lines has identified genes whose disruption leads to fitness defects, a critical step in identifying candidate cancer targets. However, the number of essential genes detected from these monogenic knockout screens is low compared to the number of constitutively expressed genes in a cell. Through a systematic analysis of screen data

  • iMOKA: k-mer based software to analyze large collections of sequencing data
    Genome Biol. (IF 10.806) Pub Date : 2020-10-13
    Claudio Lorenzi; Sylvain Barriere; Jean-Philippe Villemin; Laureline Dejardin Bretones; Alban Mancheron; William Ritchie

    iMOKA (interactive multi-objective k-mer analysis) is software that enables comprehensive analysis of sequencing data from large cohorts to generate robust classification models or explore specific genetic elements associated with disease etiology. iMOKA uses a fast and accurate feature reduction step that combines a Naïve Bayes classifier augmented by an adaptive entropy filter and a graph-based

  • An integrated peach genome structural variation map uncovers genes associated with fruit traits
    Genome Biol. (IF 10.806) Pub Date : 2020-10-06
    Jian Guo; Ke Cao; Cecilia Deng; Yong Li; Gengrui Zhu; Weichao Fang; Changwen Chen; Xinwei Wang; Jinlong Wu; Liping Guan; Shan Wu; Wenwu Guo; Jia-Long Yao; Zhangjun Fei; Lirong Wang

    Genome structural variations (SVs) have been associated with key traits in a wide range of agronomically important species; however, SV profiles of peach and their functional impacts remain largely unexplored. Here, we present an integrated map of 202,273 SVs from 336 peach genomes. A substantial number of SVs have been selected during peach domestication and improvement, which together affect 2268

  • Prime editing efficiently generates W542L and S621I double mutations in two ALS genes in maize
    Genome Biol. (IF 10.806) Pub Date : 2020-10-06
    Yuan-Yuan Jiang; Yi-Ping Chai; Min-Hui Lu; Xiu-Li Han; Qiupeng Lin; Yu Zhang; Qiang Zhang; Yun Zhou; Xue-Chen Wang; Caixia Gao; Qi-Jun Chen

    Prime editing is a novel and universal CRISPR/Cas-derived precision genome-editing technology that has been recently developed. However, low efficiency of prime editing has been shown in transgenic rice lines. We hypothesize that enhancing pegRNA expression could improve prime-editing efficiency. In this report, we describe two strategies for enhancing pegRNA expression. We construct a prime editing

  • AlphaBeta: computational inference of epimutation rates and spectra from high-throughput DNA methylation data in plants
    Genome Biol. (IF 10.806) Pub Date : 2020-10-06
    Yadollah Shahryary; Aikaterini Symeonidi; Rashmi R. Hazarika; Johanna Denkena; Talha Mubeen; Brigitte Hofmeister; Thomas van Gurp; Maria Colomé-Tatché; Koen J.F. Verhoeven; Gerald Tuskan; Robert J. Schmitz; Frank Johannes

    Stochastic changes in DNA methylation (i.e., spontaneous epimutations) contribute to methylome diversity in plants. Here, we describe AlphaBeta, a computational method for estimating the precise rate of such stochastic events using pedigree-based DNA methylation data as input. We demonstrate how AlphaBeta can be employed to study transgenerationally heritable epimutations in clonal or sexually derived

  • A genome assembly and the somatic genetic and epigenetic mutation rate in a wild long-lived perennial Populus trichocarpa
    Genome Biol. (IF 10.806) Pub Date : 2020-10-06
    Brigitte T. Hofmeister; Johanna Denkena; Maria Colomé-Tatché; Yadollah Shahryary; Rashmi Hazarika; Jane Grimwood; Sujan Mamidi; Jerry Jenkins; Paul P. Grabowski; Avinash Sreedasyam; Shengqiang Shu; Kerrie Barry; Kathleen Lail; Catherine Adam; Anna Lipzen; Rotem Sorek; Dave Kudrna; Jayson Talag; Rod Wing; David W. Hall; Daniel Jacobsen; Gerald A. Tuskan; Jeremy Schmutz; Frank Johannes; Robert J. Schmitz

    Plants can transmit somatic mutations and epimutations to offspring, which in turn can affect fitness. Knowledge of the rate at which these variations arise is necessary to understand how plant development contributes to local adaption in an ecoevolutionary context, particularly in long-lived perennials. Here, we generate a new high-quality reference genome from the oldest branch of a wild Populus

  • Mustache: multi-scale detection of chromatin loops from Hi-C and Micro-C maps using scale-space representation
    Genome Biol. (IF 10.806) Pub Date : 2020-09-30
    Abbas Roayaei Ardakany; Halil Tuvan Gezer; Stefano Lonardi; Ferhat Ay

    We present Mustache, a new method for multi-scale detection of chromatin loops from Hi-C and Micro-C contact maps. Mustache employs scale-space theory, a technical advance in computer vision, to detect blob-shaped objects in contact maps. Mustache is scalable to kilobase-resolution maps and reports loops that are highly consistent between replicates and between Hi-C and Micro-C datasets. Compared to

  • Tissue-specific usage of transposable element-derived promoters in mouse development
    Genome Biol. (IF 10.806) Pub Date : 2020-09-28
    Benpeng Miao; Shuhua Fu; Cheng Lyu; Paul Gontarz; Ting Wang; Bo Zhang

    Transposable elements (TEs) are a significant component of eukaryotic genomes and play essential roles in genome evolution. Mounting evidence indicates that TEs are highly transcribed in early embryo development and contribute to distinct biological functions and tissue morphology. We examine the epigenetic dynamics of mouse TEs during the development of five tissues: intestine, liver, lung, stomach

  • A systematic comparison of chloroplast genome assembly tools
    Genome Biol. (IF 10.806) Pub Date : 2020-09-28
    Jan A. Freudenthal; Simon Pfaff; Niklas Terhoeven; Arthur Korte; Markus J. Ankenbrand; Frank Förster

    Chloroplasts are intracellular organelles that enable plants to conduct photosynthesis. They arose through the symbiotic integration of a prokaryotic cell into a eukaryotic host cell and still contain their own genomes with distinct genomic information. Plastid genomes accommodate essential genes and are regularly utilized in biotechnology or phylogenetics. Different assemblers that are able to assess

  • GraphAligner: rapid and versatile sequence-to-graph alignment
    Genome Biol. (IF 10.806) Pub Date : 2020-09-24
    Mikko Rautiainen; Tobias Marschall

    Genome graphs can represent genetic variation and sequence uncertainty. Aligning sequences to genome graphs is key to many applications, including error correction, genome assembly, and genotyping of variants in a pangenome graph. Yet, so far, this step is often prohibitively slow. We present GraphAligner, a tool for aligning long reads to genome graphs. Compared to the state-of-the-art tools, GraphAligner

  • Haplotype threading: accurate polyploid phasing from long reads
    Genome Biol. (IF 10.806) Pub Date : 2020-09-21
    Sven D Schrinner; Rebecca Serra Mari; Jana Ebler; Mikko Rautiainen; Lancelot Seillier; Julia J Reimer; Björn Usadel; Tobias Marschall; Gunnar W Klau

    Resolving genomes at haplotype level is crucial for understanding the evolutionary history of polyploid species and for designing advanced breeding strategies. Polyploid phasing still presents considerable challenges, especially in regions of collapsing haplotypes. We present WhatsHap polyphase, a novel two-stage approach that addresses these challenges by (i) clustering reads and (ii) threading the

  • Chromatin regulates expression of small RNAs to help maintain transposon methylome homeostasis in Arabidopsis
    Genome Biol. (IF 10.806) Pub Date : 2020-09-17
    Ranjith K Papareddy; Katalin Páldi; Subramanian Paulraj; Ping Kao; Stefan Lutzmayer; Michael D Nodine

    Eukaryotic genomes are partitioned into euchromatic and heterochromatic domains to regulate gene expression and other fundamental cellular processes. However, chromatin is dynamic during growth and development and must be properly re-established after its decondensation. Small interfering RNAs (siRNAs) promote heterochromatin formation, but little is known about how chromatin regulates siRNA expression

  • Removing reference bias and improving indel calling in ancient DNA data analysis by mapping to a sequence variation graph
    Genome Biol. (IF 10.806) Pub Date : 2020-09-17
    Rui Martiniano; Erik Garrison; Eppie R Jones; Andrea Manica; Richard Durbin

    During the last decade, the analysis of ancient DNA (aDNA) sequence has become a powerful tool for the study of past human populations. However, the degraded nature of aDNA means that aDNA molecules are short and frequently mutated by post-mortem chemical modifications. These features decrease read mapping accuracy and increase reference bias, in which reads containing non-reference alleles are less

  • Bifrost: highly parallel construction and indexing of colored and compacted de Bruijn graphs
    Genome Biol. (IF 10.806) Pub Date : 2020-09-17
    Guillaume Holley; Páll Melsted

    Memory consumption of de Bruijn graphs is often prohibitive. Most de Bruijn graph-based assemblers reduce the complexity by compacting paths into single vertices, but this is challenging as it requires the uncompacted de Bruijn graph to be available in memory. We present a parallel and memory-efficient algorithm enabling the direct construction of the compacted de Bruijn graph without producing the

  • Ultrasensitive deletion detection links mitochondrial DNA replication, disease, and aging
    Genome Biol. (IF 10.806) Pub Date : 2020-09-17
    Scott A Lujan; Matthew J Longley; Margaret H Humble; Christopher A Lavender; Adam Burkholder; Emma L Blakely; Charlotte L Alston; Grainne S Gorman; Doug M Turnbull; Robert McFarland; Robert W Taylor; Thomas A Kunkel; William C Copeland

    Acquired human mitochondrial genome (mtDNA) deletions are symptoms and drivers of focal mitochondrial respiratory deficiency, a pathological hallmark of aging and late-onset mitochondrial disease. To decipher connections between these processes, we create LostArc, an ultrasensitive method for quantifying deletions in circular mtDNA molecules. LostArc reveals 35 million deletions (~470,000 unique spans)

  • Cancer-specific CTCF binding facilitates oncogenic transcriptional dysregulation
    Genome Biol. (IF 10.806) Pub Date : 2020-09-15
    Celestia Fang; Zhenjia Wang; Cuijuan Han; Stephanie L Safgren; Kathryn A Helmin; Emmalee R Adelman; Valentina Serafin; Giuseppe Basso; Kyle P Eagen; Alexandre Gaspar-Maia; Maria E Figueroa; Benjamin D Singer; Aakrosh Ratan; Panagiotis Ntziachristos; Chongzhi Zang

    The three-dimensional genome organization is critical for gene regulation and can malfunction in diseases like cancer. As a key regulator of genome organization, CCCTC-binding factor (CTCF) has been characterized as a DNA-binding protein with important functions in maintaining the topological structure of chromatin and inducing DNA looping. Among the prolific binding sites in the genome, several events

  • AuthentiCT: a model of ancient DNA damage to estimate the proportion of present-day DNA contamination
    Genome Biol. (IF 10.806) Pub Date : 2020-09-15
    Stéphane Peyrégne; Benjamin M Peter

    Contamination from present-day DNA is a fundamental issue when studying ancient DNA from historical or archaeological material, and quantifying the amount of contamination is essential for downstream analyses. We present AuthentiCT, a command-line tool to estimate the proportion of present-day DNA contamination in ancient DNA datasets generated from single-stranded DNA libraries. The prediction is

  • Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies
    Genome Biol. (IF 10.806) Pub Date : 2020-09-14
    Arang Rhie; Brian P Walenz; Sergey Koren; Adam M Phillippy

    Recent long-read assemblies often exceed the quality and completeness of available reference genomes, making validation challenging. Here we present Merqury, a novel tool for reference-free assembly evaluation based on efficient k-mer set operations. By comparing k-mers in a de novo assembly to those found in unassembled high-accuracy reads, Merqury estimates base-level accuracy and completeness. For

  • Primo: integration of multiple GWAS and omics QTL summary statistics for elucidation of molecular mechanisms of trait-associated SNPs and detection of pleiotropy in complex traits.
    Genome Biol. (IF 10.806) Pub Date : 2020-09-11
    Kevin J Gleason,Fan Yang,Brandon L Pierce,Xin He,Lin S Chen

    To provide a comprehensive mechanistic interpretation of how known trait-associated SNPs affect complex traits, we propose a method, Primo, for integrative analysis of GWAS summary statistics with multiple sets of omics QTL summary statistics from different cellular conditions or studies. Primo examines association patterns of SNPs to complex and omics traits. In gene regions harboring known susceptibility

  • sn-spMF: matrix factorization informs tissue-specific genetic regulation of gene expression.
    Genome Biol. (IF 10.806) Pub Date : 2020-09-11
    Yuan He,Surya B Chhetri,Marios Arvanitis,Kaushik Srinivasan,François Aguet,Kristin G Ardlie,Alvaro N Barbeira,Rodrigo Bonazzola,Hae Kyung Im,Christopher D Brown,Alexis Battle

    Genetic regulation of gene expression, revealed by expression quantitative trait loci (eQTLs), exhibits complex patterns of tissue-specific effects. Characterization of these patterns may allow us to better understand mechanisms of gene regulation and disease etiology. We develop a constrained matrix factorization model, sn-spMF, to learn patterns of tissue-sharing and apply it to 49 human tissues

  • A vast resource of allelic expression data spanning human tissues.
    Genome Biol. (IF 10.806) Pub Date : 2020-09-11
    Stephane E Castel,François Aguet,Pejman Mohammadi,Kristin G Ardlie,Tuuli Lappalainen

    Allele expression (AE) analysis robustly measures cis-regulatory effects. Here, we present and demonstrate the utility of a vast AE resource generated from the GTEx v8 release, containing 15,253 samples spanning 54 human tissues for a total of 431 million measurements of AE at the SNP level and 153 million measurements at the haplotype level. In addition, we develop an extension of our tool phASER

  • Impact of admixture and ancestry on eQTL analysis and GWAS colocalization in GTEx.
    Genome Biol. (IF 10.806) Pub Date : 2020-09-11
    Nicole R Gay,Michael Gloudemans,Margaret L Antonio,Nathan S Abell,Brunilda Balliu,YoSon Park,Alicia R Martin,Shaila Musharoff,Abhiram S Rao,François Aguet,Alvaro N Barbeira,Rodrigo Bonazzola,Farhad Hormozdiari,Kristin G Ardlie,Christopher D Brown,Hae Kyung Im,Tuuli Lappalainen,Xiaoquan Wen,Stephen B Montgomery

    Population structure among study subjects may confound genetic association studies, and lack of proper correction can lead to spurious findings. The Genotype-Tissue Expression (GTEx) project largely contains individuals of European ancestry, but the v8 release also includes up to 15% of individuals of non-European ancestry. Assessing ancestry-based adjustments in GTEx improves portability of this research

  • PTWAS: investigating tissue-relevant causal molecular mechanisms of complex traits using probabilistic TWAS analysis.
    Genome Biol. (IF 10.806) Pub Date : 2020-09-11
    Yuhua Zhang,Corbin Quick,Ketian Yu,Alvaro Barbeira,Francesca Luca,Roger Pique-Regi,Hae Kyung Im,Xiaoquan Wen

    We propose a new computational framework, probabilistic transcriptome-wide association study (PTWAS), to investigate causal relationships between gene expressions and complex traits. PTWAS applies the established principles from instrumental variables analysis and takes advantage of probabilistic eQTL annotations to delineate and tackle the unique challenges arising in TWAS. PTWAS not only confers

  • Estimating the quality of eukaryotic genomes recovered from metagenomic analysis with EukCC.
    Genome Biol. (IF 10.806) Pub Date : 2020-09-10
    Paul Saary,Alex L Mitchell,Robert D Finn

    Microbial eukaryotes constitute a significant fraction of biodiversity and have recently gained more attention, but the recovery of high-quality metagenomic assembled eukaryotic genomes is limited by the current availability of tools. To help address this, we have developed EukCC, a tool for estimating the quality of eukaryotic genomes based on the automated dynamic selection of single copy marker

  • STARR-seq identifies active, chromatin-masked, and dormant enhancers in pluripotent mouse embryonic stem cells.
    Genome Biol. (IF 10.806) Pub Date : 2020-09-10
    Tianran Peng,Yanan Zhai,Yaser Atlasi,Menno Ter Huurne,Hendrik Marks,Hendrik G Stunnenberg,Wout Megchelenbrink

    Enhancers are distal regulators of gene expression that shape cell identity and control cell fate transitions. In mouse embryonic stem cells (mESCs), the pluripotency network is maintained by the function of a complex network of enhancers that are drastically altered upon differentiation. Genome-wide chromatin accessibility and histone modification assays are commonly used as a proxy for identifying

  • Metalign: efficient alignment-based metagenomic profiling via containment min hash.
    Genome Biol. (IF 10.806) Pub Date : 2020-09-10
    Nathan LaPierre,Mohammed Alser,Eleazar Eskin,David Koslicki,Serghei Mangul

    Metagenomic profiling, predicting the presence and relative abundances of microbes in a sample, is a critical first step in microbiome analysis. Alignment-based approaches are often considered accurate yet computationally infeasible. Here, we present a novel method, Metalign, that performs efficient and accurate alignment-based metagenomic profiling. We use a novel containment min hash approach to

  • GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes.
    Genome Biol. (IF 10.806) Pub Date : 2020-09-10
    Jian-Jun Jin,Wen-Bin Yu,Jun-Bo Yang,Yu Song,Claude W dePamphilis,Ting-Shuang Yi,De-Zhu Li

    GetOrganelle is a state-of-the-art toolkit to accurately assemble organelle genomes from whole genome sequencing data. It recruits organelle-associated reads using a modified “baiting and iterative mapping” approach, conducts de novo assembly, filters and disentangles the assembly graph, and produces all possible configurations of circular organelle genomes. For 50 published plant datasets, we are

  • Natural display of nuclear-encoded RNA on the cell surface and its impact on cell interaction.
    Genome Biol. (IF 10.806) Pub Date : 2020-09-10
    Norman Huang,Xiaochen Fan,Kathia Zaleta-Rivera,Tri C Nguyen,Jiarong Zhou,Yingjun Luo,Jie Gao,Ronnie H Fang,Zhangming Yan,Zhen Bouman Chen,Liangfang Zhang,Sheng Zhong

    Compared to proteins, glycans, and lipids, much less is known about RNAs on the cell surface. We develop a series of technologies to test for any nuclear-encoded RNAs that are stably attached to the cell surface and exposed to the extracellular space, hereafter called membrane-associated extracellular RNAs (maxRNAs). We develop a technique called Surface-seq to selectively sequence maxRNAs and validate

  • COCOA: coordinate covariation analysis of epigenetic heterogeneity.
    Genome Biol. (IF 10.806) Pub Date : 2020-09-07
    John T Lawson,Jason P Smith,Stefan Bekiranov,Francine E Garrett-Bakelman,Nathan C Sheffield

    A key challenge in epigenetics is to determine the biological significance of epigenetic variation among individuals. We present Coordinate Covariation Analysis (COCOA), a computational framework that uses covariation of epigenetic signals across individuals and a database of region sets to annotate epigenetic heterogeneity. COCOA is the first such tool for DNA methylation data and can also analyze

  • Alignment and mapping methodology influence transcript abundance estimation.
    Genome Biol. (IF 10.806) Pub Date : 2020-09-07
    Avi Srivastava,Laraib Malik,Hirak Sarkar,Mohsen Zakeri,Fatemeh Almodaresi,Charlotte Soneson,Michael I Love,Carl Kingsford,Rob Patro

    The accuracy of transcript quantification using RNA-seq data depends on many factors, such as the choice of alignment or mapping method and the quantification model being adopted. While the choice of quantification model has been shown to be important, considerably less attention has been given to comparing the effect of various read alignment approaches on quantification accuracy. We investigate the

  • Variation around the dominant viral genome sequence contributes to viral load and outcome in patients with Ebola virus disease.
    Genome Biol. (IF 10.806) Pub Date : 2020-09-07
    Xiaofeng Dong,Jordana Munoz-Basagoiti,Natasha Y Rickett,Georgios Pollakis,William A Paxton,Stephan Günther,Romy Kerber,Lisa F P Ng,Michael J Elmore,N'faly Magassouba,Miles W Carroll,David A Matthews,Julian A Hiscox

    Viral load is a major contributor to outcome in patients with Ebola virus disease (EVD), with high values leading to a fatal outcome. Evidence from the 2013–2016 Ebola virus (EBOV) outbreak indicated that different genotypes of the virus can have different phenotypes in patients. Additionally, due to the error-prone nature of viral RNA synthesis in an individual patient, the EBOV genome exists around