Current journal: Genome Biology
  • Single-cell ATAC-seq signal extraction and enhancement with SCATE
    Genome Biol. (IF 10.806) Pub Date : 2020-07-03
    Zhicheng Ji; Weiqiang Zhou; Wenpin Hou; Hongkai Ji

    Single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) is the state-of-the-art technology for analyzing genome-wide regulatory landscapes in single cells. Single-cell ATAC-seq data are sparse and noisy, and analyzing such data is challenging. Existing computational methods cannot accurately reconstruct activities of individual cis-regulatory elements (CREs) in individual cells

  • Quantile normalization of single-cell RNA-seq read counts without unique molecular identifiers
    Genome Biol. (IF 10.806) Pub Date : 2020-07-03
    F. William Townes; Rafael A. Irizarry

    Single-cell RNA-seq (scRNA-seq) profiles gene expression of individual cells. Unique molecular identifiers (UMIs) remove duplicates in read counts resulting from polymerase chain reaction, a major source of noise. For scRNA-seq data lacking UMIs, we propose quasi-UMIs: quantile normalization of read counts to a compound Poisson distribution empirically derived from UMI datasets. When applied to ground-truth

  • Genomic analysis of the domestication and post-Spanish conquest evolution of the llama and alpaca
    Genome Biol. (IF 10.806) Pub Date : 2020-07-02
    Ruiwen Fan; Zhongru Gu; Xuanmin Guang; Juan Carlos Marín; Valeria Varas; Benito A. González; Jane C. Wheeler; Yafei Hu; Erli Li; Xiaohui Sun; Xukui Yang; Chi Zhang; Wenjun Gao; Junping He; Kasper Munch; Russel Corbett-Detig; Mario Barbato; Shengkai Pan; Xiangjiang Zhan; Michael W. Bruford; Changsheng Dong

    Despite their regional economic importance and being increasingly reared globally, the origins and evolution of the llama and alpaca remain poorly understood. Here we report reference genomes for the llama, and for the guanaco and vicuña (their putative wild progenitors), compare these with the published alpaca genome, and resequence seven individuals of all four species to better understand domestication

  • Genome-wide analyses of chromatin interactions after the loss of Pol I, Pol II, and Pol III
    Genome Biol. (IF 10.806) Pub Date : 2020-07-02
    Yongpeng Jiang; Jie Huang; Kehuan Lun; Boyuan Li; Haonan Zheng; Yuanjun Li; Rong Zhou; Wenjia Duan; Chenlu Wang; Yuanqing Feng; Hong Yao; Cheng Li; Xiong Ji

    The relationship between transcription and the 3D chromatin structure is debated. Multiple studies have shown that transcription affects global Cohesin binding and 3D genome structures. However, several other studies have indicated that inhibited transcription does not alter chromatin conformations. We provide the most comprehensive evidence to date to demonstrate that transcription plays a relatively

  • Analysis of endothelial-to-haematopoietic transition at the single cell level identifies cell cycle regulation as a driver of differentiation
    Genome Biol. (IF 10.806) Pub Date : 2020-07-01
    Giovanni Canu; Emmanouil Athanasiadis; Rodrigo A. Grandy; Jose Garcia-Bernardo; Paulina M. Strzelecka; Ludovic Vallier; Daniel Ortmann; Ana Cvejic

    Haematopoietic stem cells (HSCs) first arise during development in the aorta-gonad-mesonephros (AGM) region of the embryo from a population of haemogenic endothelial cells which undergo endothelial-to-haematopoietic transition (EHT). Despite the progress achieved in recent years, the molecular mechanisms driving EHT are still poorly understood, especially in human where the AGM region is not easily

  • Identification of cell type-specific methylation signals in bulk whole genome bisulfite sequencing data
    Genome Biol. (IF 10.806) Pub Date : 2020-07-01
    C. Anthony Scott; Jack D. Duryea; Harry MacKay; Maria S. Baker; Eleonora Laritsky; Chathura J. Gunasekara; Cristian Coarfa; Robert A. Waterland

    The traditional approach to studying the epigenetic mechanism CpG methylation in tissue samples is to identify regions of concordant differential methylation spanning multiple CpG sites (differentially methylated regions). Variation limited to single or small numbers of CpGs has been assumed to reflect stochastic processes. To test this, we developed software, Cluster-Based analysis of CpG methylation

  • Computational inference of cancer-specific vulnerabilities in clinical samples
    Genome Biol. (IF 10.806) Pub Date : 2020-06-29
    Kiwon Jang; Min Ji Park; Jae Soon Park; Haeun Hwangbo; Min Kyung Sung; Sinae Kim; Jaeyun Jung; Jong Won Lee; Sei-Hyun Ahn; Suhwan Chang; Jung Kyoon Choi

    Systematic in vitro loss-of-function screens provide valuable resources that can facilitate the discovery of drugs targeting cancer vulnerabilities. We develop a deep learning-based method to predict tumor-specific vulnerabilities in patient samples by leveraging a wealth of in vitro screening data. Acquired dependencies of tumors are inferred in cases in which one allele is disrupted by inactivating

  • Allele-specific DNA methylation is increased in cancers and its dense mapping in normal plus neoplastic cells increases the yield of disease-associated regulatory SNPs
    Genome Biol. (IF 10.806) Pub Date : 2020-06-29
    Catherine Do; Emmanuel L. P. Dumont; Martha Salas; Angelica Castano; Huthayfa Mujahed; Leonel Maldonado; Arunjot Singh; Sonia C. DaSilva-Arnold; Govind Bhagat; Soren Lehman; Angela M. Christiano; Subha Madhavan; Peter L. Nagy; Peter H. R. Green; Rena Feinman; Cornelia Trimble; Nicholas P. Illsley; Karen Marder; Lawrence Honig; Catherine Monk; Andre Goy; Kar Chow; Samuel Goldlust; George Kaptain; David

    Mapping of allele-specific DNA methylation (ASM) can be a post-GWAS strategy for localizing regulatory sequence polymorphisms (rSNPs). The advantages of this approach, and the mechanisms underlying ASM in normal and neoplastic cells, remain to be clarified. We perform whole genome methyl-seq on diverse normal cells and tissues and three cancer types. After excluding imprinting, the data pinpoint 15

  • Sustainable agriculture in the era of omics: knowledge-driven crop breeding
    Genome Biol. (IF 10.806) Pub Date : 2020-06-26
    Qing Li; Jianbing Yan

    Global population has reached up to 7.8 billion and is expected to exceed 10 billion by 2055 (https://countrymeters.info/cn/World). Such rapid population increase presents a great challenge for food supply. On the one hand, more grains are needed to provide basic calories for humans. On the other hand, the rising living standard leads to a changing diet habit towards higher average consumption of livestock

  • Single-cell transcriptome and antigen-immunoglobin analysis reveals the diversity of B cells in non-small cell lung cancer
    Genome Biol. (IF 10.806) Pub Date : 2020-06-24
    Jian Chen; Yun Tan; Fenghuan Sun; Likun Hou; Chi Zhang; Tao Ge; Huansha Yu; Chunxiao Wu; Yuming Zhu; Liang Duan; Liang Wu; Nan Song; Liping Zhang; Wei Zhang; Di Wang; Chang Chen; Chunyan Wu; Gening Jiang; Peng Zhang

    Malignant transformation and progression of cancer are driven by the co-evolution of cancer cells and their dysregulated tumor microenvironment (TME). Recent studies on immunotherapy demonstrate the efficacy in reverting the anti-tumoral function of T cells, highlighting the therapeutic potential in targeting certain cell types in TME. However, the functions of other immune cell types remain largely

  • Approaches for integrating heterogeneous RNA-seq data reveal cross-talk between microbes and genes in asthmatic patients.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-22
    Daniel Spakowicz,Shaoke Lou,Brian Barron,Jose L Gomez,Tianxiao Li,Qing Liu,Nicole Grant,Xiting Yan,Rebecca Hoyd,George Weinstock,Geoffrey L Chupp,Mark Gerstein

    Sputum induction is a non-invasive method to evaluate the airway environment, particularly for asthma. RNA sequencing (RNA-seq) of sputum samples can be challenging to interpret due to the complex and heterogeneous mixtures of human cells and exogenous (microbial) material. In this study, we develop a pipeline that integrates dimensionality reduction and statistical modeling to grapple with the heterogeneity

  • Enhanced Integrated Gradients: improving interpretability of deep learning models using splicing codes as a case study.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-19
    Anupama Jha,Joseph K Aicher,Matthew R Gazzara,Deependra Singh,Yoseph Barash

    Despite the success and fast adaptation of deep learning models in biomedical domains, their lack of interpretability remains an issue. Here, we introduce Enhanced Integrated Gradients (EIG), a method to identify significant features associated with a specific prediction task. Using RNA splicing prediction as well as digit classification as case studies, we demonstrate that EIG improves upon the original

  • instaGRAAL: chromosome-level quality scaffolding of genomes using a proximity ligation-based scaffolder.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-18
    Lyam Baudry,Nadège Guiglielmoni,Hervé Marie-Nelly,Alexandre Cormier,Martial Marbouty,Komlan Avia,Yann Loe Mie,Olivier Godfroy,Lieven Sterck,J Mark Cock,Christophe Zimmer,Susana M Coelho,Romain Koszul

    Hi-C exploits contact frequencies between pairs of loci to bridge and order contigs during genome assembly, resulting in chromosome-level assemblies. Because few robust programs are available for this type of data, we developed instaGRAAL, a complete overhaul of the GRAAL program, which has adapted the latter to allow efficient assembly of large genomes. instaGRAAL features a number of improvements

  • The two languages of science.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-17
    Itai Yanai,Martin Lercher

    “If we allow ourselves the license of talking about genes as if they had conscious aims, always reassuring ourselves that we could translate our sloppy language back into respectable terms if we wanted to, we can ask the question, what is a single selfish gene trying to do?”—Richard Dawkins, The Selfish Gene: p. 88 “I’ve made agents out of system 1 and system 2 because everybody finds it easier to

  • KAML: improving genomic prediction accuracy of complex traits using machine learning determined parameters.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-17
    Lilin Yin,Haohao Zhang,Xiang Zhou,Xiaohui Yuan,Shuhong Zhao,Xinyun Li,Xiaolei Liu

    Advances in high-throughput sequencing technologies have reduced the cost of genotyping dramatically and led to genomic prediction being widely used in animal and plant breeding, and increasingly in human genetics. Inspired by the efficient computing of linear mixed model and the accurate prediction of Bayesian methods, we propose a machine learning-based method incorporating cross-validation, multiple

  • Analysis of transcript-deleterious variants in Mendelian disorders: implications for RNA-based diagnostics.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-17
    Sateesh Maddirevula,Hiroyuki Kuwahara,Nour Ewida,Hanan E Shamseldin,Nisha Patel,Fatema Alzahrani,Tarfa AlSheddi,Eman AlObeid,Mona Alenazi,Hessa S Alsaif,Maha Alqahtani,Maha AlAli,Hatoon Al Ali,Rana Helaby,Niema Ibrahim,Firdous Abdulwahab,Mais Hashem,Nadine Hanna,Dorota Monies,Nada Derar,Afaf Alsagheir,Amal Alhashem,Badr Alsaleem,Hamoud Alhebbi,Sami Wali,Ramzan Umarov,Xin Gao,Fowzan S Alkuraya

    At least 50% of patients with suspected Mendelian disorders remain undiagnosed after whole-exome sequencing (WES), and the extent to which non-coding variants that are not captured by WES contribute to this fraction is unclear. Whole transcriptome sequencing is a promising supplement to WES, although empirical data on the contribution of RNA analysis to the diagnosis of Mendelian diseases on a large

  • Removal of H2Aub1 by ubiquitin-specific proteases 12 and 13 is required for stable Polycomb-mediated gene repression in Arabidopsis.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-16
    Lejon E M Kralemann,Shujing Liu,Minerva S Trejo-Arellano,Rafael Muñoz-Viana,Claudia Köhler,Lars Hennig

    Stable gene repression is essential for normal growth and development. Polycomb repressive complexes 1 and 2 (PRC1&2) are involved in this process by establishing monoubiquitination of histone 2A (H2Aub1) and subsequent trimethylation of lysine 27 of histone 3 (H3K27me3). Previous work proposed that H2Aub1 removal by the ubiquitin-specific proteases 12 and 13 (UBP12 and UBP13) is part of the repressive

  • 3D genome architecture coordinates trans and cis regulation of differentially expressed ear and tassel genes in maize.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-16
    Yonghao Sun,Liang Dong,Ying Zhang,Da Lin,Weize Xu,Changxiong Ke,Linqian Han,Lulu Deng,Guoliang Li,David Jackson,Xingwang Li,Fang Yang

    Maize ears and tassels are two separate types of inflorescence which are initiated by similar developmental processes but gradually develop distinct architectures. However, coordinated trans and cis regulation of differentially expressed genes determining ear and tassel architecture within the 3D genome context is largely unknown. We identify 56,055 and 52,633 open chromatin regions (OCRs) in developing

  • Repeat-induced point mutation in Neurospora crassa causes the highest known mutation rate and mutational burden of any cellular life.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-16
    Long Wang,Yingying Sun,Xiaoguang Sun,Luyao Yu,Lan Xue,Zhen He,Ju Huang,Dacheng Tian,Laurence D Hurst,Sihai Yang

    Repeat-induced point (RIP) mutation in Neurospora crassa degrades transposable elements by targeting repeats with C→T mutations. Whether RIP affects core genomic sequence in important ways is unknown. By parent-offspring whole genome sequencing, we estimate a mutation rate (3.38 × 10⁻⁶ per bp per generation) that is two orders of magnitude higher than reported for any non-viral organism, with 93–98%

  • SWISS: multiplexed orthogonal genome editing in plants with a Cas9 nickase and engineered CRISPR RNA scaffolds.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-16
    Chao Li,Yuan Zong,Shuai Jin,Haocheng Zhu,Dexing Lin,Shengnan Li,Jin-Long Qiu,Yanpeng Wang,Caixia Gao

    We describe here a CRISPR simultaneous and wide-editing induced by a single system (SWISS), in which RNA aptamers engineered in crRNA scaffold recruit their cognate binding proteins fused with cytidine deaminase and adenosine deaminase to Cas9 nickase target sites to generate multiplexed base editing. By using paired sgRNAs, SWISS can produce insertions/deletions in addition to base editing. Rice mutants

  • Dynamic rewiring of the human interactome by interferon signaling.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-15
    Craig H Kerr,Michael A Skinnider,Daniel D T Andrews,Angel M Madero,Queenie W T Chan,R Greg Stacey,Nikolay Stoynov,Eric Jan,Leonard J Foster

    The type I interferon (IFN) response is an ancient pathway that protects cells against viral pathogens by inducing the transcription of hundreds of IFN-stimulated genes. Comprehensive catalogs of IFN-stimulated genes have been established across species and cell types by transcriptomic and biochemical approaches, but their antiviral mechanisms remain incompletely characterized. Here, we apply a combination

  • SURF: integrative analysis of a compendium of RNA-seq and CLIP-seq datasets highlights complex governing of alternative transcriptional regulation by RNA-binding proteins.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-12
    Fan Chen,Sündüz Keleş

    Advances in high-throughput profiling of RNA-binding proteins (RBPs) have resulted in CLIP-seq datasets coupled with transcriptome profiling by RNA-seq. However, analysis methods that integrate both types of data are lacking. We describe SURF, Statistical Utility for RBP Functions, for integrative analysis of large collections of CLIP-seq and RNA-seq data. We demonstrate SURF’s ability to accurately

  • Whole-genome sequencing of glioblastoma reveals enrichment of non-coding constraint mutations in known and novel genes.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-09
    Sharadha Sakthikumar,Ananya Roy,Lulu Haseeb,Mats E Pettersson,Elisabeth Sundström,Voichita D Marinescu,Kerstin Lindblad-Toh,Karin Forsberg-Nilsson

    Glioblastoma (GBM) has one of the worst 5-year survival rates of all cancers. While genomic studies of the disease have been performed, alterations in the non-coding regulatory regions of GBM have largely remained unexplored. We apply whole-genome sequencing (WGS) to identify non-coding mutations, with regulatory potential in GBM, under the hypothesis that regions of evolutionary constraint are likely

  • Analysis of 1321 Eubacterium rectale genomes from metagenomes uncovers complex phylogeographic population structure and subspecies functional adaptations.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-08
    Nicolai Karcher,Edoardo Pasolli,Francesco Asnicar,Kun D Huang,Adrian Tett,Serena Manara,Federica Armanini,Debbie Bain,Sylvia H Duncan,Petra Louis,Moreno Zolfo,Paolo Manghi,Mireia Valles-Colomer,Roberta Raffaetà,Omar Rota-Stabelli,Maria Carmen Collado,Georg Zeller,Daniel Falush,Frank Maixner,Alan W Walker,Curtis Huttenhower,Nicola Segata

    Eubacterium rectale is one of the most prevalent human gut bacteria, but its diversity and population genetics are not well understood because large-scale whole-genome investigations of this microbe have not been carried out. Here, we leverage metagenomic assembly followed by a reference-based binning strategy to screen over 6500 gut metagenomes spanning geography and lifestyle and reconstruct over

  • CB2 improves power of cell detection in droplet-based single-cell RNA sequencing data.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-08
    Zijian Ni,Shuyang Chen,Jared Brown,Christina Kendziorski

    An important challenge in pre-processing data from droplet-based single-cell RNA sequencing protocols is distinguishing barcodes associated with real cells from those binding background reads. Existing methods test barcodes individually and consequently do not leverage the strong cell-to-cell correlation present in most datasets. To improve cell detection, we introduce CB2, a cluster-based approach

  • Direct-seq: programmed gRNA scaffold for streamlined scRNA-seq in CRISPR screen.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-08
    Qingkai Song,Ke Ni,Min Liu,Yini Li,Lixia Wang,Yingying Wang,Yingzheng Liu,Zhenxing Yu,Yinyao Qi,Zhike Lu,Lijia Ma

    CRISPR-based genome perturbation provides a new avenue to conveniently change DNA sequences, transcription, and epigenetic modifications in genetic screens. However, it remains challenging to assay the complex molecular readouts after perturbation at high resolution and at scale. By introducing an A/G mixed capture sequence into the gRNA scaffold, we demonstrate that gRNA transcripts could be directly

  • Molecular mechanisms of coronary disease revealed using quantitative trait loci for TCF21 binding, chromatin accessibility, and chromosomal looping.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-08
    Quanyi Zhao,Michael Dacre,Trieu Nguyen,Milos Pjanic,Boxiang Liu,Dharini Iyer,Paul Cheng,Robert Wirka,Juyong Brian Kim,Hunter B Fraser,Thomas Quertermous

    To investigate the epigenetic and transcriptional mechanisms of coronary artery disease (CAD) risk, as well as the functional regulation of chromatin structure and function, we create a catalog of genetic variants associated with three stages of transcriptional cis-regulation in primary human coronary artery vascular smooth muscle cells (HCASMCs). We use a pooling approach with HCASMC lines to map

  • MAUDE: inferring expression changes in sorting-based CRISPR screens.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-03
    Carl G de Boer,John P Ray,Nir Hacohen,Aviv Regev

    Improved methods are needed to model CRISPR screen data for interrogation of genetic elements that alter reporter gene expression readout. We create MAUDE (Mean Alterations Using Discrete Expression) for quantifying the impact of guide RNAs on a target gene’s expression in a pooled, sorting-based expression screen. MAUDE quantifies guide-level effects by modeling the distribution of cells across sorting

  • Author Correction: Defining the relative and combined contribution of CTCF and CTCFL to genomic regulation.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-02
    Mayilaadumveettil Nishana,Caryn Ha,Javier Rodriguez-Hernaez,Ali Ranjbaran,Erica Chio,Elphege P Nora,Sana B Badri,Andreas Kloetgen,Benoit G Bruneau,Aristotelis Tsirigos,Jane A Skok

    An amendment to this paper has been published and can be accessed via the original article.

  • Universal promoter scanning by Pol II during transcription initiation in Saccharomyces cerevisiae.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-02
    Chenxi Qiu,Huiyan Jin,Irina Vvedenskaya,Jordi Abante Llenas,Tingting Zhao,Indranil Malik,Alex M Visbisky,Scott L Schwartz,Ping Cui,Pavel Čabart,Kang Hoo Han,William K M Lai,Richard P Metz,Charles D Johnson,Sing-Hoi Sze,B Franklin Pugh,Bryce E Nickels,Craig D Kaplan

    The majority of eukaryotic promoters utilize multiple transcription start sites (TSSs). How multiple TSSs are specified at individual promoters across eukaryotes is not understood for most species. In Saccharomyces cerevisiae, a pre-initiation complex (PIC) comprised of Pol II and conserved general transcription factors (GTFs) assembles and opens DNA upstream of TSSs. Evidence from model promoters

  • The promise and challenge of cancer microbiome research.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-02
    Sumeed Syed Manzoor,Annemiek Doedens,Michael B Burns

    Many microbial agents have been implicated as contributors to cancer genesis and development, and the search to identify and characterize new cancer-related organisms is ongoing. Modern developments in methodologies, especially culture-independent approaches, have accelerated and driven this research. Recent work has shed light on the multifaceted role that the community of organisms in and on the

  • Systematic assessment of tissue dissociation and storage biases in single-cell and single-nucleus RNA-seq workflows.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-02
    Elena Denisenko,Belinda B Guo,Matthew Jones,Rui Hou,Leanne de Kock,Timo Lassmann,Daniel Poppe,Olivier Clément,Rebecca K Simmons,Ryan Lister,Alistair R R Forrest

    Single-cell RNA sequencing has been widely adopted to estimate the cellular composition of heterogeneous tissues and obtain transcriptional profiles of individual cells. Multiple approaches for optimal sample dissociation and storage of single cells have been proposed as have single-nuclei profiling methods. What has been lacking is a systematic comparison of their relative biases and benefits. Here

  • Assembly and annotation of an Ashkenazi human reference genome.
    Genome Biol. (IF 10.806) Pub Date : 2020-06-02
    Alaina Shumate,Aleksey V Zimin,Rachel M Sherman,Daniela Puiu,Justin M Wagner,Nathan D Olson,Mihaela Pertea,Marc L Salit,Justin M Zook,Steven L Salzberg

    Thousands of experiments and studies use the human reference genome as a resource each year. This single reference genome, GRCh38, is a mosaic created from a small number of individuals, representing a very small sample of the human population. There is a need for reference genomes from multiple human populations to avoid potential biases. Here, we describe the assembly and annotation of the genome

  • Developmental regulation of canonical and small ORF translation from mRNAs.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-29
    Pedro Patraquim,Muhammad Ali Shahzad Mumtaz,José Ignacio Pueyo,Julie Louise Aspden,Juan-Pablo Couso

    Ribosomal profiling has revealed the translation of thousands of sequences outside annotated protein-coding genes, including small open reading frames of less than 100 codons, and the translational regulation of many genes. Here we present an improved version of Poly-Ribo-Seq and apply it to Drosophila melanogaster embryos to extend the catalog of in vivo translated small ORFs, and to reveal the translational

  • CICERO: a versatile method for detecting complex and diverse driver fusions using cancer RNA sequencing data.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-28
    Liqing Tian,Yongjin Li,Michael N Edmonson,Xin Zhou,Scott Newman,Clay McLeod,Andrew Thrasher,Yu Liu,Bo Tang,Michael C Rusch,John Easton,Jing Ma,Eric Davis,Austyn Trull,J Robert Michael,Karol Szlachta,Charles Mullighan,Suzanne J Baker,James R Downing,David W Ellison,Jinghui Zhang

    To discover driver fusions beyond canonical exon-to-exon chimeric transcripts, we develop CICERO, a local assembly-based algorithm that integrates RNA-seq read support with extensive annotation for candidate ranking. CICERO outperforms commonly used methods, achieving a 95% detection rate for 184 independently validated driver fusions including internal tandem duplications and other non-canonical events

  • FORK-seq: replication landscape of the Saccharomyces cerevisiae genome by nanopore sequencing.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-26
    Magali Hennion,Jean-Michel Arbona,Laurent Lacroix,Corinne Cruaud,Bertrand Theulot,Benoît Le Tallec,Florence Proux,Xia Wu,Elizaveta Novikova,Stefan Engelen,Arnaud Lemainque,Benjamin Audit,Olivier Hyrien

    Genome replication mapping methods profile cell populations, masking cell-to-cell heterogeneity. Here, we describe FORK-seq, a nanopore sequencing method to map replication of single DNA molecules at 200-nucleotide resolution. By quantifying BrdU incorporation along pulse-chased replication intermediates from Saccharomyces cerevisiae, we orient 58,651 replication tracks reproducing population-based

  • Personalized and graph genomes reveal missing signal in epigenomic data.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-25
    Cristian Groza,Tony Kwan,Nicole Soranzo,Tomi Pastinen,Guillaume Bourque

    Epigenomic studies that use next generation sequencing experiments typically rely on the alignment of reads to a reference sequence. However, because of genetic diversity and the diploid nature of the human genome, we hypothesize that using a generic reference could lead to incorrectly mapped reads and bias downstream results. We show that accounting for genetic variation using a modified reference

  • Accounting for cell type hierarchy in evaluating single cell RNA-seq clustering.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-25
    Zhijin Wu,Hao Wu

    Cell clustering is one of the most common routines in single cell RNA-seq data analyses, for which a number of specialized methods are available. The evaluation of these methods ignores an important biological characteristic that the structure for a population of cells is hierarchical, which could result in misleading evaluation results. In this work, we develop two new metrics that take into account

  • Lifestyle and the presence of helminths is associated with gut microbiome composition in Cameroonians.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-25
    Meagan A Rubel,Arwa Abbas,Louis J Taylor,Andrew Connell,Ceylan Tanes,Kyle Bittinger,Valantine N Ndze,Julius Y Fonsah,Eric Ngwang,André Essiane,Charles Fokunang,Alfred K Njamnshi,Frederic D Bushman,Sarah A Tishkoff

    African populations provide a unique opportunity to interrogate host-microbe co-evolution and its impact on adaptive phenotypes due to their genomic, phenotypic, and cultural diversity. We integrate gut microbiome 16S rRNA amplicon and shotgun metagenomic sequence data with quantification of pathogen burden and measures of immune parameters for 575 ethnically diverse Africans from Cameroon. Subjects

  • Gapless assembly of maize chromosomes using long-read technologies.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-20
    Jianing Liu,Arun S Seetharam,Kapeel Chougule,Shujun Ou,Kyle W Swentowsky,Jonathan I Gent,Victor Llaca,Margaret R Woodhouse,Nancy Manchanda,Gernot G Presting,David A Kudrna,Magdy Alabady,Candice N Hirsch,Kevin A Fengler,Doreen Ware,Todd P Michael,Matthew B Hufford,R Kelly Dawe

    Creating gapless telomere-to-telomere assemblies of complex genomes is one of the ultimate challenges in genomics. We use two independent assemblies and an optical map-based merging pipeline to produce a maize genome (B73-Ab10) composed of 63 contigs and a contig N50 of 162 Mb. This genome includes gapless assemblies of chromosome 3 (236 Mb) and chromosome 9 (162 Mb), and 53 Mb of the Ab10 meiotic

  • RNA structural dynamics regulate early embryogenesis through controlling transcriptome fate and function.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-18
    Boyang Shi,Jinsong Zhang,Jian Heng,Jing Gong,Ting Zhang,Pan Li,Bao-Fa Sun,Ying Yang,Ning Zhang,Yong-Liang Zhao,Hai-Lin Wang,Feng Liu,Qiangfeng Cliff Zhang,Yun-Gui Yang

    BACKGROUND: Vertebrate early embryogenesis is initially directed by a set of maternal RNAs and proteins, yet the mechanisms controlling this program remain largely unknown. Recent transcriptome-wide studies on RNA structure have revealed its pervasive and crucial roles in RNA processing and functions, but whether and how RNA structure regulates the fate of the maternal transcriptome have yet to be determined

  • tappAS: a comprehensive computational framework for the analysis of the functional impact of differential splicing.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-18
    Lorena de la Fuente,Ángeles Arzalluz-Luque,Manuel Tardáguila,Héctor Del Risco,Cristina Martí,Sonia Tarazona,Pedro Salguero,Raymond Scott,Alberto Lerma,Ana Alastrue-Agudo,Pablo Bonilla,Jeremy R B Newman,Shunichi Kosugi,Lauren M McIntyre,Victoria Moreno-Manzano,Ana Conesa

    Recent advances in long-read sequencing solve inaccuracies in alternative transcript identification of full-length transcripts in short-read RNA-Seq data, which encourages the development of methods for isoform-centered functional analysis. Here, we present tappAS, the first framework to enable a comprehensive Functional Iso-Transcriptomics (FIT) analysis, which is effective at revealing the functional

  • Protection from DNA re-methylation by transcription factors in primordial germ cells and pre-implantation embryos can explain trans-generational epigenetic inheritance.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-18
    Isaac Kremsky,Victor G Corces

    BACKGROUND: A growing body of evidence suggests that certain epiphenotypes can be passed across generations via both the male and female germlines of mammals. These observations have been difficult to explain owing to a global loss of the majority of known epigenetic marks present in parental chromosomes during primordial germ cell development and after fertilization. RESULTS: By integrating previously

  • BpForms and BcForms: a toolkit for concretely describing non-canonical polymers and complexes to facilitate global biochemical networks.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-18
    Paul F Lang,Yassmine Chebaro,Xiaoyue Zheng,John A P Sekar,Bilal Shaikh,Darren A Natale,Jonathan R Karr

    Non-canonical residues, caps, crosslinks, and nicks are important to many functions of DNAs, RNAs, proteins, and complexes. However, we do not fully understand how networks of such non-canonical macromolecules generate behavior. One barrier is our limited formats for describing macromolecules. To overcome this barrier, we develop BpForms and BcForms, a toolkit for representing the primary structure

  • APEC: an accesson-based method for single-cell chromatin accessibility analysis.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-12
    Bin Li,Young Li,Kun Li,Lianbang Zhu,Qiaoni Yu,Pengfei Cai,Jingwen Fang,Wen Zhang,Pengcheng Du,Chen Jiang,Jun Lin,Kun Qu

    The development of sequencing technologies has promoted the survey of genome-wide chromatin accessibility at single-cell resolution. However, comprehensive analysis of single-cell epigenomic profiles remains a challenge. Here, we introduce an accessibility pattern-based epigenomic clustering (APEC) method, which classifies each cell by groups of accessible regions with synergistic signal patterns termed

  • Terminating contamination: large-scale search identifies more than 2,000,000 contaminated entries in GenBank.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-12
    Martin Steinegger, Steven L Salzberg

    Genomic analyses are sensitive to contamination in public databases caused by incorrectly labeled reference sequences. Here, we describe Conterminator, an efficient method to detect and remove incorrectly labeled sequences by an exhaustive all-against-all sequence comparison. Our analysis reports contamination of 2,161,746, 114,035, and 14,148 sequences in the RefSeq, GenBank, and NR databases, respectively

  • Insights gained from a comprehensive all-against-all transcription factor binding motif benchmarking study.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-11
    Giovanna Ambrosini, Ilya Vorontsov, Dmitry Penzar, Romain Groux, Oriol Fornes, Daria D Nikolaeva, Benoit Ballester, Jan Grau, Ivo Grosse, Vsevolod Makeev, Ivan Kulakovskiy, Philipp Bucher

    BACKGROUND: Positional weight matrix (PWM) is a de facto standard model to describe transcription factor (TF) DNA binding specificities. PWMs inferred from in vivo or in vitro data are stored in many databases and used in a plethora of biological applications. This calls for comprehensive benchmarking of public PWM models with large experimental reference sets. RESULTS: Here we report results from all-against-all

  • Effects of the COVID-19 pandemic on life scientists.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-11
    Jan O Korbel, Oliver Stegle

    We will not know the long-term impact of the SARS-CoV-2 viral outbreak for some time yet, but many of us have already begun to feel the effects—not only on our daily lives but also on our work as life scientists. With partial or complete institutional shutdowns in countries worldwide, the global COVID-19 health crisis has rapidly impacted the life science landscape, including our patterns of work.

  • Sampling time-dependent artifacts in single-cell genomics studies.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-11
    Ramon Massoni-Badosa, Giovanni Iacono, Catia Moutinho, Marta Kulis, Núria Palau, Domenica Marchese, Javier Rodríguez-Ubreva, Esteban Ballestar, Gustavo Rodriguez-Esteban, Sara Marsal, Marta Aymerich, Dolors Colomer, Elias Campo, Antonio Julià, José Ignacio Martín-Subero, Holger Heyn

    Robust protocols and automation now enable large-scale single-cell RNA and ATAC sequencing experiments and their application on biobank and clinical cohorts. However, technical biases introduced during sample acquisition can hinder solid, reproducible results, and a systematic benchmarking is required before entering large-scale data production. Here, we report the existence and extent of gene expression

  • MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-11
    Ricard Argelaguet, Damien Arnol, Danila Bredikhin, Yonatan Deloro, Britta Velten, John C Marioni, Oliver Stegle

    Technological advances have enabled the profiling of multiple molecular layers at single-cell resolution, assaying cells from multiple samples or conditions. Consequently, there is a growing need for computational strategies to analyze data from complex experimental designs that include multiple data modalities and multiple groups of samples. We present Multi-Omics Factor Analysis v2 (MOFA+), a statistical

  • Chromatin topology reorganization and transcription repression by PML-RARα in acute promyeloid leukemia.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-11
    Ping Wang, Zhonghui Tang, Byoungkoo Lee, Jacqueline Jufen Zhu, Liuyang Cai, Przemyslaw Szalaj, Simon Zhongyuan Tian, Meizhen Zheng, Dariusz Plewczynski, Xiaoan Ruan, Edison T Liu, Chia-Lin Wei, Yijun Ruan

    BACKGROUND: Acute promyeloid leukemia (APL) is characterized by the oncogenic fusion protein PML-RARα, a major etiological agent in APL. However, the molecular mechanisms underlying the role of PML-RARα in leukemogenesis remain largely unknown. RESULTS: Using an inducible system, we comprehensively analyze the 3D genome organization in myeloid cells and its reorganization after PML-RARα induction and

  • Compressing gene expression data using multiple latent space dimensionalities learns complementary biological representations.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-11
    Gregory P Way, Michael Zietz, Vincent Rubinetti, Daniel S Himmelstein, Casey S Greene

    BACKGROUND: Unsupervised compression algorithms applied to gene expression data extract latent or hidden signals representing technical and biological sources of variation. However, these algorithms require a user to select a biologically appropriate latent space dimensionality. In practice, most researchers fit a single algorithm and latent dimensionality. We sought to determine the extent by which

  • Defining the relative and combined contribution of CTCF and CTCFL to genomic regulation.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-11
    Mayilaadumveettil Nishana, Caryn Ha, Javier Rodriguez-Hernaez, Ali Ranjbaran, Erica Chio, Elphege P Nora, Sana B Badri, Andreas Kloetgen, Benoit G Bruneau, Aristotelis Tsirigos, Jane A Skok

    BACKGROUND: Ubiquitously expressed CTCF is involved in numerous cellular functions, such as organizing chromatin into TAD structures. In contrast, its paralog, CTCFL, is normally only present in the testis. However, it is also aberrantly expressed in many cancers. While it is known that shared and unique zinc finger sequences in CTCF and CTCFL enable CTCFL to bind competitively to a subset of CTCF binding

  • A human lung tumor microenvironment interactome identifies clinically relevant cell-type cross-talk.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-07
    Andrew J Gentles, Angela Bik-Yu Hui, Weiguo Feng, Armon Azizi, Ramesh V Nair, Gina Bouchard, David A Knowles, Alice Yu, Youngtae Jeong, Alborz Bejnood, Erna Forgó, Sushama Varma, Yue Xu, Amanda Kuong, Viswam S Nair, Rob West, Matt van de Rijn, Chuong D Hoang, Maximilian Diehn, Sylvia K Plevritis

    BACKGROUND: Tumors comprise a complex microenvironment of interacting malignant and stromal cell types. Much of our understanding of the tumor microenvironment comes from in vitro studies isolating the interactions between malignant cells and a single stromal cell type, often along a single pathway. RESULT: To develop a deeper understanding of the interactions between cells within human lung tumors,

  • Integrative analyses of the RNA modification machinery reveal tissue- and cancer-specific signatures.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-07
    Oguzhan Begik, Morghan C Lucas, Huanle Liu, Jose Miguel Ramirez, John S Mattick, Eva Maria Novoa

    BACKGROUND: RNA modifications play central roles in cellular fate and differentiation. However, the machinery responsible for placing, removing, and recognizing more than 170 RNA modifications remains largely uncharacterized and poorly annotated, and we currently lack integrative studies that identify which RNA modification-related proteins (RMPs) may be dysregulated in each cancer type. RESULTS: Here

  • Single-cell RNA-seq with spike-in cells enables accurate quantification of cell-specific drug effects in pancreatic islets.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-06
    Brenda Marquina-Sanchez, Nikolaus Fortelny, Matthias Farlik, Andhira Vieira, Patrick Collombat, Christoph Bock, Stefan Kubicek

    BACKGROUND: Single-cell RNA-seq (scRNA-seq) is emerging as a powerful tool to dissect cell-specific effects of drug treatment in complex tissues. This application requires high levels of precision, robustness, and quantitative accuracy, beyond those achievable with existing methods for mainly qualitative single-cell analysis. Here, we establish the use of standardized reference cells as spike-in controls

  • What are innovations in peer review and editorial assessment for?
    Genome Biol. (IF 10.806) Pub Date : 2020-05-04
    Willem Halffman, Serge P J M Horbach

    Peer review at research journals is going through a period of intense innovation. Some journals are experimenting with ‘open’ review procedures that reveal identities or even review reports; some with pre-registered reports that shift review attention to experimental protocols rather than to focus on results; or with post-publication review through readership commentary [1]. Well-resourced journals

  • Open access, open data and peer review.
    Genome Biol. (IF 10.806) Pub Date : 2020-05-04
    Jernej Ule

    I am very fond of open access journals like Genome Biology. Another champion of such journals is Plan S, launched by Science Europe in 2018 and adopted by many funding agencies, which aims to encourage scientists to publish in open access journals or platforms [1]. I touch on some recent discussion of this topic and then highlight the need to link it to accessible raw and processed data associated

  • In memory of James Taylor: the birth of Galaxy.
    Genome Biol. (IF 10.806) Pub Date : 2020-04-30
    Anton Nekrutenko, Michael C Schatz

    James Peter Taylor, the Ralph S. O’Connor Professor of Biology and Computer Science at Johns Hopkins University (JHU), passed away on April 2, 2020. He was 40 years old. James was an exceptional scientist, colleague, mentor, and community builder, who worked at the intersection of biology and computer science. His life’s pursuit was to understand how genomic and epigenomic information is processed

  • Wheat chromatin architecture is organized in genome territories and transcription factories.
    Genome Biol. (IF 10.806) Pub Date : 2020-04-29
    Lorenzo Concia, Alaguraj Veluchamy, Juan S Ramirez-Prado, Azahara Martin-Ramirez, Ying Huang, Magali Perez, Severine Domenichini, Natalia Y Rodriguez Granados, Soonkap Kim, Thomas Blein, Susan Duncan, Clement Pichot, Deborah Manza-Mianza, Caroline Juery, Etienne Paux, Graham Moore, Heribert Hirt, Catherine Bergounioux, Martin Crespi, Magdy M Mahfouz, Abdelhafid Bendahmane, Chang Liu, Anthony Hall, Cécile Raynaud, David Latrasse

    BACKGROUND: Polyploidy is ubiquitous in eukaryotic plant and fungal lineages, and it leads to the co-existence of several copies of similar or related genomes in one nucleus. In plants, polyploidy is considered a major factor in successful domestication. However, polyploidy challenges chromosome folding architecture in the nucleus to establish functional structures. RESULTS: We examine the hexaploid

Contents have been reproduced by permission of the publishers.