Whole-genome analysis of noncoding genetic variations identifies multiscale regulatory element perturbations associated with Hirschsprung disease

  Kevin Y. Yip 1,7,8

  1. Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong;
  2. Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong;
  3. Dr. Li Dak-Sum Research Centre, The University of Hong Kong, Hong Kong;
  4. School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong;
  5. Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong;
  6. Centre for Genomic Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong;
  7. Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Hong Kong;
  8. Hong Kong Institute of Diabetes and Obesity, The Chinese University of Hong Kong, Hong Kong
  9. These authors contributed equally to this work.

  • Corresponding authors: paultam{at}hku.hk, engan{at}hku.hk, kevinyip{at}cse.cuhk.edu.hk
  • Abstract

    It is widely recognized that noncoding genetic variants play important roles in many human diseases, but there are multiple challenges that hinder the identification of functional disease-associated noncoding variants. The number of noncoding variants can be many times that of coding variants; many of them are not functional but are in linkage disequilibrium with the functional ones; different variants can have epistatic effects; different variants can affect the same genes or pathways in different individuals; and some variants are related to each other not by affecting the same gene but by affecting the binding of the same upstream regulator. To overcome these difficulties, we propose a novel analysis framework that considers convergent impacts of different genetic variants on protein binding, which provides multiscale information about disease-associated perturbations of regulatory elements, genes, and pathways. Applying it to our whole-genome sequencing data of 918 short-segment Hirschsprung disease patients and matched controls, we identify various novel genes not detected by standard single-variant and region-based tests, functionally centering on neural crest migration and development. Our framework also identifies upstream regulators whose binding is influenced by the noncoding variants. Using human neural crest cells, we confirm cell stage–specific regulatory roles of three top novel regulatory elements on our list, in the RET, RASGEF1A, and PIK3C2B loci, respectively. In the PIK3C2B regulatory element, we further show that a noncoding variant found only in the patients affects the binding of the gliogenesis regulator NFIA, with a corresponding up-regulation of multiple genes in the same topologically associating domain.

    Footnotes

    • Received April 8, 2020.
    • Accepted September 14, 2020.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
