Current journal: Genome Medicine
  • Genomic surveillance for hypervirulence and multi-drug resistance in invasive Klebsiella pneumoniae from South and Southeast Asia
    Genome Med. (IF 10.886) Pub Date : 2020-01-16
    Kelly L. Wyres; To N. T. Nguyen; Margaret M. C. Lam; Louise M. Judd; Nguyen van Vinh Chau; David A. B. Dance; Margaret Ip; Abhilasha Karkey; Clare L. Ling; Thyl Miliya; Paul N. Newton; Nguyen Phu Huong Lan; Amphone Sengduangphachanh; Paul Turner; Balaji Veeraraghavan; Phat Voong Vinh; Manivanh Vongsouvath; Nicholas R. Thomson; Stephen Baker; Kathryn E. Holt

    Klebsiella pneumoniae is a leading cause of bloodstream infection (BSI). Strains producing extended-spectrum beta-lactamases (ESBLs) or carbapenemases are considered global priority pathogens for which new treatment and prevention strategies are urgently required, due to severely limited therapeutic options. South and Southeast Asia are major hubs for antimicrobial-resistant (AMR) K. pneumoniae and also for the characteristically antimicrobial-sensitive, community-acquired “hypervirulent” strains. The emergence of hypervirulent AMR strains and lack of data on exopolysaccharide diversity pose a challenge for K. pneumoniae BSI control strategies worldwide. We conducted a retrospective genomic epidemiology study of 365 BSI K. pneumoniae from seven major healthcare facilities across South and Southeast Asia, extracting clinically relevant information (AMR, virulence, K and O antigen loci) using Kleborate, a K. pneumoniae-specific genomic typing tool. K. pneumoniae BSI isolates were highly diverse, comprising 120 multi-locus sequence types (STs) and 63 K-loci. ESBL and carbapenemase gene frequencies were 47% and 17%, respectively. The aerobactin synthesis locus (iuc), associated with hypervirulence, was detected in 28% of isolates. Importantly, 7% of isolates harboured iuc plus ESBL and/or carbapenemase genes. The latter represent genotypic AMR-virulence convergence, which is generally considered a rare phenomenon but was particularly common among South Asian BSI (17%). Of greatest concern, we identified seven novel plasmids carrying both iuc and AMR genes, raising the prospect of co-transfer of these phenotypes among K. pneumoniae. K. pneumoniae BSI in South and Southeast Asia are caused by different STs from those predominating in other regions, and with higher frequency of acquired virulence determinants. K. pneumoniae carrying both iuc and AMR genes were also detected at higher rates than have been reported elsewhere. 
The study demonstrates how genomics-based surveillance—reporting full molecular profiles including STs, AMR, virulence and serotype locus information—can help standardise comparisons between sites and identify regional differences in pathogen populations.
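The convergence definition used in this abstract (iuc together with an ESBL and/or carbapenemase gene) can be illustrated with a minimal sketch. The records and field names below are hypothetical toy data, not Kleborate's actual output columns and not the study's isolates.

```python
# Toy per-isolate genotype flags (hypothetical; NOT real Kleborate output or study data).
isolates = [
    {"id": "iso1", "esbl": True,  "carbapenemase": False, "iuc": False},
    {"id": "iso2", "esbl": False, "carbapenemase": False, "iuc": True},
    {"id": "iso3", "esbl": True,  "carbapenemase": True,  "iuc": True},
    {"id": "iso4", "esbl": False, "carbapenemase": False, "iuc": False},
]


def summarise(records):
    """Return the fraction of isolates carrying each determinant."""
    n = len(records)

    def frac(predicate):
        return sum(1 for r in records if predicate(r)) / n

    return {
        "esbl": frac(lambda r: r["esbl"]),
        "carbapenemase": frac(lambda r: r["carbapenemase"]),
        "iuc": frac(lambda r: r["iuc"]),
        # Genotypic AMR-virulence convergence: iuc plus ESBL and/or carbapenemase.
        "convergent": frac(lambda r: r["iuc"] and (r["esbl"] or r["carbapenemase"])),
    }


summary = summarise(isolates)
for name, value in summary.items():
    print(f"{name}: {value:.0%}")
```

Reporting the same set of flags per site is what makes cross-site comparisons of this kind straightforward.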

  • De novo variants in exomes of congenital heart disease patients identify risk genes and pathways
    Genome Med. (IF 10.886) Pub Date : 2020-01-15
    Cigdem Sevim Bayrak; Peng Zhang; Martin Tristani-Firouzi; Bruce D. Gelb; Yuval Itan

    Congenital heart disease (CHD) affects ~1% of live births and is the most common birth defect. Although a genetic contribution to CHD has long been suspected, it has only recently been well established. De novo variants are estimated to contribute to approximately 8% of sporadic CHD. CHD is genetically heterogeneous, making pathway enrichment analysis an effective approach to explore and statistically validate CHD-associated genes. In this study, we performed novel gene and pathway enrichment analyses of high-impact de novo variants in recently published whole-exome sequencing (WES) data generated from a cohort of 2645 CHD parent-offspring trios to identify new CHD-causing candidate genes and mutations. We performed rigorous variant- and gene-level filtering to identify potentially damaging variants, followed by enrichment analyses and gene prioritization. Our analyses revealed 23 novel genes that are likely to cause CHD, including HSP90AA1, ROCK2, IQGAP1, and CHD4, and that share biological functions, pathways, molecular interactions, and properties with known CHD-causing genes. Ultimately, these findings suggest novel genes that are likely to contribute to CHD pathogenesis.
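The abstract does not state which enrichment statistic was used. As an illustration only, a one-sided hypergeometric test is a common choice for pathway enrichment; the sketch below assumes that test, and all gene counts are made-up toy numbers rather than values from the study.

```python
from math import comb


def enrichment_p(universe, pathway, candidates, overlap):
    """One-sided hypergeometric tail P(X >= overlap).

    universe   - total number of genes considered
    pathway    - genes annotated to the pathway
    candidates - genes in the candidate list (drawn without replacement)
    overlap    - candidate genes that fall in the pathway
    """
    total = comb(universe, candidates)
    upper = min(pathway, candidates)
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, candidates - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (assumptions): 20,000 genes, a 100-gene pathway,
# 50 candidate genes of which 5 are in the pathway.
p = enrichment_p(universe=20000, pathway=100, candidates=50, overlap=5)
print(f"p = {p:.2e}")
```

With an expected overlap of only 0.25 genes under the null, five shared genes give a very small tail probability, which is the sense in which a pathway is called enriched.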

  • Molecular profiling for precision cancer therapies
    Genome Med. (IF 10.886) Pub Date : 2020-01-14
    Eoghan R. Malone; Marc Oliva; Peter J. B. Sabatini; Tracy L. Stockley; Lillian L. Siu

    The number of druggable tumor-specific molecular aberrations has grown substantially in the past decade, with a significant survival benefit obtained from biomarker matching therapies in several cancer types. Molecular pathology has therefore become fundamental not only to inform on tumor diagnosis and prognosis but also to drive therapeutic decisions in daily practice. The introduction of next-generation sequencing technologies and the rising number of large-scale tumor molecular profiling programs across institutions worldwide have revolutionized the field of precision oncology. As comprehensive genomic analyses become increasingly available in both clinical and research settings, healthcare professionals are faced with the complex tasks of result interpretation and translation. This review summarizes the current and upcoming approaches to implement precision cancer medicine, highlighting the challenges and potential solutions to facilitate the interpretation and to maximize the clinical utility of molecular profiling results. We describe novel molecular characterization strategies beyond tumor DNA sequencing, such as transcriptomics, immunophenotyping, epigenetic profiling, and single-cell analyses. We also review current and potential applications of liquid biopsies to evaluate blood-based biomarkers, such as circulating tumor cells and circulating nucleic acids. Last, lessons learned from the existing limitations of genotype-derived therapies provide insights into ways to expand precision medicine beyond genomics.

  • An unsupervised learning approach to identify novel signatures of health and disease from multimodal data
    Genome Med. (IF 10.886) Pub Date : 2020-01-10
    Ilan Shomorony; Elizabeth T. Cirulli; Lei Huang; Lori A. Napier; Robyn R. Heister; Michael Hicks; Isaac V. Cohen; Hung-Chun Yu; Christine Leon Swisher; Natalie M. Schenker-Ahmed; Weizhong Li; Karen E. Nelson; Pamila Brar; Andrew M. Kahn; Timothy D. Spector; C. Thomas Caskey; J. Craig Venter; David S. Karow; Ewen F. Kirkness; Naisha Shah

    Modern medicine is rapidly moving towards a data-driven paradigm based on comprehensive multimodal health assessments. Integrated analysis of data from different modalities has the potential of uncovering novel biomarkers and disease signatures. We collected 1385 data features from diverse modalities, including metabolome, microbiome, genetics, and advanced imaging, from 1253 individuals and from a longitudinal validation cohort of 1083 individuals. We utilized a combination of unsupervised machine learning methods to identify multimodal biomarker signatures of health and disease risk. Our method identified a set of cardiometabolic biomarkers that goes beyond standard clinical biomarkers. Stratification of individuals based on the signatures of these biomarkers identified distinct subsets of individuals with similar health statuses. Subset membership was a better predictor for diabetes than established clinical biomarkers such as glucose, insulin resistance, and body mass index. The novel biomarkers in the diabetes signature included 1-stearoyl-2-dihomo-linolenoyl-GPC and 1-(1-enyl-palmitoyl)-2-oleoyl-GPC. Another metabolite, cinnamoylglycine, was identified as a potential biomarker for both gut microbiome health and lean mass percentage. We identified potential early signatures for hypertension and a poor metabolic health outcome. Additionally, we found novel associations between a uremic toxin, p-cresol sulfate, and the abundance of the microbiome genera Intestinimonas and an unclassified genus in the Erysipelotrichaceae family. Our methodology and results demonstrate the potential of multimodal data integration, from the identification of novel biomarker signatures to a data-driven stratification of individuals into disease subtypes and stages—an essential step towards personalized, preventative health risk assessment.

  • Strains used in whole organism Plasmodium falciparum vaccine trials differ in genome structure, sequence, and immunogenic potential
    Genome Med. (IF 10.886) Pub Date : 2020-01-08
    Kara A. Moser; Elliott F. Drábek; Ankit Dwivedi; Emily M. Stucke; Jonathan Crabtree; Antoine Dara; Zalak Shah; Matthew Adams; Tao Li; Priscila T. Rodrigues; Sergey Koren; Adam M. Phillippy; James B. Munro; Amed Ouattara; Benjamin C. Sparklin; Julie C. Dunning Hotopp; Kirsten E. Lyke; Lisa Sadzewicz; Luke J. Tallon; Michele D. Spring; Krisada Jongsakul; Chanthap Lon; David L. Saunders; Marcelo U. Ferreira; Myaing M. Nyunt; Miriam K. Laufer; Mark A. Travassos; Robert W. Sauerwein; Shannon Takala-Harrison; Claire M. Fraser; B. Kim Lee Sim; Stephen L. Hoffman; Christopher V. Plowe; Joana C. Silva

    Plasmodium falciparum (Pf) whole-organism sporozoite vaccines have been shown to provide significant protection against controlled human malaria infection (CHMI) in clinical trials. Initial CHMI studies showed significantly higher durable protection against homologous than heterologous strains, suggesting the presence of strain-specific vaccine-induced protection. However, interpretation of these results and understanding of their relevance to vaccine efficacy have been hampered by the lack of knowledge on genetic differences between vaccine and CHMI strains, and how these strains are related to parasites in malaria endemic regions. Whole genome sequencing using long-read (Pacific Biosciences) and short-read (Illumina) sequencing platforms was conducted to generate de novo genome assemblies for the vaccine strain, NF54, and for strains used in heterologous CHMI (7G8 from Brazil, NF166.C8 from Guinea, and NF135.C10 from Cambodia). The assemblies were used to characterize sequences in each strain relative to the reference 3D7 (a clone of NF54) genome. Strains were compared to each other and to a collection of clinical isolates (sequenced as part of this study or from public repositories) from South America, sub-Saharan Africa, and Southeast Asia. While few variants were detected between 3D7 and NF54, we identified tens of thousands of variants between NF54 and the three heterologous strains. These variants include SNPs, indels, and small structural variants that fall in regulatory and immunologically important regions, including transcription factors (such as PfAP2-L and PfAP2-G) and pre-erythrocytic antigens that may be key for sporozoite vaccine-induced protection. Additionally, these variants directly contributed to diversity in immunologically important regions of the genomes as detected through in silico CD8+ T cell epitope predictions. Of all heterologous strains, NF135.C10 had the highest number of unique predicted epitope sequences when compared to NF54. 
Comparison to global clinical isolates revealed that these four strains are representative of their geographic origin despite long-term culture adaptation; of note, NF135.C10 is from an admixed population, and not part of recently formed subpopulations resistant to artemisinin-based therapies present in the Greater Mekong Sub-region. These results will assist in the interpretation of vaccine efficacy of whole-organism vaccines against homologous and heterologous CHMI.

  • An epigenome-wide association study of sex-specific chronological ageing
    Genome Med. (IF 10.886) Pub Date : 2019-12-31
    Daniel L. McCartney; Futao Zhang; Robert F. Hillary; Qian Zhang; Anna J. Stevenson; Rosie M. Walker; Mairead L. Bermingham; Thibaud Boutin; Stewart W. Morris; Archie Campbell; Alison D. Murray; Heather C. Whalley; David J. Porteous; Caroline Hayward; Kathryn L. Evans; Tamir Chandra; Ian J. Deary; Andrew M. McIntosh; Jian Yang; Peter M. Visscher; Allan F. McRae; Riccardo E. Marioni

    Advanced age is associated with cognitive and physical decline and is a major risk factor for a multitude of disorders. There is also a gap in life expectancy between males and females. DNA methylation differences have been shown to be associated with both age and sex. Here, we investigate age-by-sex differences in blood-based DNA methylation in an unrelated cohort of 2586 individuals between the ages of 18 and 87 years, with replication in a further 4450 individuals between the ages of 18 and 93 years. Linear regression models were applied, with stringent genome-wide significance thresholds (p < 3.6 × 10⁻⁸) used in both the discovery and replication data. A second, highly conservative mixed linear model method that better controls the false-positive rate was also applied, using the same genome-wide significance thresholds. Using the linear regression method, 52 autosomal and 597 X-linked CpG sites, mapping to 251 unique genes, replicated with concordant effect size directions in the age-by-sex interaction analysis. The site with the greatest difference mapped to GAGE10, an X-linked gene. Here, DNA methylation levels remained stable across the male adult age range (DNA methylation by age r = 0.02) but decreased across the female adult age range (DNA methylation by age r = −0.61). One site (cg23722529) with a significant age-by-sex interaction also had a quantitative trait locus (rs17321482) that is a genome-wide significant variant for prostate cancer. The mixed linear model method identified 11 CpG sites associated with the age-by-sex interaction. The majority of differences in age-associated DNA methylation trajectories between sexes are present on the X chromosome. Several of these differences occur within genes that have been implicated in sexually dimorphic traits.
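To make the age-by-sex interaction term concrete: in a simple model of the form methylation ~ age + sex + age × sex, with sex coded as a 0/1 indicator and no other covariates, the interaction coefficient equals the difference between the two per-sex regression slopes. The sketch below uses that equivalence on synthetic values chosen for illustration. It is not the study's data, and the study's actual models may have included further covariates.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Synthetic methylation levels at one CpG site (assumed values):
# flat across age in males, declining with age in females.
ages = [20, 30, 40, 50, 60, 70, 80]
male = [0.50 for _ in ages]
female = [0.90 - 0.005 * a for a in ages]

slope_male = ols_slope(ages, male)
slope_female = ols_slope(ages, female)

# Interaction term of the saturated age-by-sex model (female vs male).
interaction = slope_female - slope_male
print(f"male slope {slope_male:.4f}, female slope {slope_female:.4f}, "
      f"interaction {interaction:.4f}")
```

A site like this one, stable in one sex and changing in the other, is the pattern the abstract describes for the GAGE10 site.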

  • Exome sequencing reveals a high prevalence of BRCA1 and BRCA2 founder variants in a diverse population-based biobank
    Genome Med. (IF 10.886) Pub Date : 2019-12-31
    Noura S. Abul-Husn; Emily R. Soper; Jacqueline A. Odgis; Sinead Cullina; Dean Bobo; Arden Moscati; Jessica E. Rodriguez; Ruth J. F. Loos; Judy H. Cho; Gillian M. Belbin; Sabrina A. Suckiel; Eimear E. Kenny

    Pathogenic variants in BRCA1 and BRCA2 (BRCA1/2) lead to increased risk of breast, ovarian, and other cancers, but most variant-positive individuals in the general population are unaware of their risk, and little is known about prevalence in non-European populations. We investigated BRCA1/2 prevalence and impact in the electronic health record (EHR)-linked BioMe Biobank in New York City. Exome sequence data from 30,223 adult BioMe participants were evaluated for pathogenic variants in BRCA1/2. Prevalence estimates were made in population groups defined by genetic ancestry and self-report. EHR data were used to evaluate clinical characteristics of variant-positive individuals. There were 218 (0.7%) individuals harboring expected pathogenic variants, resulting in an overall prevalence of 1 in 139. The highest prevalence was in individuals with Ashkenazi Jewish (AJ; 1 in 49), Filipino and other Southeast Asian (1 in 81), and non-AJ European (1 in 103) ancestry. Among 218 variant-positive individuals, 112 (51.4%) harbored known founder variants: 80 had AJ founder variants (BRCA1 c.5266dupC and c.68_69delAG, and BRCA2 c.5946delT), 8 had a Puerto Rican founder variant (BRCA2 c.3922G>T), and 24 had one of 19 other founder variants. Non-European populations were more likely to harbor BRCA1/2 variants that were not classified in ClinVar or that had uncertain or conflicting evidence for pathogenicity (uncertain/conflicting). Within mixed ancestry populations, such as Hispanic/Latinos with genetic ancestry from Africa, Europe, and the Americas, there was a strong correlation between the proportion of African genetic ancestry and the likelihood of harboring an uncertain/conflicting variant. Approximately 28% of variant-positive individuals had a personal history, and 45% had a personal or family history of BRCA1/2-associated cancers. Approximately 27% of variant-positive individuals had prior clinical genetic testing for BRCA1/2. 
However, individuals with AJ founder variants were twice as likely to have had a clinical test (39%) as those with other pathogenic variants (20%). These findings deepen our knowledge about BRCA1/2 variants and associated cancer risk in diverse populations, indicate a gap in knowledge about potential cancer-related variants in non-European populations, and suggest that genomic screening in diverse patient populations may be an effective tool to identify at-risk individuals.
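The headline figures in this abstract follow directly from the reported counts. The short check below reproduces them. The variable names are illustrative, and only the numbers stated in the abstract are used.

```python
# Counts as reported in the abstract.
participants = 30223
carriers = 218
founder_counts = {"AJ": 80, "Puerto Rican": 8, "other": 24}

prevalence = carriers / participants
one_in = round(participants / carriers)
founders = sum(founder_counts.values())
founder_share = founders / carriers

print(f"prevalence: {prevalence:.1%} (about 1 in {one_in})")
# prints "prevalence: 0.7% (about 1 in 139)"
print(f"founder variants: {founders} of {carriers} ({founder_share:.1%})")
# prints "founder variants: 112 of 218 (51.4%)"
```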

  • Recommendations for application of the functional evidence PS3/BS3 criterion using the ACMG/AMP sequence variant interpretation framework
    Genome Med. (IF 10.886) Pub Date : 2019-12-31
    Sarah E. Brnich; Ahmad N. Abou Tayoun; Fergus J. Couch; Garry R. Cutting; Marc S. Greenblatt; Christopher D. Heinen; Dona M. Kanavy; Xi Luo; Shannon M. McNulty; Lea M. Starita; Sean V. Tavtigian; Matt W. Wright; Steven M. Harrison; Leslie G. Biesecker; Jonathan S. Berg

    The American College of Medical Genetics and Genomics (ACMG)/Association for Molecular Pathology (AMP) clinical variant interpretation guidelines established criteria for different types of evidence. These include the strong evidence codes PS3 and BS3 for “well-established” functional assays demonstrating a variant has abnormal or normal gene/protein function, respectively. However, the guidelines did not provide detailed guidance on how functional evidence should be evaluated, and differences in the application of the PS3/BS3 codes are a contributor to variant interpretation discordance between laboratories. This recommendation seeks to provide a more structured approach to the assessment of functional assays for variant interpretation and guidance on the use of various levels of strength based on assay validation. The Clinical Genome Resource (ClinGen) Sequence Variant Interpretation (SVI) Working Group used curated functional evidence from ClinGen Variant Curation Expert Panel-developed rule specifications and expert opinions to refine the PS3/BS3 criteria over multiple in-person and virtual meetings. We estimated the odds of pathogenicity for assays using various numbers of variant controls to determine the minimum controls required to reach moderate level evidence. Feedback from the ClinGen Steering Committee and outside experts was incorporated into the recommendations at multiple stages of development. The SVI Working Group developed recommendations for evaluators regarding the assessment of the clinical validity of functional data and a four-step provisional framework to determine the appropriate strength of evidence that can be applied in clinical variant interpretation. These steps are as follows: (1) define the disease mechanism, (2) evaluate the applicability of general classes of assays used in the field, (3) evaluate the validity of specific instances of assays, and (4) apply evidence to individual variant interpretation. 
We found that a minimum of 11 total pathogenic and benign variant controls are required to reach moderate-level evidence in the absence of rigorous statistical analysis. The recommendations and approach to functional evidence evaluation described here should help clarify the clinical variant interpretation process for functional assays. Further, we hope that these recommendations will help develop productive partnerships with basic scientists who have developed functional assays that are useful for interrogating the function of a variety of genes.

  • Digital twins to personalize medicine
    Genome Med. (IF 10.886) Pub Date : 2019-12-31
    Bergthor Björnsson; Carl Borrebaeck; Nils Elander; Thomas Gasslander; Danuta R. Gawel; Mika Gustafsson; Rebecka Jörnsten; Eun Jung Lee; Xinxiu Li; Sandra Lilja; David Martínez-Enguita; Andreas Matussek; Per Sandström; Samuel Schäfer; Margaretha Stenmarker; X. F. Sun; Oleg Sysoev; Huan Zhang; Mikael Benson

    Personalized medicine requires the integration and processing of vast amounts of data. Here, we propose a solution to this challenge that is based on constructing Digital Twins. These are high-resolution models of individual patients that are computationally treated with thousands of drugs to find the drug that is optimal for the patient.

  • Keeping up with the genomes: scaling genomic variant interpretation
    Genome Med. (IF 10.886) Pub Date : 2019-12-31
    Heidi L. Rehm; Douglas M. Fowler

    In the past 10 years, we have seen major advances in our ability to read human genomic DNA and detect variation. The variants we find have the potential to improve the diagnosis and treatment of human disease and also to define our unique traits. Although slower to catch up, we are now seeing equally rapid advances in the strategies used to interpret these variants in both coding and non-coding regions. Setting up a robust infrastructure, in terms of sequencing technology, pipelines for detection of all clinically significant variation, and analysis tools that incorporate the most effective approaches to variant interpretation, will be critical in delivering widespread and meaningful advances in patient care and in ensuring the accurate and informative application of genomic technology to healthcare. Many platforms have been developed to detect different types of DNA variants in the germline and in the context of somatic cancer and mosaicism. For example, short read, next generation sequencing is routinely employed to detect short sequence variants, whereas Sanger sequencing is still used to confirm many variants. Karyotyping and chromosomal microarrays are platforms that are commonly used to detect structural variants. In addition, a myriad of other platforms and assays are used to detect partial gene deletions and duplications, common translocations, repeat expansions, and gene amplifications and to discern variation in homologous regions. Yet, maintaining these many platforms to detect the multitude of human variation is complex, costly, and difficult for laboratories, clinicians, and patients to navigate. In this special issue of Genome Medicine, Lindstrand and colleagues [1] demonstrate the ability of whole genome sequencing to consolidate many of these platforms into a single approach for detecting a wide range of human variation types. 
The next step will be to democratize the computational tools needed to identify and annotate the different types of variation accurately, so that every laboratory that can generate a whole human genome sequence will be capable of highly sensitive and specific detection of all types of human genomic variation that have clinical consequences. Although the detection of human genetic variation is a necessary first step, many resources are needed to support the accurate interpretation of the identified variation. The human population is genetically diverse, both in the spectrum of benign variation and in variation implicated in disease. In this issue, Abul-Husn and colleagues [2] report an increased rate of variants of uncertain significance in non-European populations compared to European ones, particularly in populations with a higher proportion of African ancestry. This burden of variants of uncertain significance results from a lack of recruitment from underrepresented populations, which has created a paucity of knowledge of disease causality in these populations. Diverse cohorts of affected individuals in disease studies are therefore needed to build knowledge of genetic disease etiologies across all populations and to ensure equitable benefit to all individuals from genomic medicine. The findings reported by Abul-Husn and colleagues [2] also highlight how large and diverse catalogs of human genetic variation across geographical populations are critical for ruling out the possibility that variants that are rare in one population but commonly observed in another are disease causing. Also critical for variant interpretation are rigorous approaches for assessing the diversity of functional assays that are used to discern which variants disrupt the function of a gene product and which do not. This task is difficult because most gene products have a plethora of functions, sometimes in diverse cell types or even in an organismal context. 
In this special issue, Brnich and colleagues [3] propose a rigorous strategy to ensure that functional assays are well-validated before the data they generate are applied to routine clinical interpretation of variants. These recommendations have been developed for the evaluation and application of functional evidence within the ACMG/AMP variant interpretation framework [4], and are a key step forward in reducing discordance in the application of evidence codes. Furthermore, once a functional assay has been validated, it can be multiplexed to enable comprehensive assessment of the effects of one or more classes of variation, thereby enabling streamlined and accurate genetic interpretation. Multiplexed functional assays are particularly useful for assessing classes of variation that are difficult to interpret, such as missense and splice site variation. Although promising, multiplexed functional assays present a set of unique challenges for both the researchers that develop them and the clinicians using the functional data they produce. Thus, Gelman and colleagues [5] make recommendations for how the developers of multiplexed functional assays should evaluate assay performance and report assay results. They also provide guidance to clinicians on how the quality and clinical utility of large-scale functional datasets can be evaluated, and on how these data can be incorporated into routine variant interpretation. Traditional approaches to identify the genetic causes of rare disease continue to yield novel gene discoveries, including aggregating cases with extremely rare, highly penetrant phenotypes that share common disrupted candidate genes. Nevertheless, other human diseases have been harder to tackle because they are defined by nonspecific phenotypes or because they arise from variants at multiple loci. Examples include autism and congenital heart disease. 
However, with the ability to sequence both disease and control cohorts of individuals at scale, including trios that enable the detection of de novo variation, statistical frameworks are now able to highlight candidate disease loci with increasing precision. Lal and colleagues describe how combining de novo burden analysis with grouping of paralogous genes enabled the identification of 28 strong candidate genes for neurodevelopmental disorders. Notably, these candidates are expressed in the brain and exhibit evolutionary constraint [6]. Another challenge is the interpretation of balanced structural variation, where possible drivers of pathogenicity are difficult to identify. Using a combination of experimental and computational approaches examining both direct disruption and indirect, chromatin-mediated effects, Middelkamp and colleagues [7] prioritized causal genes for previously uninterpretable de novo structural variants that were identified in the context of congenital abnormality or intellectual disability. In summary, through the large-scale aggregation of well-phenotyped individuals with diseases via data sharing programs, and through the application of innovative methods of analysis, we will eventually build a comprehensive understanding of the genes and genomic regions that contribute to human disease. The interpretation of rare disease genetic variation has been hugely aided by systematic guidance [4] and by the routine sharing of variant interpretations in ClinVar. More recently, guidelines have been released to provide initial guidance for the interpretation of somatic variants, taking into account the added complexity of multiple dimensions of clinical relevance, including diagnosis, prognosis, and drug responsiveness [8]. These guidelines have better enabled the cancer community to standardize cancer variant assessment and to build shared community resources. 
These improvements are critical because they can empower the rapidly growing application of genetic testing in cancers, the results of which are critical to accurate prognosis and treatment guidance. In this issue, Lever and colleagues [9] demonstrate a text-mining approach to gather data from the literature on thousands of biomarkers and to deposit the information in a publicly accessible database called CIViCmine. He and colleagues [10] apply computational approaches to consume pre-annotated files and to apply criteria for clinical assessment. Both approaches enable the prioritization of variants identified in tumors for further review. Furthermore, Danos and colleagues [11] describe improvements to CIViC, which is an open platform for community curation of somatic variation. These improvements, which include common data models and standard operating procedures, are designed to support consistent and accurate interpretation of variants in cancer. As genomic medicine success stories continue to appear, we will confront an ever-growing number of genomes to analyze and genetic variants to interpret. Both tasks are difficult because of the complexity of the human genome and its diversity of variants, as well as the challenge of amassing sufficient data to interpret variants. This special issue describes some of the advances in variant detection, scaling of experiments, improvements in computational approaches, and construction of community resources that are helping to confront these challenges. Although this progress is promising, more work is needed. For example, we must develop an inexpensive, widely deployed pipeline for assembling whole genome sequences and detecting variants. We must apply such a pipeline to diverse human populations, at scale, in order to understand the true extent of common genetic variation. We must deploy multiplexed functional assays to quantify the effect of variation at many, if not most, disease-associated loci. 
Finally, we must unite these resources by adopting a coherent set of standards and a rigorous culture of data sharing. If successful, we will enable all individuals to benefit from the routine application of genomics to both disease diagnosis and genome-enabled disease prevention.

1. Lindstrand A, Eisfeldt J, Pettersson M, Carvalho CMB, Kvarnung M, Grigelioniene G, et al. From cytogenetics to cytogenomics: whole-genome sequencing as a first-line test comprehensively captures the diverse spectrum of disease-causing genetic variation underlying intellectual disability. Genome Med. 2019;11:68.
2. Abul-Husn NS, Soper ER, Odgis JA, Cullina S, Bobo D, Moscati A, et al. Exome sequencing reveals a high prevalence of BRCA1 and BRCA2 founder variants in a diverse population-based biobank. Genome Med. 2019. https://doi.org/10.1186/s13073-019-0691-1
3. Brnich SA, Abou Tayoun AN, Couch FJ, Cutting G, Greenblatt MS, Heinen CD, et al. Recommendations for application of the functional evidence PS3/BS3 criterion using the ACMG/AMP sequence variant interpretation framework. Genome Med. 2019. https://doi.org/10.1186/s13073-019-0690-2
4. Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17:405–24.
5. Gelman et al. Recommendations for the collection and use of multiplexed functional data for clinical variant interpretation. Genome Med. 2019. https://doi.org/10.1186/s13073-019-0698-7
6. Lal D, May P, Samocha KE, Kosmicki JA, Robinson EB, Møller RS, et al. Gene family information facilitates variant interpretation and identification of disease-associated genes. bioRxiv 159780; https://doi.org/10.1101/159780
7. Middelkamp S, Vlaar JM, Giltay J, Korzelius J, Besselink N, Boymans S, et al. Prioritization of genes driving congenital phenotypes of patients with de novo structural variants. Genome Med. 2019. https://doi.org/10.1186/s13073-019-0692-0
8. Li MM, Datto M, Duncavage EJ, Kulkarni S, Lindeman NI, Roy S, et al. Standards and guidelines for the interpretation and reporting of sequence variants in cancer: a joint consensus recommendation of the Association for Molecular Pathology, American Society of Clinical Oncology, and College of American Pathologists. J Mol Diagn. 2017;19:4–23.
9. Lever J, Jones MR, Danos AM, Krysiak K, Bonakdar M, Grewal J, et al. Text-mining clinically relevant cancer biomarkers for curation into the CIViC database. Genome Med. 2019. https://doi.org/10.1186/s13073-019-0686-y
10. He MM, Li Q, Yan M, Cao H, Hu Y, He KY, et al. Variant interpretation for Cancer (VIC): a computational tool for assessing clinical impacts of somatic variants. Genome Med. 2019;11:53.
11. Danos et al. Standard operating procedure for curation and clinical interpretation of variants in cancer. Genome Med. 2019;11:76. https://doi.org/10.1186/s13073-019-0687-x

We thank all of the authors who submitted manuscripts for this special issue of Genome Medicine.

Funding
HLR was supported by the National Human Genome Research Institute of the National Institutes of Health (NIH) under award numbers UM1HG008900, U01HG008676, and U41HG006834. DMF was supported by the National Human Genome Research Institute of the NIH under award number RM1HG010461. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Affiliations
Center for Genomic Medicine, Massachusetts General Hospital, Cambridge Street, Boston, MA, 02114, USA: Heidi L. Rehm
Medical and Population Genetics, Broad Institute of MIT and Harvard, Main Street, Cambridge, MA, 02142, USA: Heidi L. Rehm
Department of Pathology, Harvard Medical School, Shattuck Street, Boston, MA, 02115, USA: Heidi L. Rehm
Department of Genome Sciences, University of Washington, 15th Avenue NE, Seattle, WA, 98195-5065, USA: Douglas M. Fowler
Canadian Institute for Advanced Research, University Avenue, Toronto, ON, M5G 1M1, Canada: Douglas M. Fowler
Department of Bioengineering, University of Washington, 15th Avenue NE, Seattle, WA, 98195-5061, USA: Douglas M. Fowler

Contributions
Both authors drafted and edited the manuscript and also approved the final version.

Corresponding authors
Correspondence to Heidi L. Rehm or Douglas M. Fowler.

Competing interests
The authors declare that they have no competing interests.

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Cite this article
Rehm, H.L., Fowler, D.M. Keeping up with the genomes: scaling genomic variant interpretation. Genome Med 12, 5 (2020) doi:10.1186/s13073-019-0700-4
Published 31 December 2019
DOI https://doi.org/10.1186/s13073-019-0700-4

  • Neoantigen-specific immunity in low mutation burden colorectal cancers of the consensus molecular subtype 4
    Genome Med. (IF 10.886) Pub Date : 2019-12-30
    Jitske van den Bulk; Els M. E. Verdegaal; Dina Ruano; Marieke E. Ijsselsteijn; Marten Visser; Ruud van der Breggen; Thomas Duhen; Manon van der Ploeg; Natasja L. de Vries; Jan Oosting; Koen C. M. J. Peeters; Andrew D. Weinberg; Arantza Farina-Sarasqueta; Sjoerd H. van der Burg; Noel F. C. C. de Miranda

    The efficacy of checkpoint blockade immunotherapies in colorectal cancer is currently restricted to a minority of patients diagnosed with mismatch repair-deficient tumors having high mutation burden. However, this observation does not exclude the existence of neoantigen-specific T cells in colorectal cancers with low mutation burden and the exploitation of their anti-cancer potential for immunotherapy. Therefore, we investigated whether autologous neoantigen-specific T cell responses could also be observed in patients diagnosed with mismatch repair-proficient colorectal cancers. Whole-exome and transcriptome sequencing were performed on cancer and normal tissues from seven colorectal cancer patients diagnosed with mismatch repair-proficient tumors to detect putative neoantigens. Corresponding neo-epitopes were synthesized and tested for recognition by in vitro expanded T cells that were isolated from tumor tissues (tumor-infiltrating lymphocytes) and from peripheral mononuclear blood cells stimulated with tumor material. Neoantigen-specific T cell reactivity was detected to several neo-epitopes in the tumor-infiltrating lymphocytes of three patients while their respective cancers expressed 15, 21, and 30 non-synonymous variants. Cell sorting of tumor-infiltrating lymphocytes based on the co-expression of CD39 and CD103 pinpointed the presence of neoantigen-specific T cells in the CD39+CD103+ T cell subset. Strikingly, the tumors containing neoantigen-reactive TIL were classified as consensus molecular subtype 4 (CMS4), which is associated with TGF-β pathway activation and worse clinical outcome. We have detected neoantigen-targeted reactivity by autologous T cells in mismatch repair-proficient colorectal cancers of the CMS4 subtype. These findings warrant the development of specific immunotherapeutic strategies that selectively boost the activity of neoantigen-specific T cells and target the TGF-β pathway to reinforce T cell reactivity in this patient group.

  • Epigenetic therapy of myelodysplastic syndromes connects to cellular differentiation independently of endogenous retroelement derepression
    Genome Med. (IF 10.886) Pub Date : 2019-12-23
    Anastasiya Kazachenka; George R. Young; Jan Attig; Chrysoula Kordella; Eleftheria Lamprianidou; Emmanuela Zoulia; George Vrachiolias; Menelaos Papoutselis; Elsa Bernard; Elli Papaemmanuil; Ioannis Kotsianidis; George Kassiotis

    Myelodysplastic syndromes (MDS) and acute myeloid leukaemia (AML) are characterised by abnormal epigenetic repression and differentiation of bone marrow haematopoietic stem cells (HSCs). Drugs that reverse epigenetic repression, such as 5-azacytidine (5-AZA), induce haematological improvement in half of treated patients. Although the mechanisms underlying therapy success are not yet clear, induction of endogenous retroelements (EREs) has been hypothesised. Using RNA sequencing (RNA-seq), we compared the transcription of EREs in bone marrow HSCs from a new cohort of MDS and chronic myelomonocytic leukaemia (CMML) patients before and after 5-AZA treatment with HSCs from healthy donors and AML patients. We further examined ERE transcription using the most comprehensive annotation of ERE-overlapping transcripts expressed in HSCs, generated here by de novo transcript assembly and supported by full-length RNA-seq. Consistent with prior reports, we found that treatment with 5-AZA increased the representation of ERE-derived RNA-seq reads in the transcriptome. However, such increases were comparable between treatment responses and failures. The extended view of HSC transcriptional diversity offered by de novo transcript assembly argued against 5-AZA-responsive EREs as determinants of the outcome of therapy. Instead, it uncovered pre-treatment expression and alternative splicing of developmentally regulated gene transcripts as predictors of the response of MDS and CMML patients to 5-AZA treatment. Our study identifies the developmentally regulated transcriptional signatures of protein-coding and non-coding genes, rather than EREs, as correlates of a favourable response of MDS and CMML patients to 5-AZA treatment and offers novel candidates for further evaluation.

  • Recommendations for the collection and use of multiplexed functional data for clinical variant interpretation
    Genome Med. (IF 10.886) Pub Date : 2019-12-20
    Hannah Gelman; Jennifer N. Dines; Jonathan Berg; Alice H. Berger; Sarah Brnich; Fuki M. Hisama; Richard G. James; Alan F. Rubin; Jay Shendure; Brian Shirts; Douglas M. Fowler; Lea M. Starita

    Variants of uncertain significance represent a massive challenge to medical genetics. Multiplexed functional assays, in which the functional effects of thousands of genomic variants are assessed simultaneously, are increasingly generating data that can be used as additional evidence for or against variant pathogenicity. Such assays have the potential to resolve variants of uncertain significance, thereby increasing the clinical utility of genomic testing. Existing standards from the American College of Medical Genetics and Genomics (ACMG)/Association for Molecular Pathology (AMP) and new guidelines from the Clinical Genome Resource (ClinGen) establish the role of functional data in variant interpretation, but do not address the specific challenges or advantages of using functional data derived from multiplexed assays. Here, we build on these existing guidelines to provide recommendations to experimentalists for the production and reporting of multiplexed functional data and to clinicians for the evaluation and use of such data. By following these recommendations, experimentalists can produce transparent, complete, and well-validated datasets that are primed for clinical uptake. Our recommendations to clinicians and diagnostic labs on how to evaluate the quality of multiplexed functional datasets, and how different datasets could be incorporated into the ACMG/AMP variant-interpretation framework, will hopefully clarify whether and how such data should be used. The recommendations that we provide are designed to enhance the quality and utility of multiplexed functional data, and to promote their judicious use.

  • FIREVAT: finding reliable variants without artifacts in human cancer samples using etiologically relevant mutational signatures
    Genome Med. (IF 10.886) Pub Date : 2019-12-17
    Hyunbin Kim; Andy Jinseok Lee; Jongkeun Lee; Hyonho Chun; Young Seok Ju; Dongwan Hong

    Accurate identification of real somatic variants is a primary part of cancer genome studies and precision oncology. However, artifacts introduced in various steps of sequencing obfuscate confidence in variant calling. Current computational approaches to variant filtering involve intensive interrogation of Binary Alignment Map (BAM) files and require massive computing power, data storage, and manual labor. Recently, mutational signatures associated with sequencing artifacts have been extracted by the Pan-cancer Analysis of Whole Genomes (PCAWG) study. These spectrums can be used to evaluate refinement quality of a given set of somatic mutations. Here we introduce a novel variant refinement tool, FIREVAT (FInding REliable Variants without ArTifacts), which uses known spectrums of sequencing artifacts extracted from one of the largest publicly available catalogs of human tumor samples. FIREVAT performs a quick and efficient variant refinement that accurately removes artifacts and greatly improves the precision and specificity of somatic calls. We validated FIREVAT refinement performance using orthogonal sequencing datasets totaling 384 tumor samples with respect to ground truth. Our novel method achieved the highest level of performance compared to existing filtering approaches. Application of FIREVAT to an additional 308 samples from The Cancer Genome Atlas (TCGA) demonstrated that FIREVAT refinement leads to identification of more biologically and clinically relevant mutational signatures as well as enrichment of sequence contexts associated with experimental errors. FIREVAT requires only a Variant Call Format (VCF) file and generates a comprehensive report of the variant refinement processes and outcomes for the user. In summary, FIREVAT facilitates a novel refinement strategy using mutational signatures to distinguish artifactual point mutations called in human cancer samples.
We anticipate that FIREVAT results will further contribute to precision oncology efforts that rely on accurate identification of variants, especially in the context of analyzing mutational signatures that bear prognostic and therapeutic significance. FIREVAT is freely available at https://github.com/cgab-ncc/FIREVAT
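The idea of scoring a refined call set against known artifact spectra can be illustrated with a minimal sketch. This is not FIREVAT's actual objective function, which the abstract does not describe; it only shows one common way to compare a mutation spectrum with a signature, namely cosine similarity. The function name, the 6-channel toy vectors, and all counts below are hypothetical (real single-base substitution signatures use 96 trinucleotide channels).

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-negative spectra."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of channels")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty spectrum resembles nothing
    return dot / (norm_a * norm_b)

# Hypothetical toy data: 6 channels instead of the usual 96.
artifact_signature = [0.70, 0.10, 0.05, 0.05, 0.05, 0.05]
before_refinement = [60, 12, 8, 7, 7, 6]     # counts dominated by the artifact channel
after_refinement = [5, 30, 20, 15, 16, 14]   # counts after a hypothetical filter

sim_before = cosine_similarity(before_refinement, artifact_signature)
sim_after = cosine_similarity(after_refinement, artifact_signature)

# A lower similarity to the artifact signature after filtering suggests
# that the filter removed artifact-like calls.
print(f"before: {sim_before:.3f}, after: {sim_after:.3f}")
```

Under these assumptions, a filter could be judged by how far it lowers similarity to artifact signatures while preserving similarity to biologically plausible ones.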

  • Genomics of circadian rhythms in health and disease
    Genome Med. (IF 10.886) Pub Date : 2019-12-17
    Filipa Rijo-Ferreira; Joseph S. Takahashi

    Circadian clocks are endogenous oscillators that control 24-h physiological and behavioral processes. The central circadian clock exerts control over myriad aspects of mammalian physiology, including the regulation of sleep, metabolism, and the immune system. Here, we review advances in understanding the genetic regulation of sleep through the circadian system, as well as the impact of dysregulated gene expression on metabolic function. We also review recent studies that have begun to unravel the circadian clock’s role in controlling the cardiovascular and nervous systems, gut microbiota, cancer, and aging. Such circadian control of these systems relies, in part, on transcriptional regulation, with recent evidence for genome-wide regulation of the clock through circadian chromosome organization. These novel insights into the genomic regulation of human physiology provide opportunities for the discovery of improved treatment strategies and new understanding of the biological underpinnings of human disease.

  • Re-analysis of whole-exome sequencing data uncovers novel diagnostic variants and improves molecular diagnostic yields for sudden death and idiopathic diseases
    Genome Med. (IF 10.886) Pub Date : 2019-12-17
    Elias L. Salfati; Emily G. Spencer; Sarah E. Topol; Evan D. Muse; Manuel Rueda; Jonathan R. Lucas; Glenn N. Wagner; Steven Campman; Eric J. Topol; Ali Torkamani

    Whole-exome sequencing (WES) has become an efficient diagnostic test for patients with likely monogenic conditions such as rare idiopathic diseases or sudden unexplained death. Yet, many cases remain undiagnosed. Here, we report the added diagnostic yield achieved for 101 WES cases re-analyzed 1 to 7 years after initial analysis. Of the 101 WES cases, 51 were rare idiopathic disease cases and 50 were postmortem “molecular autopsy” cases of early sudden unexplained death. Variants considered for reporting were prioritized and classified into three groups: (1) diagnostic variants, pathogenic and likely pathogenic variants in genes known to cause the phenotype of interest; (2) possibly diagnostic variants, possibly pathogenic variants in genes known to cause the phenotype of interest or pathogenic variants in genes possibly causing the phenotype of interest; and (3) variants of uncertain diagnostic significance, potentially deleterious variants in genes possibly causing the phenotype of interest. Initial analysis revealed diagnostic variants in 13 rare disease cases (25.4%) and 5 sudden death cases (10%). Re-analysis resulted in the identification of additional diagnostic variants in 3 rare disease cases (5.9%) and 1 sudden unexplained death case (2%), which increased our molecular diagnostic yield to 31.4% and 12%, respectively. The new findings stemmed from improvements in variant classification tools, updated genetic databases, and updated clinical phenotypes. Our findings highlight the potential for re-analysis to reveal diagnostic variants in cases that remain undiagnosed after initial WES.
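The yield figures above follow directly from the stated case counts, as the short sketch below shows. The helper name `yield_pct` and the dictionary layout are illustrative only. Note that 13/51 is 25.49%, so the abstract's 25.4% appears to be truncated rather than rounded.

```python
def yield_pct(diagnosed: int, total: int) -> float:
    """Diagnostic yield as a percentage, rounded to one decimal place."""
    return round(100 * diagnosed / total, 1)

# Case counts as stated in the abstract.
cohorts = {
    "rare idiopathic disease": {"total": 51, "initial": 13, "added": 3},
    "sudden unexplained death": {"total": 50, "initial": 5, "added": 1},
}

for name, c in cohorts.items():
    initial = yield_pct(c["initial"], c["total"])
    added = yield_pct(c["added"], c["total"])
    final = yield_pct(c["initial"] + c["added"], c["total"])
    print(f"{name}: initial {initial}%, added {added}%, final {final}%")
```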

  • A KHDC3L mutation resulting in recurrent hydatidiform mole causes genome-wide DNA methylation loss in oocytes and persistent imprinting defects post-fertilisation
    Genome Med. (IF 10.886) Pub Date : 2019-12-17
    Hannah Demond; Zahra Anvar; Bahia Namavar Jahromi; Angela Sparago; Ankit Verma; Maryam Davari; Luciano Calzari; Silvia Russo; Mojgan Akbarzadeh Jahromi; David Monk; Simon Andrews; Andrea Riccio; Gavin Kelsey

    Maternal effect mutations in the components of the subcortical maternal complex (SCMC) of the human oocyte can cause early embryonic failure, gestational abnormalities and recurrent pregnancy loss. Enigmatically, they are also associated with DNA methylation abnormalities at imprinted genes in conceptuses: in the devastating gestational abnormality biparental complete hydatidiform mole (BiCHM) or in multi-locus imprinting disease (MLID). However, the developmental timing, genomic extent and mechanistic basis of these imprinting defects are unknown. The rarity of these disorders and the possibility that methylation defects originate in oocytes have made these questions very challenging to address. Single-cell bisulphite sequencing (scBS-seq) was used to assess methylation in oocytes from a patient with BiCHM identified to be homozygous for an inactivating mutation in the human SCMC component KHDC3L. Genome-wide methylation analysis of a preimplantation embryo and molar tissue from the same patient was also performed. High-coverage scBS-seq libraries were obtained from five KHDC3L c.1A>G oocytes, which revealed a genome-wide deficit of DNA methylation compared with normal human oocytes. Importantly, germline differentially methylated regions (gDMRs) of imprinted genes were affected similarly to other sequence features that normally become methylated in oocytes, indicating no selectivity towards imprinted genes. A range of methylation losses was observed across genomic features, including gDMRs, indicating variable sensitivity to defects in the SCMC. Genome-wide analysis of a pre-implantation embryo and molar tissue from the same patient showed that following fertilisation methylation defects at imprinted genes persist, while most non-imprinted regions of the genome recover near-normal methylation post-implantation. We show for the first time that the integrity of the SCMC is essential for de novo methylation in the female germline.
These findings have important implications for understanding the role of the SCMC in DNA methylation and for the origin of imprinting defects, for counselling affected families, and will help inform future therapeutic approaches.

  • Distinct patterns of complex rearrangements and a mutational signature of microhomeology are frequently observed in PLP1 copy number gain structural variants
    Genome Med. (IF 10.886) Pub Date : 2019-12-09
    Vahid Bahrambeigi; Xiaofei Song; Karen Sperle; Christine R. Beck; Hadia Hijazi; Christopher M. Grochowski; Shen Gu; Pavel Seeman; Karen J. Woodward; Claudia M. B. Carvalho; Grace M. Hobson; James R. Lupski

    We investigated the features of the genomic rearrangements in a cohort of 50 male individuals with proteolipid protein 1 (PLP1) copy number gain events who were ascertained with Pelizaeus-Merzbacher disease (PMD; MIM: 312080). We then compared our new data to previous structural variant mutagenesis studies involving the Xq22 region of the human genome. The aggregate data from 159 sequenced join-points (discontinuous sequences in the reference genome that are joined during the rearrangement process) were studied. Analysis of these data from 150 individuals enabled the spectrum and relative distribution of the underlying genomic mutational signatures to be delineated. Genomic rearrangements in PMD individuals with PLP1 copy number gain events were investigated by high-density customized array or clinical chromosomal microarray analysis and breakpoint junction sequence analysis. High-density customized array showed that the majority of cases (33/50; ~ 66%) present with single duplications, although complex genomic rearrangements (CGRs) are also frequent (17/50; ~ 34%). Breakpoint mapping to nucleotide resolution revealed further previously unknown structural and sequence complexities, even in single duplications. Meta-analysis of all studied rearrangements that occur at the PLP1 locus showed that single duplications were found in ~ 54% of individuals and that, among all CGR cases, triplication flanked by duplications is the most frequent CGR array CGH pattern observed. Importantly, in ~ 32% of join-points, there is evidence for a mutational signature of microhomeology (highly similar yet imperfect sequence matches). These data reveal a high frequency of CGRs at the PLP1 locus and support the assertion that replication-based mechanisms are prominent contributors to the formation of CGRs at Xq22. 
We propose that microhomeology can facilitate template switching, by stabilizing strand annealing of the primer using W-C base complementarity, and is a mutational signature for replicative repair.
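The distinction between perfect microhomology and microhomeology ("highly similar yet imperfect sequence matches") can be made concrete with a small sketch. Everything here is an illustrative assumption rather than the authors' method: the function names, the ungapped position-by-position comparison, and especially the 70% identity cutoff, which is a placeholder.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)

def classify_junction(a: str, b: str, min_identity: float = 70.0) -> str:
    """Label the similarity between two join-point flanking sequences.

    min_identity is a hypothetical cutoff chosen for illustration.
    """
    identity = percent_identity(a, b)
    if identity == 100.0:
        return "microhomology"    # perfect match
    if identity >= min_identity:
        return "microhomeology"   # highly similar but imperfect match
    return "no homology"

print(classify_junction("ACGTACGTAC", "ACGTACGTAC"))
print(classify_junction("ACGTACGTAC", "ACGTTCGTAC"))
print(classify_junction("AAAAAAAAAA", "CCCCCCCCCC"))
```

A real analysis would align the sequences flanking each breakpoint first; the position-by-position comparison here assumes they are already aligned without gaps.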

  • Prioritization of genes driving congenital phenotypes of patients with de novo genomic structural variants
    Genome Med. (IF 10.886) Pub Date : 2019-12-04
    Sjors Middelkamp; Judith M. Vlaar; Jacques Giltay; Jerome Korzelius; Nicolle Besselink; Sander Boymans; Roel Janssen; Lisanne de la Fonteijne; Ellen van Binsbergen; Markus J. van Roosmalen; Ron Hochstenbach; Daniela Giachino; Michael E. Talkowski; Wigard P. Kloosterman; Edwin Cuppen

    Genomic structural variants (SVs) can affect many genes and regulatory elements. Therefore, the molecular mechanisms driving the phenotypes of patients carrying de novo SVs are frequently unknown. We applied a combination of systematic experimental and bioinformatic methods to improve the molecular diagnosis of 39 patients with multiple congenital abnormalities and/or intellectual disability harboring apparent de novo SVs, most with an inconclusive diagnosis after regular genetic testing. In 7 of these cases (18%), whole-genome sequencing analysis revealed disease-relevant complexities of the SVs missed in routine microarray-based analyses. We developed a computational tool to predict the effects on genes directly affected by SVs and on genes indirectly affected likely due to the changes in chromatin organization and impact on regulatory mechanisms. By combining these functional predictions with extensive phenotype information, candidate driver genes were identified in 16/39 (41%) patients. In 8 cases, evidence was found for the involvement of multiple candidate drivers contributing to different parts of the phenotypes. Subsequently, we applied this computational method to two cohorts containing a total of 379 patients with previously detected and classified de novo SVs and identified candidate driver genes in 189 cases (50%), including 40 cases whose SVs were previously not classified as pathogenic. Pathogenic position effects were predicted in 28% of all studied cases with balanced SVs and in 11% of the cases with copy number variants. These results demonstrate an integrated computational and experimental approach to predict driver genes based on analyses of WGS data with phenotype association and chromatin organization datasets. These analyses nominate new pathogenic loci and have strong potential to improve the molecular diagnosis of patients with de novo SVs.

  • Text-mining clinically relevant cancer biomarkers for curation into the CIViC database
    Genome Med. (IF 10.886) Pub Date : 2019-12-03
    Jake Lever; Martin R. Jones; Arpad M. Danos; Kilannin Krysiak; Melika Bonakdar; Jasleen K. Grewal; Luka Culibrk; Obi L. Griffith; Malachi Griffith; Steven J. M. Jones

    Precision oncology involves analysis of individual cancer samples to understand the genes and pathways involved in the development and progression of a cancer. To improve patient care, knowledge of diagnostic, prognostic, predisposing, and drug response markers is essential. Several knowledgebases have been created by different groups to collate evidence for these associations. These include the open-access Clinical Interpretation of Variants in Cancer (CIViC) knowledgebase. These databases rely on time-consuming manual curation from skilled experts who read and interpret the relevant biomedical literature. To aid in this curation and provide the greatest coverage for these databases, particularly CIViC, we propose the use of text mining approaches to extract these clinically relevant biomarkers from all available published literature. To this end, a group of cancer genomics experts annotated sentences that discussed biomarkers with their clinical associations and achieved good inter-annotator agreement. We then used a supervised learning approach to construct the CIViCmine knowledgebase. We extracted 121,589 relevant sentences from PubMed abstracts and PubMed Central Open Access full-text papers. CIViCmine contains over 87,412 biomarkers associated with 8035 genes, 337 drugs, and 572 cancer types, representing 25,818 abstracts and 39,795 full-text publications. Through integration with CIViC, we provide a prioritized list of curatable clinically relevant cancer biomarkers as well as a resource that is valuable to other knowledgebases and precision cancer analysts in general. All data is publicly available and distributed with a Creative Commons Zero license. The CIViCmine knowledgebase is available at http://bionlp.bcgsc.ca/civicmine/.

  • Comparative analysis of functional assay evidence use by ClinGen Variant Curation Expert Panels
    Genome Med. (IF 10.886) Pub Date : 2019-11-29
    Dona M. Kanavy; Shannon M. McNulty; Meera K. Jairath; Sarah E. Brnich; Chris Bizon; Bradford C. Powell; Jonathan S. Berg

    The 2015 American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) guidelines for clinical sequence variant interpretation state that “well-established” functional studies can be used as evidence in variant classification. These guidelines articulated key attributes of functional data, including that assays should reflect the biological environment and be analytically sound; however, details of how to evaluate these attributes were left to expert judgment. The Clinical Genome Resource (ClinGen) designates Variant Curation Expert Panels (VCEPs) in specific disease areas to make gene-centric specifications to the ACMG/AMP guidelines, including more specific definitions of appropriate functional assays. We set out to evaluate the existing VCEP guidelines for functional assays. We evaluated the functional criteria (PS3/BS3) of six VCEPs (CDH1, Hearing Loss, Inherited Cardiomyopathy-MYH7, PAH, PTEN, RASopathy). We then established criteria for evaluating functional studies based on disease mechanism, general class of assay, and the characteristics of specific assay instances described in the primary literature. Using these criteria, we extensively curated assay instances cited by each VCEP in their pilot variant classification to analyze VCEP recommendations and their use in the interpretation of functional studies. Unsurprisingly, our analysis highlighted the breadth of VCEP-approved assays, reflecting the diversity of disease mechanisms among VCEPs. We also noted substantial variability between VCEPs in the method used to select these assays and in the approach used to specify strength modifications, as well as differences in suggested validation parameters. Importantly, we observed discrepancies between the parameters VCEPs specified as required for approved assay instances and the fulfillment of these requirements in the individual assays cited in pilot variant interpretation. 
Interpretation of the intricacies of functional assays often requires expert-level knowledge of the gene and disease, and current VCEP recommendations for functional assay evidence are a useful tool to improve the accessibility of functional data by providing a starting point for curators to identify approved functional assays and key metrics. However, our analysis suggests that further guidance is needed to standardize this process and ensure consistency in the application of functional evidence.

  • Standard operating procedure for curation and clinical interpretation of variants in cancer
    Genome Med. (IF 10.886) Pub Date : 2019-11-29
    Arpad M. Danos; Kilannin Krysiak; Erica K. Barnell; Adam C. Coffman; Joshua F. McMichael; Susanna Kiwala; Nicholas C. Spies; Lana M. Sheta; Shahil P. Pema; Lynzey Kujan; Kaitlin A. Clark; Amber Z. Wollam; Shruti Rao; Deborah I. Ritter; Dmitriy Sonkin; Gordana Raca; Wan-Hsin Lin; Cameron J. Grisdale; Raymond H. Kim; Alex H. Wagner; Subha Madhavan; Malachi Griffith; Obi L. Griffith

    Manually curated variant knowledgebases and their associated knowledge models are serving an increasingly important role in distributing and interpreting variants in cancer. These knowledgebases vary in their level of public accessibility and in the complexity of the models used to capture clinical knowledge. CIViC (Clinical Interpretation of Variants in Cancer; www.civicdb.org) is a fully open, free-to-use cancer variant interpretation knowledgebase that incorporates highly detailed curation of evidence obtained from peer-reviewed publications and meeting abstracts, and currently holds over 6300 Evidence Items for over 2300 variants derived from over 400 genes. CIViC has seen increased adoption by, and also undertaken collaboration with, a wide range of users and organizations involved in research. To enhance CIViC’s clinical value, regular submission to the ClinVar database and pursuit of other regulatory approvals are necessary. For this reason, a formal peer-reviewed curation guideline and a discussion of the underlying principles of curation are needed. We present here the CIViC knowledge model, standard operating procedures (SOP) for variant curation, and detailed examples to support community-driven curation of cancer variants.

  • Genomic screening and genomic diagnostic testing—two very different kettles of fish
    Genome Med. (IF 10.886) Pub Date : 2019-11-27
    Leslie G. Biesecker

  • Immune receptor repertoires in pediatric and adult acute myeloid leukemia
    Genome Med. (IF 10.886) Pub Date : 2019-11-26
    Jian Zhang; Xihao Hu; Jin Wang; Avinash Das Sahu; David Cohen; Li Song; Zhangyi Ouyang; Jingyu Fan; Binbin Wang; Jingxin Fu; Shengqing Gu; Moshe Sade-Feldman; Nir Hacohen; Wuju Li; Xiaomin Ying; Bo Li; X. Shirley Liu

    Acute myeloid leukemia (AML), caused by the abnormal proliferation of immature myeloid cells in the blood or bone marrow, is one of the most common hematologic malignancies. Currently, the interactions between malignant myeloid cells and the immune microenvironment, especially T cells and B cells, remain poorly characterized. In this study, we systematically analyzed the T cell receptor and B cell receptor (TCR and BCR) repertoires from the RNA-seq data of 145 pediatric and 151 adult AML samples as well as 73 non-tumor peripheral blood samples. We inferred over 225,000 complementarity-determining region 3 (CDR3) sequences in TCR α, β, γ, and δ chains and 1,210,000 CDR3 sequences in B cell immunoglobulin (Ig) heavy and light chains. We found higher clonal expansion of both T cells and B cells in the AML microenvironment and observed many differences between pediatric and adult AML. Most notably, adult AML samples have significantly higher level of B cell activation and more secondary Ig class switch events than pediatric AML or non-tumor samples. Furthermore, adult AML with highly expanded IgA2 B cells, which might represent an immunosuppressive microenvironment, are associated with regulatory T cells and worse overall survival. Our comprehensive characterization of the AML immune receptor repertoires improved our understanding of T cell and B cell immunity in AML, which may provide insights into immunotherapies in hematological malignancies.

  • Low coverage whole genome sequencing enables accurate assessment of common variants and calculation of genome-wide polygenic scores
    Genome Med. (IF 10.886) Pub Date : 2019-11-26
    Julian R. Homburger; Cynthia L. Neben; Gilad Mishne; Alicia Y. Zhou; Sekar Kathiresan; Amit V. Khera

    Inherited susceptibility to common, complex diseases may be caused by rare, pathogenic variants (“monogenic”) or by the cumulative effect of numerous common variants (“polygenic”). Comprehensive genome interpretation should enable assessment for both monogenic and polygenic components of inherited risk. The traditional approach requires two distinct genetic testing technologies: high coverage sequencing of known genes to detect monogenic variants and a genome-wide genotyping array followed by imputation to calculate genome-wide polygenic scores (GPSs). We assessed the feasibility and accuracy of using low coverage whole genome sequencing (lcWGS) as an alternative to genotyping arrays to calculate GPSs. First, we performed downsampling and imputation of WGS data from ten individuals to assess concordance with known genotypes. Second, we assessed the correlation between GPSs for 3 common diseases, namely coronary artery disease (CAD), breast cancer (BC), and atrial fibrillation (AF), calculated using lcWGS and genotyping array in 184 samples. Third, we assessed concordance of lcWGS-based genotype calls and GPS calculation in 120 individuals with known genotypes, selected to reflect diverse ancestral backgrounds. Fourth, we assessed the relationship between GPSs calculated using lcWGS and disease phenotypes in a cohort of 11,502 individuals of European ancestry. We found imputation accuracy r² values of greater than 0.90 for all ten samples, including those of African and Ashkenazi Jewish ancestry, with lcWGS data at 0.5×. GPSs calculated using lcWGS and genotyping array followed by imputation in 184 individuals were highly correlated for each of the 3 common diseases (r² = 0.93–0.97) with similar score distributions. Using lcWGS data from 120 individuals of diverse ancestral backgrounds, we found similar results with respect to imputation accuracy and GPS correlations. Finally, we calculated GPSs for CAD, BC, and AF using lcWGS in 11,502 individuals of European ancestry, confirming odds ratios per standard deviation increment ranging from 1.28 to 1.59, consistent with previous studies. lcWGS is an alternative technology to genotyping arrays for common genetic variant assessment and GPS calculation. lcWGS provides comparable imputation accuracy while also overcoming the ascertainment bias inherent to variant selection in genotyping array design.
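
    As a rough illustration of the comparison described in this abstract, the sketch below computes a polygenic score as a weighted sum of allele dosages and then measures agreement between two sets of scores with a squared Pearson correlation. It is a minimal sketch, not the authors' pipeline: the function names, effect weights, and dosage values are all hypothetical and chosen only to make the example runnable.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of per-variant allele dosages (each between 0 and 2)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-variant effect weights (not taken from the paper).
weights = [0.10, -0.05, 0.20]

# Hypothetical dosages for three individuals: hard calls from an array,
# and fractional dosages as imputation from low coverage data might give.
array_dosages = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]
lcwgs_dosages = [[0.1, 0.9, 1.8], [1.0, 1.2, 0.1], [1.9, 0.0, 1.1]]

array_scores = [polygenic_score(d, weights) for d in array_dosages]
lcwgs_scores = [polygenic_score(d, weights) for d in lcwgs_dosages]

# Agreement between the two platforms on this toy data.
concordance = r_squared(array_scores, lcwgs_scores)
print(f"r^2 between array and lcWGS scores: {concordance:.3f}")
```

    Real scores sum over many thousands to millions of variants, and the weights would come from published association studies; the toy data here only show the shape of the calculation.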

  • Is ‘likely pathogenic’ really 90% likely? Reclassification data in ClinVar
    Genome Med. (IF 10.886) Pub Date : 2019-11-21
    Steven M. Harrison; Heidi L. Rehm

    In 2015, professional guidelines defined the term ‘likely pathogenic’ to mean with a 90% chance of pathogenicity. To determine whether current practice reflects this definition, ClinVar classifications were tracked from 2016 to 2019. During that period, between 83.8 and 99.1% of likely pathogenic classifications were reclassified as pathogenic, depending on whether LP to VUS reclassifications are included and on how these classifications are categorized.

  • Artificial intelligence in clinical and genomic diagnostics
    Genome Med. (IF 10.886) Pub Date : 2019-11-19
    Raquel Dias; Ali Torkamani

    Artificial intelligence (AI) is the development of computer systems that are able to perform tasks that normally require human intelligence. Advances in AI software and hardware, especially deep learning algorithms and the graphics processing units (GPUs) that power their training, have led to a recent and rapidly increasing interest in medical AI applications. In clinical diagnostics, AI-based computer vision approaches are poised to revolutionize image-based diagnostics, while other AI subtypes have begun to show similar promise in various diagnostic modalities. In some areas, such as clinical genomics, a specific type of AI algorithm known as deep learning is used to process large and complex genomic datasets. In this review, we first summarize the main classes of problems that AI systems are well suited to solve and describe the clinical diagnostic tasks that benefit from these solutions. Next, we focus on emerging methods for specific tasks in clinical genomics, including variant calling, genome annotation and variant classification, and phenotype-to-genotype correspondence. Finally, we end with a discussion on the future potential of AI in individualized medicine applications, especially for risk prediction in common complex diseases, and the challenges, limitations, and biases that must be carefully addressed for the successful deployment of AI in medical applications, particularly those utilizing human genetics and genomics data.

  • Neoantigens and genome instability: impact on immunogenomic phenotypes and immunotherapy response
    Genome Med. (IF 10.886) Pub Date : 2019-11-20
    Elaine R. Mardis

    The resurgence of immune therapies in cancer medicine has elicited a corresponding interest in understanding the basis of patient response or resistance to these treatments. One aspect of patient response clearly lies in the genomic alterations that are associated with cancer onset and progression, including those that contribute to genomic instability and the resulting creation of novel peptide sequences that may present as neoantigens. The immune reaction to these unique ‘non-self’ peptides is frequently suppressed by the tumor itself, but the use of checkpoint blockade therapies, personalized vaccines, or a combination of these treatments may elicit a tumor-specific immune response that results in cell death. Massively parallel sequencing, coupled with different computational analyses, provides unbiased identification of the germline and somatic alterations that drive cancer development, and of those alterations that lead to neoantigens. These range from simple point mutations that change single amino acids to complex alterations, such as frameshift insertion or deletion mutations, splice-site alterations that lead to exon skipping, structural alterations that lead to the formation of fusion proteins, and other forms of collateral damage caused by genome instability that result in new protein sequences unique to the cancer. The various genome instability phenotypes can be identified as alterations that impact DNA replication or mismatch repair pathways or by their genomic signatures. This review provides an overview of current knowledge regarding the fundamentals of genome replication and of both germline and somatic alterations that disrupt normal replication, leading to various forms of genomic instability in cancers, to the resulting generation of neoantigens and, ultimately, to immune-responsive and resistant phenotypes.

  • Translating insights into tumor evolution to clinical practice: promises and challenges
    Genome Med. (IF 10.886) Pub Date : 2019-03-29
    Matthew W. Fittall; Peter Van Loo

    Accelerating technological advances have allowed the widespread genomic profiling of tumors. As yet, however, the vast catalogues of mutations that have been identified have made only a modest impact on clinical medicine. Massively parallel sequencing has informed our understanding of the genetic evolution and heterogeneity of cancers, allowing us to place these mutational catalogues into a meaningful context. Here, we review the methods used to measure tumor evolution and heterogeneity, and the potential and challenges for translating the insights gained to achieve clinical impact for cancer therapy, monitoring, early detection, risk stratification, and prevention. We discuss how tumor evolution can guide cancer therapy by targeting clonal and subclonal mutations both individually and in combination. Circulating tumor DNA and circulating tumor cells can be leveraged for monitoring the efficacy of therapy and for tracking the emergence of resistant subclones. The evolutionary history of tumors can be deduced for late-stage cancers, either directly by sampling precursor lesions or by leveraging computational approaches to infer the timing of driver events. This approach can identify recurrent early driver mutations that represent promising avenues for future early detection strategies. Emerging evidence suggests that mutational processes and complex clonal dynamics are active even in normal development and aging. This will make discriminating developing malignant neoplasms from normal aging cell lineages a challenge. Furthermore, insight into signatures of mutational processes that are active early in tumor evolution may allow the development of cancer-prevention approaches. Research and clinical studies that incorporate an appreciation of the complex evolutionary patterns in tumors will not only produce more meaningful genomic data, but also better exploit the vulnerabilities of cancer, resulting in improved treatment outcomes.

  • CRISPR-SONIC: targeted somatic oncogene knock-in enables rapid in vivo cancer modeling
    Genome Med. (IF 10.886) Pub Date : 2019-04-16
    Haiwei Mou; Deniz M. Ozata; Jordan L. Smith; Ankur Sheel; Suet-Yan Kwan; Soren Hough; Alper Kucukural; Zachary Kennedy; Yueying Cao; Wen Xue

    CRISPR/Cas9 has revolutionized cancer mouse models. Although loss-of-function genetics by CRISPR/Cas9 is well-established, generating gain-of-function alleles in somatic cancer models is still challenging because of the low efficiency of gene knock-in. Here we developed CRISPR-based Somatic Oncogene kNock-In for Cancer Modeling (CRISPR-SONIC), a method for rapid in vivo cancer modeling using homology-independent repair to integrate oncogenes at a targeted genomic locus. Using a dual guide RNA strategy, we integrated a plasmid donor in the 3′-UTR of mouse β-actin, allowing co-expression of reporter genes or oncogenes from the β-actin promoter. We showed that knock-in of oncogenic Ras and loss of p53 efficiently induced intrahepatic cholangiocarcinoma in mice. Further, our strategy can generate bioluminescent liver cancer to facilitate tumor imaging. This method simplifies in vivo gain-of-function genetics by facilitating targeted integration of oncogenes.

  • Designing circulating tumor DNA-based interventional clinical trials in oncology
    Genome Med. (IF 10.886) Pub Date : 2019-04-19
    Daniel V. Araujo; Scott V. Bratman; Lillian L. Siu

    Circulating tumor (ct) DNA is a powerful tool that can be used to track cancer beyond a single snapshot in space and time. It has potential applications in detecting minimal residual disease and predicting relapse, in selecting patients for tailored treatments, and in revealing mechanisms of response or resistance. Here, we discuss the incorporation of ctDNA into clinical trials.

  • TCF21 and AP-1 interact through epigenetic modifications to regulate coronary artery disease gene expression
    Genome Med. (IF 10.886) Pub Date : 2019-04-23
    Quanyi Zhao; Robert Wirka; Trieu Nguyen; Manabu Nagao; Paul Cheng; Clint L. Miller; Juyong Brian Kim; Milos Pjanic; Thomas Quertermous

    Genome-wide association studies have identified over 160 loci that are associated with coronary artery disease. As with other complex human diseases, risk in coronary disease loci is determined primarily by altered expression of the causal gene, due to variation in binding of transcription factors and chromatin-modifying proteins that directly regulate the transcriptional apparatus. We have previously identified a coronary disease network downstream of the disease-associated transcription factor TCF21, and in work reported here extend these studies to investigate the mechanisms by which it interacts with the AP-1 transcription complex to regulate local epigenetic effects in these downstream coronary disease loci. Genomic studies, including chromatin immunoprecipitation sequencing, RNA sequencing, and protein-protein interaction studies, were performed in human coronary artery smooth muscle cells. We show here that TCF21 and JUN regulate expression of two presumptive causal coronary disease genes, SMAD3 and CDKN2B-AS1, in part by interactions with histone deacetylases and acetyltransferases. Genome-wide TCF21 and JUN binding is jointly localized and particularly enriched in coronary disease loci where they broadly modulate H3K27Ac and chromatin state changes linked to disease-related processes in vascular cells. Heterozygosity at coronary disease causal variation, or genome editing of these variants, is associated with decreased binding of both JUN and TCF21 and loss of expression in cis, supporting a transcriptional mechanism for disease risk. These data show that the known chromatin remodeling and pioneer functions of AP-1 are a pervasive aspect of epigenetic control of transcription, and thus, the risk in coronary disease-associated loci, and that interaction of AP-1 with TCF21 to control epigenetic features contributes to the genetic risk in loci where they co-localize.

  • Molecular basis for phenotypic similarity of genetic disorders
    Genome Med. (IF 10.886) Pub Date : 2019-04-23
    Vijay Kumar Pounraja; Santhosh Girirajan

    The contribution of distinct genes to overlapping phenotypes suggests that such genes share ancestral origins, membership of disease pathways, or molecular functions. A recent study by Liu and colleagues identified mutations in TCF20, a paralog of RAI1, among individuals manifesting a novel syndrome that has phenotypes similar to those of Smith-Magenis syndrome (a disorder caused by disruption of RAI1). This study highlights how structural similarity among genes contributes to shared phenotypes, and shows how this relationship can contribute to our understanding of the genetic basis of complex disorders.

  • Interchromosomal template-switching as a novel molecular mechanism for imprinting perturbations associated with Temple syndrome
    Genome Med. (IF 10.886) Pub Date : 2019-04-23
    Claudia M. B. Carvalho; Zeynep Coban-Akdemir; Hadia Hijazi; Bo Yuan; Matthew Pendleton; Eoghan Harrington; John Beaulaurier; Sissel Juul; Daniel J. Turner; Rupa S. Kanchi; Shalini N. Jhangiani; Donna M. Muzny; Richard A. Gibbs; Pawel Stankiewicz; John W. Belmont; Chad A. Shaw; Sau Wai Cheung; Neil A. Hanchard; V. Reid Sutton; Patricia I. Bader; James R. Lupski

    Intrachromosomal triplications (TRP) can contribute to disease etiology via gene dosage effects, gene disruption, position effects, or fusion gene formation. Recently, post-zygotic de novo triplications adjacent to copy-number neutral genomic intervals with runs of homozygosity (ROH) have been shown to result in uniparental isodisomy (UPD). The genomic structure of these complex genomic rearrangements (CGRs) shows a consistent pattern of an inverted triplication flanked by duplications (DUP-TRP/INV-DUP) formed by an iterative DNA replisome template-switching mechanism during replicative repair of a single-ended, double-stranded DNA (seDNA); the ROH results from an interhomolog or nonsister chromatid template switch. It has been postulated that these CGRs may lead to genetic abnormalities in carriers due to dosage-sensitive genes mapping within the copy-number variant regions, homozygosity for alleles at a locus causing an autosomal recessive (AR) disease trait within the ROH region, or imprinting-associated diseases. Here, we report a family wherein the affected subject carries a de novo 2.2-Mb TRP followed by 42.2 Mb of ROH and manifests clinical features overlapping with those observed in association with chromosome 14 maternal UPD (UPD(14)mat). UPD(14)mat can cause clinical phenotypic features enabling a diagnosis of Temple syndrome. This CGR was then molecularly characterized by high-density custom aCGH, genome-wide single-nucleotide polymorphism (SNP) and methylation arrays, exome sequencing (ES), and the Oxford Nanopore long-read sequencing technology. We confirmed the postulated DUP-TRP/INV-DUP structure by multiple orthogonal genomic technologies in the proband. The methylation status of known differentially methylated regions (DMRs) on chromosome 14 revealed that the subject shows the typical methylation pattern of UPD(14)mat. Consistent with these molecular findings, the clinical features overlap with those observed in Temple syndrome, including speech delay. These data provide experimental evidence that, in humans, triplication can lead to segmental UPD and imprinting disease. Importantly, genotype/phenotype analyses further reveal how a post-zygotically generated complex structural variant, resulting from a replication-based mutational mechanism, contributes to expanding the clinical phenotype of known genetic syndromes. Mechanistically, such events can distort transmission genetics, resulting in homozygosity at a locus for which only one parent is a carrier, as well as cause imprinting diseases.

  • Evidence from genome wide association studies implicates reduced control of Epstein-Barr virus infection in multiple sclerosis susceptibility
    Genome Med. (IF 10.886) Pub Date : 2019-04-30
    Ali Afrasiabi; Grant P. Parnell; Nicole Fewings; Stephen D. Schibeci; Monica A. Basuki; Ramya Chandramohan; Yuan Zhou; Bruce Taylor; David A. Brown; Sanjay Swaminathan; Fiona C. McKay; Graeme J. Stewart; David R. Booth

    Genome wide association studies have identified > 200 susceptibility loci accounting for much of the heritability of multiple sclerosis (MS). Epstein-Barr virus (EBV), a memory B cell tropic virus, has been identified as necessary but not sufficient for development of MS. The molecular and immunological basis for this has not been established. Infected B cell proliferation is driven by signalling through the EBV produced cell surface protein LMP1, a homologue of the MS risk gene CD40. We have investigated transcriptomes of B cells and EBV-infected B cells at Latency III (LCLs) and identified MS risk genes with altered expression on infection and with expression levels associated with the MS risk genotype (LCLeQTLs). The association of LCLeQTL genomic burden with EBV phenotypes in vitro and in vivo was examined. The risk genotype effect on LCL proliferation with CD40 stimulation was assessed. These LCLeQTL MS risk SNP:gene pairs (47 identified) were over-represented in genes dysregulated between B and LCLs (p < 1.53 × 10⁻⁴), and as target loci of the EBV transcription factor EBNA2 (p < 3.17 × 10⁻¹⁶). Overall genetic burden of LCLeQTLs was associated with some EBV phenotypes but not others. Stimulation of the CD40 pathway by CD40L reduced LCL proliferation (p < 0.001), dependent on CD40 and TRAF3 MS risk genotypes. Both CD40 and TRAF3 risk SNPs are in binding sites for the EBV transcription factor EBNA2, with expression of each correlated with EBNA2 expression dependent on genotype. These data indicate targeting EBV may be of therapeutic benefit in MS.

  • A modular transcriptome map of mature B cell lymphomas
    Genome Med. (IF 10.886) Pub Date : 2019-04-30
    Henry Loeffler-Wirth; Markus Kreuz; Lydia Hopp; Arsen Arakelyan; Andrea Haake; Sergio B. Cogliatti; Alfred C. Feller; Martin-Leo Hansmann; Dido Lenze; Peter Möller; Hans Konrad Müller-Hermelink; Erik Fortenbacher; Edith Willscher; German Ott; Andreas Rosenwald; Christiane Pott; Carsten Schwaenen; Heiko Trautmann; Swen Wessendorf; Harald Stein; Monika Szczepanowski; Lorenz Trümper; Michael Hummel; Wolfram Klapper; Reiner Siebert; Markus Loeffler; Hans Binder

    Germinal center-derived B cell lymphomas are tumors of the lymphoid tissues representing one of the most heterogeneous malignancies. Here we characterize the variety of transcriptomic phenotypes of this disease based on 873 biopsy specimens collected in the German Cancer Aid MMML (Molecular Mechanisms in Malignant Lymphoma) consortium. They include diffuse large B cell lymphoma (DLBCL), follicular lymphoma (FL), Burkitt’s lymphoma, mixed FL/DLBCL lymphomas, primary mediastinal large B cell lymphoma, multiple myeloma, IRF4-rearranged large cell lymphoma, MYC-negative Burkitt-like lymphoma with chr. 11q aberration and mantle cell lymphoma. We apply self-organizing map (SOM) machine learning to microarray-derived expression data to generate a holistic view of the transcriptome landscape of lymphomas, to describe the multidimensional nature of gene regulation and to pursue a modular view of co-expression. Expression data were complemented by pathological, genetic and clinical characteristics. We present a transcriptome map of B cell lymphomas that allows visual comparison between the SOM portraits of different lymphoma strata and individual cases. It decomposes into a dozen modules of co-expressed genes related to different functional categories, to genetic defects and to the pathogenesis of lymphomas. At the molecular level, this disease forms a continuum of expression states rather than clearly separated phenotypes. We introduce the concept of combinatorial pattern types (PATs), which stratifies the lymphomas into nine PAT groups and, on a coarser level, into five prominent cancer hallmark types with proliferation, inflammation and stroma signatures. Inflammation signatures in combination with healthy B cell and tonsil characteristics associate with better overall survival rates, while proliferation in combination with inflammation and plasma cell characteristics associates with worse overall survival. A phenotypic similarity tree is presented that reveals possible progression paths along the transcriptional dimensions. Our analysis provides a novel look at the transition range between FL and DLBCL, at DLBCL with poor prognosis showing expression patterns resembling those of Burkitt’s lymphoma, and particularly at ‘double-hit’ MYC and BCL2 transformed lymphomas. The transcriptome map provides a tool that aggregates, refines and visualizes the data collected in the MMML study and interprets them in the light of previous knowledge to provide orientation and support in current and future studies on lymphomas and on other cancer entities.

  • Multi-omics discovery of exome-derived neoantigens in hepatocellular carcinoma
    Genome Med. (IF 10.886) Pub Date : 2019-04-30
    Markus W. Löffler; Christopher Mohr; Leon Bichmann; Lena Katharina Freudenmann; Mathias Walzer; Christopher M. Schroeder; Nico Trautwein; Franz J. Hilke; Raphael S. Zinser; Lena Mühlenbruch; Daniel J. Kowalewski; Heiko Schuster; Marc Sturm; Jakob Matthes; Olaf Riess; Stefan Czemmel; Sven Nahnsen; Ingmar Königsrainer; Karolin Thiel; Silvio Nadalin; Stefan Beckert; Hans Bösmüller; Falko Fend; Ana Velic; Boris Maček; Sebastian P. Haen; Luigi Buonaguro; Oliver Kohlbacher; Stefan Stevanović; Alfred Königsrainer; Hans-Georg Rammensee

    Although mutated HLA ligands are considered ideal cancer-specific immunotherapy targets, evidence for their presentation is lacking in hepatocellular carcinomas (HCCs). Employing a unique multi-omics approach comprising a neoepitope identification pipeline, we assessed exome-derived mutations naturally presented as HLA class I ligands in HCCs. In-depth multi-omics analyses included whole exome and transcriptome sequencing to define individual patient-specific search spaces of neoepitope candidates. Evidence for the natural presentation of mutated HLA ligands was investigated through an in silico pipeline integrating proteome and HLA ligandome profiling data. The approach was successfully validated in a state-of-the-art dataset from malignant melanoma, and despite multi-omics evidence for somatic mutations, mutated naturally presented HLA ligands remained elusive in HCCs. An analysis of extensive cancer datasets confirmed fundamental differences of tumor mutational burden in HCC and malignant melanoma, challenging the notion that exome-derived mutations contribute relevantly to the expected neoepitope pool in malignancies with only a few mutations. This study suggests that exome-derived mutated HLA ligands appear to be rarely presented in HCCs, inter alia resulting from a low mutational burden as compared to other malignancies such as malignant melanoma. Our results therefore demand widening the target scope for personalized immunotherapy beyond this limited range of mutated neoepitopes, particularly for malignancies with similar or lower mutational burden.

  • Discovery and characterization of actionable tumor antigens
    Genome Med. (IF 10.886) Pub Date : 2019-04-30
    Grégory Ehx; Claude Perreault

    The nature of the tumor antigens that are detectable by T cells remains unclear. In melanoma, T cells were shown to react against major histocompatibility complex (MHC)-associated peptides (MAPs) that are derived from exonic mutations. A recent multi-omic study of hepatocellular carcinomas suggests, however, that mutated exonic MAPs were exceedingly rare, bringing the accuracy of the current methods for antigen identification into question and demonstrating the importance of broadening tumor-antigen discovery efforts.

  • Copy number variant and runs of homozygosity detection by microarrays enabled more precise molecular diagnoses in 11,020 clinical exome cases
    Genome Med. (IF 10.886) Pub Date : 2019-05-17
    Avinash V. Dharmadhikari; Rajarshi Ghosh; Bo Yuan; Pengfei Liu; Hongzheng Dai; Sami Al Masri; Jennifer Scull; Jennifer E. Posey; Allen H. Jiang; Weimin He; Francesco Vetrini; Alicia A. Braxton; Patricia Ward; Theodore Chiang; Chunjing Qu; Shen Gu; Chad A. Shaw; Janice L. Smith; Seema Lalani; Pawel Stankiewicz; Sau-Wai Cheung; Carlos A. Bacino; Ankita Patel; Amy M. Breman; Xia Wang; Linyan Meng; Rui Xiao; Fan Xia; Donna Muzny; Richard A. Gibbs; Arthur L. Beaudet; Christine M. Eng; James R. Lupski; Yaping Yang; Weimin Bi

    Exome sequencing (ES) has been successfully applied in clinical detection of single nucleotide variants (SNVs) and small indels. However, identification of copy number variants (CNVs) using ES data remains challenging. The purpose of this study is to understand the contribution of CNVs and copy neutral runs of homozygosity (ROH) in molecular diagnosis of patients referred for ES. In a cohort of 11,020 consecutive ES patients, an Illumina SNP array analysis interrogating mostly coding SNPs was performed as a quality control (QC) measurement and for CNV/ROH detection. Among these patients, clinical chromosomal microarray analysis (CMA) was performed at Baylor Genetics (BG) on 3229 patients, either before, concurrently with, or after ES. We retrospectively analyzed the findings from CMA and the QC array. The QC array can detect ~70% of pathogenic/likely pathogenic CNVs (PCNVs) detectable by CMA. Out of the 11,020 ES cases, the QC array identified PCNVs in 327 patients and uniparental disomy (UPD) disorder-related ROH in 10 patients. The overall PCNV/UPD detection rate was 5.9% in the 3229 ES patients who also had CMA at BG; PCNV/UPD detection rate was higher in concurrent ES and CMA than in ES with prior CMA (7.2% vs 4.6%). The PCNVs/UPD contributed to the molecular diagnoses in 17.4% (189/1089) of molecularly diagnosed ES cases with CMA and were estimated to contribute in 10.6% of all molecularly diagnosed ES cases. Dual diagnoses with both PCNVs and SNVs were detected in 38 patients. PCNVs affecting single recessive disorder genes in a compound heterozygous state with SNVs were detected in 4 patients, and homozygous deletions (mostly exonic deletions) were detected in 17 patients. A higher PCNV detection rate was observed for patients with syndromic phenotypes and/or cardiovascular abnormalities. Our clinical genomics study demonstrates that detection of PCNV/UPD through the QC array or CMA increases ES diagnostic rate, provides more precise molecular diagnosis for dominant as well as recessive traits, and enables more complete genetic diagnoses in patients with dual or multiple molecular diagnoses. Concurrent ES and CMA using an array with exonic coverage for disease genes enables the most effective detection of both CNVs and SNVs and therefore is recommended, especially in time-sensitive clinical situations.

  • Points-to-consider on the return of results in epigenetic research
    Genome Med. (IF 10.886) Pub Date : 2019-05-23
    Stephanie O. M. Dyke; Katie M. Saulnier; Charles Dupras; Amy P. Webster; Karen Maschke; Mark Rothstein; Reiner Siebert; Jörn Walter; Stephan Beck; Tomi Pastinen; Yann Joly

    As epigenetic studies become more common and lead to new insights into health and disease, the return of individual epigenetic results to research participants, in particular in large-scale epigenomic studies, will be of growing importance. Members of the International Human Epigenome Consortium (IHEC) Bioethics Workgroup considered the potential ethical, legal, and social issues (ELSI) involved in returning epigenetic research results and incidental findings in order to produce a set of ‘Points-to-consider’ (P-t-C) for the epigenetics research community. These P-t-C draw on existing guidance on the return of genetic research results, while also integrating the IHEC Bioethics Workgroup’s ELSI research on and discussion of the issues associated with epigenetic data as well as the experience of a return of results pilot study by the Personal Genome Project UK (PGP-UK). Major challenges include how to determine the clinical validity and actionability of epigenetic results, and considerations related to environmental exposures and epigenetic marks, including circumstances warranting the sharing of results with family members and third parties. Interdisciplinary collaboration and good public communication regarding epigenetic risk will be important to advance the return of results framework for epigenetic science.

  • Dissecting lung development and fibrosis at single-cell resolution
    Genome Med. (IF 10.886) Pub Date : 2019-05-24
    Donna L. Farber; Peter A. Sims

    Single-cell transcriptome profiling has enabled high-resolution analysis of cellular populations in tissues during development, health, and disease. Recent studies make innovative use of single-cell RNA sequencing (scRNAseq) to investigate mechanisms that allow immune cells to interact with tissue components in the lung during development and fibrotic lung disease.

  • Molecular and pharmacological modulators of the tumor immune contexture revealed by deconvolution of RNA-seq data
    Genome Med. (IF 10.886) Pub Date : 2019-05-24
    Francesca Finotello; Clemens Mayer; Christina Plattner; Gerhard Laschober; Dietmar Rieder; Hubert Hackl; Anne Krogsdam; Zuzana Loncova; Wilfried Posch; Doris Wilflingseder; Sieghart Sopper; Marieke Ijsselsteijn; Thomas P. Brouwer; Douglas Johnson; Yaomin Xu; Yu Wang; Melinda E. Sanders; Monica V. Estrada; Paula Ericsson-Gonzalez; Pornpimol Charoentong; Justin Balko; Noel Filipe da Cunha Carvalho de Miranda; Zlatko Trajanoski

    We introduce quanTIseq, a method to quantify the fractions of ten immune cell types from bulk RNA-sequencing data. quanTIseq was extensively validated in blood and tumor samples using simulated, flow cytometry, and immunohistochemistry data. quanTIseq analysis of 8000 tumor samples revealed that cytotoxic T cell infiltration is more strongly associated with the activation of the CXCR3/CXCL9 axis than with mutational load and that deconvolution-based cell scores have prognostic value in several solid cancers. Finally, we used quanTIseq to show how kinase inhibitors modulate the immune contexture and to reveal immune-cell types that underlie patients’ differential responses to checkpoint blockers. Availability: quanTIseq is available at http://icbi.at/quantiseq.

  • Clinical utility of custom-designed NGS panel testing in pediatric tumors
    Genome Med. (IF 10.886) Pub Date : 2019-05-28
    Lea F. Surrey; Suzanne P. MacFarland; Fengqi Chang; Kajia Cao; Komal S. Rathi; Gozde T. Akgumus; Daniel Gallo; Fumin Lin; Adam Gleason; Pichai Raman; Richard Aplenc; Rochelle Bagatell; Jane Minturn; Yael Mosse; Mariarita Santi; Sarah K. Tasian; Angela J. Waanders; Mahdi Sarmady; John M. Maris; Stephen P. Hunger; Marilyn M. Li

    Somatic genetic testing is rapidly becoming the standard of care in many adult and pediatric cancers. Previously, the standard approach was single-gene or focused multigene testing, but many centers have moved towards broad-based next-generation sequencing (NGS) panels. Here, we report the laboratory validation and clinical utility of a large cohort of clinical NGS somatic sequencing results in diagnosis, prognosis, and treatment of a wide range of pediatric cancers. Subjects were accrued retrospectively at a single pediatric quaternary-care hospital. Sequence analyses were performed on 367 pediatric cancer samples using custom-designed NGS panels over a 15-month period. Cases were profiled for mutations, copy number variations, and fusions identified through sequencing, and their clinical impact on diagnosis, prognosis, and therapy was assessed. NGS panel testing was incorporated meaningfully into clinical care in 88.7% of leukemia/lymphomas, 90.6% of central nervous system (CNS) tumors, and 62.6% of non-CNS solid tumors included in this cohort. A change in diagnosis as a result of testing occurred in 3.3% of cases. Additionally, 19.4% of all patients had variants requiring further evaluation for potential germline alteration. Use of somatic NGS panel testing resulted in a significant impact on clinical care, including diagnosis, prognosis, and treatment planning in 78.7% of pediatric patients tested in our institution. Somatic NGS tumor testing should be implemented as part of the routine diagnostic workup of newly diagnosed and relapsed pediatric cancer patients.

  • Somatic mutation and clonal expansions in human tissues
    Genome Med. (IF 10.886) Pub Date : 2019-05-28
    Inigo Martincorena

    Recent sequencing studies on healthy skin and esophagus have found that, as we age, these tissues become colonized by mutant clones of cells carrying driver mutations in traditional cancer genes. This comment summarizes these findings and discusses their possible implications for our understanding of cancer, ageing, and other diseases.

  • Prognostic value of B cells in cutaneous melanoma
    Genome Med. (IF 10.886) Pub Date : 2019-05-28
    Sara R. Selitsky; Lisle E. Mose; Christof C. Smith; Shengjie Chai; Katherine A. Hoadley; Dirk P. Dittmer; Stergios J. Moschos; Joel S. Parker; Benjamin G. Vincent

    Measures of the adaptive immune response have prognostic and predictive associations in melanoma and other cancer types. Specifically, intratumoral T cell density and function have considerable prognostic and predictive value in skin cutaneous melanoma (SKCM). Less is known about the significance of tumor-infiltrating B cells in SKCM. Our goal was to understand the prognostic and predictive value of B cell phenotypic subsets in SKCM using RNA sequencing. We used our previously published algorithm, V’DJer, to assemble B cell receptor (BCR) repertoires and estimate diversity from short-read RNA sequencing (RNA-seq). We applied machine learning-based cellular phenotype classifiers to measure relative similarity of bulk tumor sample gene expression profiles and different B cell phenotypes. We assessed these aspects of B cell biology in 473 SKCM from the Cancer Genome Atlas Project (TCGA) as well as in RNA-seq data corresponding to tumor samples procured from patients who received CTLA-4 and PD-1 inhibitors for metastatic SKCM. We found that the BCR repertoire was associated with different clinical factors, such as tumor tissue site and sex. However, increased clonality of the BCR repertoire was favorably prognostic in SKCM and was prognostic even after first conditioning on various clinical factors. Mutation burden was not correlated with any BCR measurement, and no specific mutation had an altered BCR repertoire. Lack of an assembled BCR in pre-treatment tumor tissues was associated with a lack of anti-tumor response to a CTLA-4 inhibitor in metastatic SKCM. These findings suggest an important prognostic and predictive role for B cell characteristics in SKCM. This has implications for melanoma immunobiology and potential development of immunogenomics features to predict survival and response to immunotherapy.

  • Associating somatic mutations to clinical outcomes: a pan-cancer study of survival time
    Genome Med. (IF 10.886) Pub Date : 2019-05-28
    Paul Little; Dan-Yu Lin; Wei Sun

    We developed subclone multiplicity allocation and somatic heterogeneity (SMASH), a new statistical method for intra-tumor heterogeneity (ITH) inference. SMASH is tailored to the purpose of large-scale association studies with one tumor sample per patient. In a pan-cancer study of 14 cancer types, we studied the associations between survival time and ITH quantified by SMASH, together with other features of somatic mutations. Our results show that ITH is associated with survival time in several cancer types and its effect can be modified by other covariates, such as mutation burden. SMASH is available at https://github.com/Sun-lab/SMASH .

  • Exome sequencing in routine diagnostics: a generic test for 254 patients with primary immunodeficiencies
    Genome Med. (IF 10.886) Pub Date : 2019-06-17
    Peer Arts; Annet Simons; Mofareh S. AlZahrani; Elanur Yilmaz; Eman AlIdrissi; Koen J. van Aerde; Njood Alenezi; Hamza A. AlGhamdi; Hadeel A. AlJubab; Abdulrahman A. Al-Hussaini; Fahad AlManjomi; Alaa B. Alsaad; Badr Alsaleem; Abdulrahman A. Andijani; Ali Asery; Walid Ballourah; Chantal P. Bleeker-Rovers; Marcel van Deuren; Michiel van der Flier; Erica H. Gerkes; Christian Gilissen; Murad K. Habazi; Jayne Y. Hehir-Kwa; Stefanie S. Henriet; Esther P. Hoppenreijs; Sarah Hortillosa; Chantal H. Kerkhofs; Riikka Keski-Filppula; Stefan H. Lelieveld; Khurram Lone; Marius A. MacKenzie; Arjen R. Mensenkamp; Jukka Moilanen; Marcel Nelen; Jaap ten Oever; Judith Potjewijd; Pieter van Paassen; Janneke H. M. Schuurs-Hoeijmakers; Anna Simon; Tomasz Stokowy; Maartje van de Vorst; Maaike Vreeburg; Anja Wagner; Gijs T. J. van Well; Dimitra Zafeiropoulou; Evelien Zonneveld-Huijssoon; Joris A. Veltman; Wendy A. G. van Zelst-Stams; Eissa A. Faqeih; Frank L. van de Veerdonk; Mihai G. Netea; Alexander Hoischen

    Diagnosis of primary immunodeficiencies (PIDs) is complex and cumbersome yet important for the clinical management of the disease. Exome sequencing may provide a genetic diagnosis in a significant number of patients in a single genetic test. In May 2013, we implemented exome sequencing in routine diagnostics for patients suffering from PIDs. This study reports the clinical utility and diagnostic yield for a heterogeneous group of 254 consecutively referred PID patients from 249 families. For the majority of patients, the clinical diagnosis was based on clinical criteria including rare and/or unusual severe bacterial, viral, or fungal infections, sometimes accompanied by autoimmune manifestations. Functional immune defects were interpreted in the context of aberrant immune cell populations, aberrant antibody levels, or combinations of these factors. For 62 patients (24%), exome sequencing identified pathogenic variants in well-established PID genes. An exome-wide analysis diagnosed 10 additional patients (4%), providing diagnoses for 72 patients (28%) from 68 families altogether. The genetic diagnosis directly indicated novel treatment options for 25 patients that received a diagnosis (34%). Exome sequencing as a first-tier test for PIDs granted a diagnosis for 28% of patients. Importantly, molecularly defined diagnoses indicated altered therapeutic options in 34% of cases. In addition, exome sequencing harbors advantages over gene panels as a truly generic test for all genetic diseases, including in silico extension of existing gene lists and re-analysis of existing data.

  • Mechanisms of immune-related adverse events associated with immune checkpoint blockade: using germline genetics to develop a personalized approach
    Genome Med. (IF 10.886) Pub Date : 2019-06-20
    Zia Khan; Christian Hammer; Ellie Guardino; G. Scott Chandler; Matthew L. Albert

    Personalized care of cancer patients undergoing treatment with immune checkpoint inhibitors will require approaches that can predict their susceptibility to immune-related adverse events. Understanding the role of germline genetic factors in determining individual responses to immunotherapy will deepen our understanding of immune toxicity and, importantly, it may lead to tools for identifying patients who are at risk.

  • Radiation therapy and anti-tumor immunity: exposing immunogenic mutations to the immune system
    Genome Med. (IF 10.886) Pub Date : 2019-06-20
    Claire Lhuillier; Nils-Petter Rudqvist; Olivier Elemento; Silvia C. Formenti; Sandra Demaria

    The expression of antigens that are recognized by self-reactive T cells is essential for immune-mediated tumor rejection by immune checkpoint blockade (ICB) therapy. Growing evidence suggests that mutation-associated neoantigens drive ICB responses in tumors with high mutational burden. In most patients, only a few of the mutations in the cancer exome that are predicted to be immunogenic are recognized by T cells. One factor that limits this recognition is the level of expression of the mutated gene product in cancer cells. Substantial preclinical data show that radiation can convert the irradiated tumor into a site for priming of tumor-specific T cells, that is, an in situ vaccine, and can induce responses in otherwise ICB-resistant tumors. Critical for radiation-elicited T-cell activation is the induction of viral mimicry, which is mediated by the accumulation of cytosolic DNA in the irradiated cells, with consequent activation of the cyclic GMP-AMP synthase (cGAS)/stimulator of interferon (IFN) genes (STING) pathway and downstream production of type I IFN and other pro-inflammatory cytokines. Recent data suggest that radiation can also enhance cancer cell antigenicity by upregulating the expression of a large number of genes that are involved in the response to DNA damage and cellular stress, thus potentially exposing immunogenic mutations to the immune system. Here, we discuss how the principles of antigen presentation favor the presentation of peptides that are derived from newly synthesized proteins in irradiated cells. These concepts support a model that incorporates the presence of immunogenic mutations in genes that are upregulated by radiation to predict which patients might benefit from treatment with combinations of radiotherapy and ICB.

  • Integrating informatics tools and portable sequencing technology for rapid detection of resistance to anti-tuberculous drugs
    Genome Med. (IF 10.886) Pub Date : 2019-06-24
    Jody E. Phelan; Denise M. O’Sullivan; Diana Machado; Jorge Ramos; Yaa E. A. Oppong; Susana Campino; Justin O’Grady; Ruth McNerney; Martin L. Hibberd; Miguel Viveiros; Jim F. Huggett; Taane G. Clark

    Mycobacterium tuberculosis resistance to anti-tuberculosis drugs is a major threat to global public health. Whole genome sequencing (WGS) is rapidly gaining traction as a diagnostic tool for clinical tuberculosis settings. To support this informatically, previous work led to the development of the widely used TBProfiler webtool, which predicts resistance to 14 drugs from WGS data. However, for accurate and rapid high throughput of samples in clinical or epidemiological settings, there is a need for a stand-alone tool and the ability to analyse data across multiple WGS platforms, including Oxford Nanopore MinION. We present a new command line version of the TBProfiler webserver, which includes hetero-resistance calling and will facilitate the batch processing of samples. The TBProfiler database has been expanded to incorporate 178 new markers across 16 anti-tuberculosis drugs. The predictive performance of the mutation library has been assessed using > 17,000 clinical isolates with WGS and laboratory-based drug susceptibility testing (DST) data. An integrated MinION analysis pipeline was assessed by performing WGS on 34 replicates across 3 multi-drug resistant isolates with known resistance mutations. TBProfiler accuracy varied by individual drug. Assuming DST as the gold standard, sensitivities for detecting multi-drug-resistant TB (MDR-TB) and extensively drug-resistant TB (XDR-TB) were 94% (95%CI 93–95%) and 83% (95%CI 79–87%) with specificities of 98% (95%CI 98–99%) and 96% (95%CI 95–97%) respectively. Using MinION data, only one resistance mutation was missed by TBProfiler, involving an insertion in the tlyA gene coding for capreomycin resistance. When compared to alternative platforms (e.g. Mykrobe predictor TB, the CRyPTIC library), TBProfiler demonstrated superior predictive performance across first- and second-line drugs. 
    The new version of TBProfiler can rapidly and accurately predict anti-TB drug resistance profiles across large numbers of samples with WGS data. The computing architecture allows for the ability to modify the core bioinformatic pipelines and outputs, including the analysis of WGS data sourced from portable technologies. TBProfiler has the potential to be integrated into the point of care and WGS diagnostic environments, including in resource-poor settings.
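
    The sensitivity and specificity figures quoted in the abstract above are proportions from a 2x2 comparison against drug susceptibility testing, each with a 95% confidence interval. The following is a minimal sketch of that arithmetic. The counts are hypothetical (the abstract does not give them), and the Wilson score interval is an assumption chosen for illustration; the paper may have used a different interval method.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z = 1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts, NOT taken from the paper: the reference test (DST)
# defines which isolates are truly resistant or susceptible.
tp, fn, tn, fp = 940, 60, 980, 20

sens, spec = sensitivity_specificity(tp, fn, tn, fp)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)

print(f"sensitivity {sens:.1%} (95% CI {sens_ci[0]:.1%} to {sens_ci[1]:.1%})")
print(f"specificity {spec:.1%} (95% CI {spec_ci[0]:.1%} to {spec_ci[1]:.1%})")
```

    The interval narrows as the number of reference-positive (or reference-negative) isolates grows, which is why estimates based on large isolate collections can carry tight intervals.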

  • Evolving neoantigen profiles in colorectal cancers with DNA repair defects
    Genome Med. (IF 10.886) Pub Date : 2019-06-28
    Giuseppe Rospo; Annalisa Lorenzato; Nabil Amirouchene-Angelozzi; Alessandro Magrì; Carlotta Cancelliere; Giorgio Corti; Carola Negrino; Vito Amodio; Monica Montone; Alice Bartolini; Ludovic Barault; Luca Novara; Claudio Isella; Enzo Medico; Andrea Bertotti; Livio Trusolino; Giovanni Germano; Federica Di Nicolantonio; Alberto Bardelli

    Neoantigens that arise as a consequence of tumor-specific mutations can be recognized by T lymphocytes leading to effective immune surveillance. In colorectal cancer (CRC) and other tumor types, a high number of neoantigens is associated with patient response to immune therapies. The molecular processes governing the generation of neoantigens and their turnover in cancer cells are poorly understood. We exploited CRC as a model system to understand how alterations in DNA repair pathways modulate neoantigen profiles over time. We performed whole exome sequencing (WES) and RNA sequencing (RNAseq) in CRC cell lines, in vitro and in vivo, and in CRC patient-derived xenografts (PDXs) to track longitudinally genomic profiles, clonal evolution, mutational signatures, and predicted neoantigens. The majority of CRC models showed remarkably stable mutational and neoantigen profiles; however, those carrying defects in DNA repair genes continuously diversified. Rapidly evolving and evolutionary stable CRCs displayed characteristic genomic signatures and transcriptional profiles. Downregulation of molecules implicated in antigen presentation occurred selectively in highly mutated and rapidly evolving CRC. These results indicate that CRCs carrying alterations in DNA repair pathways display dynamic neoantigen patterns that fluctuate over time. We define CRC subsets characterized by slow and fast evolvability and link this phenotype to downregulation of antigen-presenting cellular mechanisms. Longitudinal monitoring of the neoantigen landscape could be relevant in the context of precision medicine.

  • The good, the bad, and the ugly: hyperprogression in cancer patients following immune checkpoint therapy
    Genome Med. (IF 10.886) Pub Date : 2019-07-24
    Erich Sabio; Timothy A. Chan

    Immune checkpoint blockade therapy can elicit robust and durable responses in a variety of cancer types. While many patients do not respond, recent reports highlight a distinct group of patients whose tumors undergo rapid growth, leading to progressive disease and poor outcome. In this perspective, we synthesize and summarize some important issues surrounding hyperprogression, defining characteristics, prognostic implications, and controversies.

  • Identification of intermediate-sized deletions and inference of their impact on gene expression in a human population
    Genome Med. (IF 10.886) Pub Date : 2019-07-24
    Jing Hao Wong; Daichi Shigemizu; Yukiko Yoshii; Shintaro Akiyama; Azusa Tanaka; Hidewaki Nakagawa; Shu Narumiya; Akihiro Fujimoto

    Next-generation sequencing has allowed for the identification of different genetic variations, which are known to contribute to diseases. Of these, insertions and deletions are the second most abundant type of variations in the genome, but their biological importance or disease association is not well-studied, especially for deletions of intermediate sizes. We identified intermediate-sized deletions from whole-genome sequencing (WGS) data of Japanese samples (n = 174) with a novel deletion calling method which considered multiple samples. These deletions were used to construct a reference panel for use in imputation. Imputation was then conducted using the reference panel and data from 82 publicly available Japanese samples with gene expression data. The accuracy of the deletion calling and imputation was examined with Nanopore long-read sequencing technology. We also conducted an expression quantitative trait loci (eQTL) association analysis using the deletions to infer their functional impacts on genes, before characterizing the deletions causal for gene expression level changes. We obtained a set of 4378 polymorphic high-confidence deletions and constructed a reference panel. The deletions were successfully imputed into the Japanese samples with high accuracy (97.3%). The eQTL analysis identified 181 deletions (4.1%) suggested as causal for gene expression level changes. The causal deletion candidates were significantly enriched in promoters, super-enhancers, and transcription elongation chromatin states. Generation of deletions in a cell line with the CRISPR-Cas9 system confirmed that they were indeed causative variants for gene expression change. Furthermore, one of the deletions was observed to affect the gene expression levels of a gene it was not located in. This paper reports an accurate deletion calling method for genotype imputation at the whole genome level and shows the importance of intermediate-sized deletions in the human population.

  • Deciphering drug resistance in Mycobacterium tuberculosis using whole-genome sequencing: progress, promise, and challenges
    Genome Med. (IF 10.886) Pub Date : 2019-07-25
    Keira A. Cohen; Abigail L. Manson; Christopher A. Desjardins; Thomas Abeel; Ashlee M. Earl

    Tuberculosis (TB) is a global infectious threat that is intensified by an increasing incidence of highly drug-resistant disease. Whole-genome sequencing (WGS) studies of Mycobacterium tuberculosis, the causative agent of TB, have greatly increased our understanding of this pathogen. Since the first M. tuberculosis genome was published in 1998, WGS has provided a more complete account of the genomic features that cause resistance in populations of M. tuberculosis, has helped to fill gaps in our knowledge of how both classical and new antitubercular drugs work, and has identified specific mutations that allow M. tuberculosis to escape the effects of these drugs. WGS studies have also revealed how resistance evolves both within an individual patient and within patient populations, including the important roles of de novo acquisition of resistance and clonal spread. These findings have informed decisions about which drug-resistance mutations should be included on extended diagnostic panels. From its origins as a basic science technique, WGS of M. tuberculosis is becoming part of the modern clinical microbiology laboratory, promising rapid and improved detection of drug resistance, and detailed and real-time epidemiology of TB outbreaks. We review the successes and highlight the challenges that remain in applying WGS to improve the control of drug-resistant TB through monitoring its evolution and spread, and to inform more rapid and effective diagnostic and therapeutic strategies.

  • Implementation of a genomic medicine multi-disciplinary team approach for rare disease in the clinical setting: a prospective exome sequencing case series
    Genome Med. (IF 10.886) Pub Date : 2019-07-25
    John Taylor; Jude Craft; Edward Blair; Sarah Wordsworth; David Beeson; Saleel Chandratre; Judith Cossins; Tracy Lester; Andrea H. Németh; Elizabeth Ormondroyd; Smita Y. Patel; Alistair T. Pagnamenta; Jenny C. Taylor; Kate L. Thomson; Hugh Watkins; Andrew O. M. Wilkie; Julian C. Knight

    A multi-disciplinary approach to promote engagement, inform decision-making and support clinicians and patients is increasingly advocated to realise the potential of genome-scale sequencing in the clinic for patient benefit. Here we describe the results of establishing a genomic medicine multi-disciplinary team (GM-MDT) for case selection, processing, interpretation and return of results. We report a consecutive case series of 132 patients (involving 10 medical specialties with 43.2% cases having a neurological disorder) undergoing exome sequencing over a 10-month period following the establishment of the GM-MDT in a UK NHS tertiary referral hospital. The costs of running the MDT are also reported. In total 76 cases underwent exome sequencing following triage by the GM-MDT with a clinically reportable molecular diagnosis in 24 (31.6%). GM-MDT composition, operation and rationale for whether to proceed to sequencing are described, together with the health economics (cost per case for the GM-MDT was £399.61), the utility and informativeness of exome sequencing for molecular diagnosis in a range of traits, the impact of choice of sequencing strategy on molecular diagnostic rates and challenge of defining pathogenic variants. In 5 cases (6.6%), an alternative clinical diagnosis was indicated by sequencing results. Examples were also found where findings from initial genetic testing were reconsidered in the light of exome sequencing including TP63 and PRKAG2 (detection of a partial exon deletion and a mosaic missense pathogenic variant respectively); together with tissue-specific mosaicism involving a cytogenetic abnormality following a normal prenatal array comparative genomic hybridization. This consecutive case series describes the results and experience of a multidisciplinary team format that was found to promote engagement across specialties and facilitate return of results to the responsible clinicians.

  • A clinical survey of mosaic single nucleotide variants in disease-causing genes detected by exome sequencing
    Genome Med. (IF 10.886) Pub Date : 2019-07-26
    Ye Cao; Mari J. Tokita; Edward S. Chen; Rajarshi Ghosh; Tiansheng Chen; Yanming Feng; Elizabeth Gorman; Federica Gibellini; Patricia A. Ward; Alicia Braxton; Xia Wang; Linyan Meng; Rui Xiao; Weimin Bi; Fan Xia; Christine M. Eng; Yaping Yang; Tomasz Gambin; Chad Shaw; Pengfei Liu; Pawel Stankiewicz

    Although mosaic variation has been known to cause disease for decades, high-throughput sequencing technologies with the analytical sensitivity to consistently detect variants at reduced allelic fractions have only recently emerged as routine clinical diagnostic tests. To date, few systematic analyses of mosaic variants detected by diagnostic exome sequencing for diverse clinical indications have been performed. To investigate the frequency, type, allelic fraction, and phenotypic consequences of clinically relevant somatic mosaic single nucleotide variants (SNVs) and characteristics of the corresponding genes, we retrospectively queried reported mosaic variants from a cohort of ~ 12,000 samples submitted for clinical exome sequencing (ES) at Baylor Genetics. We found 120 mosaic variants involving 107 genes, including 80 mosaic SNVs in proband samples and 40 in parental/grandparental samples. Average mosaic alternate allele fraction (AAF) detected in autosomes and in X-linked disease genes in females was 18.2% compared with 34.8% in X-linked disease genes in males. Of these mosaic variants, 74 variants (61.7%) were classified as pathogenic or likely pathogenic and 46 (38.3%) as variants of uncertain significance. Mosaic variants occurred in disease genes associated with autosomal dominant (AD) or AD/autosomal recessive (AR) (67/120, 55.8%), X-linked (33/120, 27.5%), AD/somatic (10/120, 8.3%), and AR (8/120, 6.7%) inheritance. Of note, 1.7% (2/120) of variants were found in genes in which only somatic events have been described. Nine genes had recurrent mosaic events in unrelated individuals which accounted for 18.3% (22/120) of all detected mosaic variants in this study. The proband group was enriched for mosaicism affecting Ras signaling pathway genes. In sum, an estimated 1.5% of all molecular diagnoses made in this cohort could be attributed to a mosaic variant detected in the proband, while parental mosaicism was identified in 0.3% of families analyzed. 
    As ES design favors breadth over depth of coverage, this estimate of the prevalence of mosaic variants likely represents an underestimate of the total number of clinically relevant mosaic variants in our cohort.
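
    The abstract above reports several breakdowns of the same 120 mosaic variants. As a quick consistency check, the sketch below recomputes each percentage from the counts stated in the abstract and confirms that the categories add up to the total. The helper name `percent` and the dictionary layout are illustrative choices, not part of the study.

```python
TOTAL = 120  # mosaic variants reported in the abstract


def percent(count, total, digits=1):
    """Percentage of total, rounded to the given number of decimal places."""
    return round(100 * count / total, digits)


# (count, percentage as stated in the abstract)
reported = {
    "pathogenic or likely pathogenic": (74, 61.7),
    "uncertain significance": (46, 38.3),
    "AD or AD/AR genes": (67, 55.8),
    "X-linked genes": (33, 27.5),
    "AD/somatic genes": (10, 8.3),
    "AR genes": (8, 6.7),
    "somatic-only genes": (2, 1.7),
    "recurrent in nine genes": (22, 18.3),
}

mismatches = [
    label
    for label, (count, stated) in reported.items()
    if abs(percent(count, TOTAL) - stated) > 1e-9
]

# Each partition of the 120 variants should sum back to the total.
classification_sum = 74 + 46          # pathogenicity classes
inheritance_sum = 67 + 33 + 10 + 8 + 2  # inheritance categories
sample_sum = 80 + 40                  # proband vs parental/grandparental

print("mismatched percentages:", mismatches)
print("sums:", classification_sum, inheritance_sum, sample_sum)
```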

  • Hidden Markov models lead to higher resolution maps of mutation signature activity in cancer
    Genome Med. (IF 10.886) Pub Date : 2019-07-26
    Damian Wojtowicz; Itay Sason; Xiaoqing Huang; Yoo-Ah Kim; Mark D. M. Leiserson; Teresa M. Przytycka; Roded Sharan

    Knowing the activity of the mutational processes shaping a cancer genome may provide insight into tumorigenesis and personalized therapy. It is thus important to characterize the signatures of active mutational processes in patients from their patterns of single base substitutions. However, mutational processes do not act uniformly on the genome, leading to statistical dependencies among neighboring mutations. To account for such dependencies, we develop the first sequence-dependent model, SigMa, for mutation signatures. We apply SigMa to characterize genomic and other factors that influence the activity of mutation signatures in breast cancer. We show that SigMa outperforms previous approaches, revealing novel insights on signature etiology. The source code for SigMa is publicly available at https://github.com/lrgr/sigma .

  • Correction to: Molecular and pharmacological modulators of the tumor immune contexture revealed by deconvolution of RNA-seq data
    Genome Med. (IF 10.886) Pub Date : 2019-07-29
    Francesca Finotello; Clemens Mayer; Christina Plattner; Gerhard Laschober; Dietmar Rieder; Hubert Hackl; Anne Krogsdam; Zuzana Loncova; Wilfried Posch; Doris Wilflingseder; Sieghart Sopper; Marieke Ijsselsteijn; Thomas P. Brouwer; Douglas Johnson; Yaomin Xu; Yu Wang; Melinda E. Sanders; Monica V. Estrada; Paula Ericsson-Gonzalez; Pornpimol Charoentong; Justin Balko; Noel Filipe da Cunha Carvalho de Miranda; Zlatko Trajanoski

    It was highlighted that the original article [1] contained a typesetting mistake in the name of Noel Filipe da Cunha Carvalho de Miranda. This was incorrectly captured as Noel Filipe da Cunha Carvahlo de Miranda. It was also highlighted that in Fig. 3C the left panels Y-axis were cropped and in Fig. 5C, CD8 bar was cropped. This Correction article shows the correct Figs. 3 and 5. The original article has been updated.

  • A validated single-cell-based strategy to identify diagnostic and therapeutic targets in complex diseases
    Genome Med. (IF 10.886) Pub Date : 2019-07-30
    Danuta R. Gawel; Jordi Serra-Musach; Sandra Lilja; Jesper Aagesen; Alex Arenas; Bengt Asking; Malin Bengnér; Janne Björkander; Sophie Biggs; Jan Ernerudh; Henrik Hjortswang; Jan-Erik Karlsson; Mattias Köpsen; Eun Jung Lee; Antonio Lentini; Xinxiu Li; Mattias Magnusson; David Martínez-Enguita; Andreas Matussek; Colm E. Nestor; Samuel Schäfer; Oliver Seifert; Ceylan Sonmez; Henrik Stjernman; Andreas Tjärnberg; Simon Wu; Karin Åkesson; Alex K. Shalek; Margaretha Stenmarker; Huan Zhang; Mika Gustafsson; Mikael Benson

    Genomic medicine has paved the way for identifying biomarkers and therapeutically actionable targets for complex diseases, but is complicated by the involvement of thousands of variably expressed genes across multiple cell types. Single-cell RNA sequencing (scRNA-seq) allows the characterization of such complex changes in whole organs. The study is based on applying network tools to organize and analyze scRNA-seq data from a mouse model of arthritis and human rheumatoid arthritis, in order to find diagnostic biomarkers and therapeutic targets. Diagnostic validation studies were performed using expression profiling data and potential protein biomarkers from prospective clinical studies of 13 diseases. A candidate drug was examined by a treatment study of a mouse model of arthritis, using phenotypic, immunohistochemical, and cellular analyses as read-outs. We performed the first systematic analysis of pathways, potential biomarkers, and drug targets in scRNA-seq data from a complex disease, starting with inflamed joints and lymph nodes from a mouse model of arthritis. We found the involvement of hundreds of pathways, biomarkers, and drug targets that differed greatly between cell types. Analyses of scRNA-seq and GWAS data from human rheumatoid arthritis (RA) supported a similar dispersion of pathogenic mechanisms in different cell types. Thus, systems-level approaches to prioritize biomarkers and drugs are needed. Here, we present a prioritization strategy that is based on constructing network models of disease-associated cell types and interactions using scRNA-seq data from our mouse model of arthritis, as well as human RA, which we term multicellular disease models (MCDMs). We find that the network centrality of MCDM cell types correlates with the enrichment of genes harboring genetic variants associated with RA and thus could potentially be used to prioritize cell types and genes for diagnostics and therapeutics.
    We validated this hypothesis in a large-scale study of patients with 13 different autoimmune, allergic, infectious, malignant, endocrine, metabolic, and cardiovascular diseases, as well as a therapeutic study of the mouse arthritis model. Overall, our results support that our strategy has the potential to help prioritize diagnostic and therapeutic targets in human disease.

  • Novel risk genes and mechanisms implicated by exome sequencing of 2572 individuals with pulmonary arterial hypertension
    Genome Med. (IF 10.886) Pub Date : 2019-11-14
    Na Zhu; Michael W. Pauciulo; Carrie L. Welch; Katie A. Lutz; Anna W. Coleman; Claudia Gonzaga-Jauregui; Jiayao Wang; Joseph M. Grimes; Lisa J. Martin; Hua He; Yufeng Shen; Wendy K. Chung; William C. Nichols

    Group 1 pulmonary arterial hypertension (PAH) is a rare disease with high mortality despite recent therapeutic advances. Pathogenic remodeling of pulmonary arterioles leads to increased pulmonary pressures, right ventricular hypertrophy, and heart failure. Mutations in bone morphogenetic protein receptor type 2 and other risk genes predispose to disease, but the vast majority of non-familial cases remain genetically undefined. To identify new risk genes, we performed exome sequencing in a large cohort from the National Biological Sample and Data Repository for PAH (PAH Biobank, n = 2572). We then carried out rare deleterious variant identification followed by case-control gene-based association analyses. To control for population structure, only unrelated European cases (n = 1832) and controls (n = 12,771) were used in association tests. Empirical p values were determined by permutation analyses, and the threshold for significance defined by Bonferroni’s correction for multiple testing. Tissue kallikrein 1 (KLK1) and gamma glutamyl carboxylase (GGCX) were identified as new candidate risk genes for idiopathic PAH (IPAH) with genome-wide significance. We note that variant carriers had later mean age of onset and relatively moderate disease phenotypes compared to bone morphogenetic receptor type 2 variant carriers. We also confirmed the genome-wide association of recently reported growth differentiation factor (GDF2) with IPAH and further implicate T-box 4 (TBX4) with child-onset PAH. We report robust association of novel genes KLK1 and GGCX with IPAH, accounting for ~ 0.4% and 0.9% of PAH Biobank cases, respectively. Both genes play important roles in vascular hemodynamics and inflammation but have not been implicated in PAH previously. These data suggest new genes, pathogenic mechanisms, and therapeutic targets for this lethal vasculopathy.
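
    The abstract above mentions empirical p values obtained by permutation and a Bonferroni-corrected significance threshold. The sketch below illustrates both ideas on a single hypothetical gene. The carrier counts, the test statistic (difference in carrier rate between cases and controls), the number of permutations, and the figure of 20,000 tested genes are all assumptions for illustration and are not taken from the paper.

```python
import random


def carrier_rate_difference(carrier, is_case):
    """Carrier rate in cases minus carrier rate in controls."""
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    case_carriers = sum(c for c, k in zip(carrier, is_case) if k)
    ctrl_carriers = sum(c for c, k in zip(carrier, is_case) if not k)
    return case_carriers / n_case - ctrl_carriers / n_ctrl


def empirical_p_value(carrier, is_case, n_perm=1000, seed=0):
    """One-sided permutation p value: shuffle case/control labels and count
    how often the permuted statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = carrier_rate_difference(carrier, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if carrier_rate_difference(carrier, labels) >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself, so p is never zero.
    return (hits + 1) / (n_perm + 1)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Hypothetical gene: 30 carriers among 200 cases, 10 carriers among 800 controls.
carrier = [1] * 30 + [0] * 170 + [1] * 10 + [0] * 790
is_case = [True] * 200 + [False] * 800

p_value = empirical_p_value(carrier, is_case)
threshold = bonferroni_threshold(0.05, 20000)  # assumed number of genes tested

print(f"empirical p = {p_value:.4g}, Bonferroni threshold = {threshold:.2g}")
```

    Note that the smallest p value a permutation test can report is 1 / (n_perm + 1), so reaching an exome-wide threshold of this size would need far more permutations than the 1000 used in this sketch.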

Contents have been reproduced by permission of the publishers.