Keeping up with the genomes: scaling genomic variant interpretation.
Genome Medicine ( IF 10.4 ) Pub Date : 2019-12-31 , DOI: 10.1186/s13073-019-0700-4
Heidi L Rehm 1, 2, 3 , Douglas M Fowler 4, 5, 6

In the past 10 years, we have seen major advances in our ability to read human genomic DNA and detect variation. The variants we find have the potential to improve the diagnosis and treatment of human disease and also to define our unique traits. Although slower to catch up, we are now seeing equally rapid advances in the strategies used to interpret these variants in both coding and non-coding regions. Setting up a robust infrastructure, in terms of sequencing technology, pipelines for detection of all clinically significant variation, and analysis tools that incorporate the most effective approaches to variant interpretation, will be critical in delivering widespread and meaningful advances in patient care and in ensuring the accurate and informative application of genomic technology to healthcare.

Many platforms have been developed to detect different types of DNA variants in the germline and in the context of somatic cancer and mosaicism. For example, short read, next generation sequencing is routinely employed to detect short sequence variants, whereas Sanger sequencing is still used to confirm many variants. Karyotyping and chromosomal microarrays are platforms that are commonly used to detect structural variants. In addition, a myriad of other platforms and assays are used to detect partial gene deletions and duplications, common translocations, repeat expansions, and gene amplifications and to discern variation in homologous regions. Yet, maintaining these many platforms to detect the multitude of human variation is complex, costly, and difficult for laboratories, clinicians, and patients to navigate.

In this special issue of Genome Medicine, Lindstrand and colleagues [1] demonstrate the ability of whole genome sequencing to consolidate many of these platforms into a single approach for detecting a wide range of human variation types. The next step will be to democratize the computational tools needed to identify and annotate the different types of variation accurately, so that every laboratory that can generate a whole human genome sequence will be capable of highly sensitive and specific detection of all types of human genomic variation that have clinical consequences.

Although the detection of human genetic variation is a necessary first step, many resources are needed to support the accurate interpretation of the identified variation. The human population is genetically diverse, both in the spectrum of benign variation and in variation implicated in disease. In this issue, Abul-Husn and colleagues [2] report an increased rate of variants of uncertain significance in non-European populations compared to European ones, particularly in populations with a higher proportion of African ancestry. This burden of variants of uncertain significance results from a lack of recruitment from underrepresented populations, which has created a paucity of knowledge of disease causality in these populations. Diverse cohorts of affected individuals in disease studies are therefore needed to build knowledge of genetic disease etiologies across all populations and to ensure equitable benefit to all individuals from genomic medicine. The findings reported by Abul-Husn and colleagues [2] also highlight how large and diverse catalogs of human genetic variation across geographical populations are critical for ruling out the possibility that variants that are rare in one population but commonly observed in another are disease causing.
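The frequency argument above can be illustrated with a minimal sketch: a variant that looks rare in one population catalog but is common in another is unlikely to cause a rare, highly penetrant disease. The population labels, frequencies, and function names below are hypothetical, and the 5% cutoff is an assumption chosen for illustration rather than a recommendation from this editorial.

```python
# Illustrative sketch only: labels, frequencies, and the cutoff are assumptions.
COMMON_AF_CUTOFF = 0.05  # assumed "too common to be causal" threshold


def max_population_af(af_by_population):
    """Return the (population, allele frequency) pair with the highest frequency."""
    return max(af_by_population.items(), key=lambda item: item[1])


def too_common_to_be_causal(af_by_population, cutoff=COMMON_AF_CUTOFF):
    """True if the variant exceeds the cutoff in at least one population."""
    _, highest_af = max_population_af(af_by_population)
    return highest_af > cutoff


# Hypothetical variant: rare in population_A, common in population_B.
variant_af = {"population_A": 0.0002, "population_B": 0.08}

# A catalog limited to population_A would miss the evidence from population_B.
flag_single_catalog = too_common_to_be_causal({"population_A": 0.0002})
flag_diverse_catalog = too_common_to_be_causal(variant_af)
```

Taking the maximum across populations, rather than a pooled average, is the design choice that lets a single well-sampled population supply the benign evidence.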

Also critical for variant interpretation are rigorous approaches for assessing the diversity of functional assays that are used to discern which variants disrupt the function of a gene product and which do not. This task is difficult because most gene products have a plethora of functions, sometimes in diverse cell types or even in an organismal context. In this special issue, Brnich and colleagues [3] propose a rigorous strategy to ensure that functional assays are well-validated before the data they generate are applied to routine clinical interpretation of variants. These recommendations have been developed for the evaluation and application of functional evidence within the ACMG/AMP variant interpretation framework [4], and are a key step forward in reducing discordance in the application of evidence codes.

Furthermore, once a functional assay has been validated, it can be multiplexed to enable comprehensive assessment of the effects of one or more classes of variation, thereby enabling streamlined and accurate genetic interpretation. Multiplexed functional assays are particularly useful for assessing classes of variation that are difficult to interpret, such as missense and splice site variation. Although promising, multiplexed functional assays present a set of unique challenges for both the researchers that develop them and the clinicians using the functional data they produce. Thus, Gelman and colleagues [5] make recommendations for how the developers of multiplexed functional assays should evaluate assay performance and report assay results. They also provide guidance to clinicians on how the quality and clinical utility of large-scale functional datasets can be evaluated, and on how these data can be incorporated into routine variant interpretation.

Traditional approaches to identifying the genetic causes of rare disease, such as aggregating cases with extremely rare, highly penetrant phenotypes that share a disrupted candidate gene, continue to yield novel gene discoveries. Nevertheless, other human diseases, including autism and congenital heart disease, have been harder to tackle because they are defined by nonspecific phenotypes or because they arise from variants at multiple loci. However, with the ability to sequence both disease and control cohorts at scale, including trios that enable the detection of de novo variation, statistical frameworks are now able to highlight candidate disease loci with increasing precision. Lal and colleagues [6] combine de novo burden analysis with grouping of paralogous genes to identify 28 strong candidate genes for neurodevelopmental disorders. Notably, these candidates are expressed in the brain and exhibit evolutionary constraint. Another challenge is the interpretation of balanced structural variation, where possible drivers of pathogenicity are difficult to identify. Using a combination of experimental and computational approaches that examine both direct disruption and indirect, chromatin-mediated effects, Middelkamp and colleagues [7] prioritized causal genes for previously uninterpretable de novo structural variants that were identified in the context of congenital abnormality or intellectual disability. In summary, through the large-scale aggregation of well-phenotyped individuals with diseases, data sharing programs, and the application of innovative methods of analysis, we will eventually build a comprehensive understanding of the genes and genomic regions that contribute to human disease.
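To make the idea of a de novo burden test concrete, the following is a generic sketch and not the method of the study cited above. It assumes a simple Poisson model in which the expected number of de novo variants in a gene is twice the number of trios times a per-gene mutation rate, and it pools counts across a gene family by summing. All numbers, gene names, and function names are hypothetical.

```python
import math


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cdf = term
    for k in range(1, observed):
        term *= expected / k  # P(X = k) from P(X = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def expected_de_novo(n_trios, per_gene_rate):
    """Expected de novo count: two inherited chromosomes per child."""
    return 2 * n_trios * per_gene_rate


def family_burden(observed_by_gene, rate_by_gene, n_trios):
    """Pool observed and expected counts over all genes in one family."""
    observed = sum(observed_by_gene.values())
    expected = sum(expected_de_novo(n_trios, r) for r in rate_by_gene.values())
    return observed, expected, poisson_upper_tail(observed, expected)


# Hypothetical cohort and a hypothetical two-gene paralog family.
n_trios = 5000
rates = {"GENE_A": 1e-5, "GENE_B": 1e-5}
counts = {"GENE_A": 2, "GENE_B": 2}

single_gene_p = poisson_upper_tail(counts["GENE_A"], expected_de_novo(n_trios, rates["GENE_A"]))
family_obs, family_exp, family_p = family_burden(counts, rates, n_trios)
```

In this toy example neither gene alone carries many observations, but pooling the family raises the observed count faster than the expectation, which is the intuition behind grouping related genes.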

The interpretation of rare disease genetic variation has been hugely aided by systematic guidance [4] and by the routine sharing of variant interpretations in ClinVar. More recently, guidelines have been released to provide initial guidance for the interpretation of somatic variants, taking into account the added complexity of multiple dimensions of clinical relevance, including diagnosis, prognosis, and drug responsiveness [8]. These guidelines have better enabled the cancer community to standardize cancer variant assessment and to build shared community resources. These improvements are critical because they can empower the rapidly growing application of genetic testing in cancers, the results of which are critical to accurate prognosis and treatment guidance. In this issue, Lever and colleagues [9] demonstrate a text-mining approach to gather data from the literature on thousands of biomarkers and to deposit the information in a publicly accessible database called CIViCmine. He and colleagues [10] apply computational approaches to consume pre-annotated files and to apply criteria for clinical assessment. Both approaches enable the prioritization of variants identified in tumors for further review. Furthermore, Danos and colleagues [11] describe improvements to CIViC, which is an open platform for community curation of somatic variation. These improvements, which include common data models and standard operating procedures, are designed to support consistent and accurate interpretation of variants in cancer.

As genomic medicine success stories continue to appear, we will confront an ever-growing number of genomes to analyze and genetic variants to interpret. Both tasks are difficult because of the complexity of the human genome and its diversity of variants, as well as the challenge of amassing sufficient data to interpret variants. This special issue describes some of the advances in variant detection, scaling of experiments, improvements in computational approaches, and construction of community resources that are helping to confront these challenges. Although this progress is promising, more work is needed. For example, we must develop an inexpensive, widely deployed pipeline for assembling whole genome sequences and detecting variants. We must apply such a pipeline to diverse human populations, at scale, in order to understand the true extent of common genetic variation. We must deploy multiplexed functional assays to quantify the effect of variation at many, if not most, disease-associated loci. Finally, we must unite these resources by adopting a coherent set of standards and a rigorous culture of data sharing. If successful, we will enable all individuals to benefit from the routine application of genomics to both disease diagnosis and genome-enabled disease prevention.

  1. Lindstrand A, Eisfeldt J, Pettersson M, Carvalho CMB, Kvarnung M, Grigelioniene G, et al. From cytogenetics to cytogenomics: whole-genome sequencing as a first-line test comprehensively captures the diverse spectrum of disease-causing genetic variation underlying intellectual disability. Genome Med. 2019;11:68.

  2. Abul-Husn NS, Soper ER, Odgis JA, Cullina S, Bobo D, Moscati A, et al. Exome sequencing reveals a high prevalence of BRCA1 and BRCA2 founder variants in a diverse population-based biobank. Genome Med. 2019. https://doi.org/10.1186/s13073-019-0691-1

  3. Brnich SA, Abou Tayoun AN, Couch FJ, Cutting G, Greenblatt MS, Heinen CD, et al. Recommendations for application of the functional evidence PS3/BS3 criterion using the ACMG/AMP sequence variant interpretation framework. Genome Med. 2019. https://doi.org/10.1186/s13073-019-0690-2

  4. Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17:405–24.

  5. Gelman et al. Recommendations for the collection and use of multiplexed functional data for clinical variant interpretation. Genome Med. 2019. https://doi.org/10.1186/s13073-019-0698-7

  6. Lal D, May P, Samocha KE, Kosmicki JA, Robinson EB, Møller RS, et al. Gene family information facilitates variant interpretation and identification of disease-associated genes. bioRxiv 159780; https://doi.org/10.1101/159780

  7. Middelkamp S, Vlaar JM, Giltay J, Korzelius J, Besselink N, Boymans S, et al. Prioritization of genes driving congenital phenotypes of patients with de novo structural variants. Genome Med. 2019. https://doi.org/10.1186/s13073-019-0692-0

  8. Li MM, Datto M, Duncavage EJ, Kulkarni S, Lindeman NI, Roy S, et al. Standards and guidelines for the interpretation and reporting of sequence variants in cancer: a joint consensus recommendation of the Association for Molecular Pathology, American Society of Clinical Oncology, and College of American Pathologists. J Mol Diagn. 2017;19:4–23.

  9. Lever J, Jones MR, Danos AM, Krysiak K, Bonakdar M, Grewal J, et al. Text-mining clinically relevant cancer biomarkers for curation into the CIViC database. Genome Med. 2019. https://doi.org/10.1186/s13073-019-0686-y

  10. He MM, Li Q, Yan M, Cao H, Hu Y, He KY, et al. Variant interpretation for Cancer (VIC): a computational tool for assessing clinical impacts of somatic variants. Genome Med. 2019;11:53.

  11. Danos et al. Standard operating procedure for curation and clinical interpretation of variants in cancer. Genome Med. 2019;11:76. https://doi.org/10.1186/s13073-019-0687-x

We thank all of the authors who submitted manuscripts for this special issue of Genome Medicine.

Funding

HLR was supported by the National Human Genome Research Institute of the National Institutes of Health (NIH) under award numbers UM1HG008900, U01HG008676, and U41HG006834. DMF was supported by the National Human Genome Research Institute of the NIH under award number RM1HG010461. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Affiliations

  1. Center for Genomic Medicine, Massachusetts General Hospital, Cambridge Street, Boston, MA, 02114, USA
    • Heidi L. Rehm
  2. Medical and Population Genetics, Broad Institute of MIT and Harvard, Main Street, Cambridge, MA, 02142, USA
    • Heidi L. Rehm
  3. Department of Pathology, Harvard Medical School, Shattuck Street, Boston, MA, 02115, USA
    • Heidi L. Rehm
  4. Department of Genome Sciences, University of Washington, 15th Avenue NE, Seattle, WA, 98195-5065, USA
    • Douglas M. Fowler
  5. Canadian Institute for Advanced Research, University Avenue, Toronto, ON, M5G 1M1, Canada
    • Douglas M. Fowler
  6. Department of Bioengineering, University of Washington, 15th Avenue NE, Seattle, WA, 98195-5061, USA
    • Douglas M. Fowler
Contributions

Both authors drafted and edited the manuscript and also approved the final version.

Corresponding authors

Correspondence to Heidi L. Rehm or Douglas M. Fowler.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Cite this article

Rehm, H.L., Fowler, D.M. Keeping up with the genomes: scaling genomic variant interpretation. Genome Med 12, 5 (2020) doi:10.1186/s13073-019-0700-4




Updated: 2020-04-22