Interplay between probe design and test performance: overlap between genomic regions of interest, capture regions and high quality reference calls influence performance of WES-based assays.
Human Genetics (IF 5.3), Pub Date: 2020-07-05, DOI: 10.1007/s00439-020-02201-y
Erinija Pranckeviciene 1, 2 , Lemuel Racacho 1, 3 , Mahdi Ghani 1 , Landry Nfonsam 1 , Ryan Potter 1 , Elizabeth Sinclair-Bourque 1 , Gabrielle Mettler 1 , Amanda Smith 1, 4 , Lucas Bronicki 1, 4 , Lijia Huang 1, 4 , Olga Jarinova 1, 4

Whole exome sequencing (WES)-based assays undergo rigorous validation before being implemented in diagnostic laboratories. This validation process generates experimental evidence that allows laboratories to predict the performance of the intended assay. The NA12878 Genome in a Bottle (GIAB) HapMap reference sample is commonly used for validation in diagnostic laboratories. We investigated which data points should be taken into consideration when validating WES-based assays using the GIAB reference in a diagnostic setting. We delineate specific factors that require special consideration and identify OMIM genes associated with diseases that may ‘bypass’ validation. Four replicates of the NA12878 sample were sequenced at the CHEO Genetics Diagnostic Laboratory on a NextSeq 500; the data were analyzed using the bcbio-nextgen v1.1.2 pipeline. The hap.py validation engine, the Real Time Genomics vcfeval tool, and the high-confidence (HC) variant calls in HC regions available for the GIAB sample were used to validate the obtained variant calls. The same validation process was then used to evaluate variant calls obtained for the same sample by two other clinical diagnostic laboratories. We showed that variant calls in NA12878 can be confidently measured only in the regions that intersect between the GIAB HC regions and the target regions of exome capture. Of the 4139 (as of October 2019) OMIM genes associated with a phenotype and having a known molecular basis of disease, 84 were fully outside of the GIAB HC regions, and many of the remaining OMIM genes were only partially covered by the HC regions. A significant proportion of variants identified in the NA12878 sample outside of the HC regions have unknown (UNK) status due to the absence of HC reference alleles. Verification of such calls is possible either by an alternative truth set or by orthogonal testing.
Similarly, many variants outside of exome capture regions, if not accounted for, will be deemed false negatives due to insufficient probe coverage. Our results demonstrate the importance of the intersection between genomic regions of interest, capture regions, and the high-confidence regions. If this intersection is not considered, false and ambiguous variant calls could have a negative impact on the diagnostic accuracy of the intended WES-based diagnostic assay and increase the need for confirmatory testing. To enable laboratories to identify ‘problematic’ regions and optimize validation efforts, we have made our VCF and BED files available in the UCSC Genome Browser: NA12878 WES Benchmark. Because relevant genes and genome annotations are evolving, we implemented a general-purpose algorithm to cross-reference OMIM genes with the genomic regions of interest; it can be applied to capture genes/regions outside HC regions (see the repository of data material section).
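The region logic described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or their repository code; it only assumes BED-style half-open intervals on a single chromosome, treats each gene as one span, and uses made-up coordinates and hypothetical gene names (GENE_A, GENE_B, GENE_C). It intersects HC regions with capture targets to obtain the measurable regions, then labels each gene as fully inside, partially covered, or fully outside.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), as in BED files


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the regions covered by both interval lists."""
    a, b = merge(a), merge(b)
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def classify(gene: Interval, regions: List[Interval]) -> str:
    """Label a gene span as 'full', 'partial' or 'outside' the regions."""
    covered = sum(end - start for start, end in intersect([gene], regions))
    if covered == 0:
        return "outside"
    if covered == gene[1] - gene[0]:
        return "full"
    return "partial"


# Toy coordinates, purely illustrative (not real GIAB or capture data).
hc_regions = [(100, 200), (300, 400)]   # assumed high-confidence regions
capture_targets = [(150, 350)]          # assumed exome capture targets
measurable = intersect(hc_regions, capture_targets)

genes: Dict[str, Interval] = {
    "GENE_A": (160, 190),
    "GENE_B": (180, 320),
    "GENE_C": (210, 290),
}
status = {name: classify(span, measurable) for name, span in genes.items()}
print(measurable)
print(status)
```

In practice the intervals would be read per chromosome from the HC and capture BED files, and a tool such as bedtools could perform the intersection. Variants in genes labelled "outside" or "partial" are the ones likely to need an alternative truth set or orthogonal testing.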




Updated: 2020-07-05