Accuracy and reproducibility of somatic point mutation calling in clinical-type targeted sequencing data
BMC Medical Genomics (IF 2.7), Pub Date: 2020-10-15, DOI: 10.1186/s12920-020-00803-z
Ali Karimnezhad, Gareth A. Palidwor, Kednapa Thavorn, David J. Stewart, Pearl A. Campbell, Bryan Lo, Theodore J. Perkins

Treating cancer depends in part on identifying the mutations driving each patient’s disease. Many clinical laboratories are adopting high-throughput sequencing for assaying patients’ tumours, applying targeted panels to formalin-fixed paraffin-embedded tumour tissues to detect clinically-relevant mutations. While there have been some benchmarking and best practices studies of this scenario, much variant calling work focuses on whole-genome or whole-exome studies, with fresh or fresh-frozen tissue. Thus, definitive guidance on best choices for sequencing platforms, sequencing strategies, and variant calling for clinical variant detection is still being developed. Because ground truth for clinical specimens is rarely known, we used the well-characterized Coriell cell lines GM12878 and GM12877 to generate data. We prepared samples to mimic as closely as possible clinical biopsies, including formalin fixation and paraffin embedding. We evaluated two well-known targeted sequencing panels, Illumina’s TruSight 170 hybrid-capture panel and the amplification-based Oncomine Focus panel. Sequencing was performed on an Illumina NextSeq500 and an Ion Torrent PGM respectively. We performed multiple replicates of each assay, to test reproducibility. Finally, we applied four different freely-available somatic single-nucleotide variant (SNV) callers to the data, along with the vendor-recommended callers for each sequencing platform. We did not observe major differences in variant calling success within the regions that each panel covers, but there were substantial differences between callers. All had high sensitivity for true SNVs, but numerous and non-overlapping false positives. Overriding certain default parameters to make them consistent between callers substantially reduced discrepancies, but still resulted in high false positive rates. 
Intersecting results from multiple replicates or from different variant callers eliminated most false positives, while maintaining sensitivity. Reproducibility and accuracy of targeted clinical sequencing results depend less on sequencing platform and panel than on variability between replicates and downstream bioinformatics. Differences in variant callers’ default parameters are a greater influence on algorithm disagreement than other differences between the algorithms. Contrary to typical clinical practice, we recommend employing multiple variant calling pipelines and/or analyzing replicate samples, as this greatly decreases false positive calls.
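
The intersection strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `intersect_calls` and `score`, the representation of a variant as a (chromosome, position, ref, alt) tuple, and all coordinates in the toy data are assumptions made purely for illustration. The sketch keeps only the variants reported by every caller (or every replicate) and compares the result against a known truth set.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed representation: (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def intersect_calls(call_sets: Iterable[Set[Variant]]) -> Set[Variant]:
    """Keep only the variants that appear in every call set."""
    sets = list(call_sets)
    if not sets:
        return set()
    return set.intersection(*sets)


def score(calls: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Count true and false positives and compute sensitivity against a truth set."""
    true_positives = len(calls & truth)
    false_positives = len(calls - truth)
    sensitivity = true_positives / len(truth) if truth else float("nan")
    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "sensitivity": sensitivity,
    }


# Toy data with invented coordinates; these are not results from the paper.
truth = {("chr7", 100, "A", "T"), ("chr12", 200, "G", "A")}
caller_a = truth | {("chr1", 10, "C", "T"), ("chr2", 20, "G", "C")}
caller_b = truth | {("chr3", 30, "T", "G")}

consensus = intersect_calls([caller_a, caller_b])

print("caller A :", score(caller_a, truth))
print("caller B :", score(caller_b, truth))
print("consensus:", score(consensus, truth))
```

In this toy case each caller finds both true variants but adds its own, non-overlapping false positives, so the intersection removes them without losing sensitivity. The same function applies to replicates by passing one call set per replicate. A real workflow would read calls from VCF files and normalize variant representation before comparing them.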

Updated: 2020-10-16