A robust benchmark for detection of germline large deletions and insertions.
Nature Biotechnology (IF 46.9), Pub Date: 2020-06-15, DOI: 10.1038/s41587-020-0538-8
Justin M Zook 1 , Nancy F Hansen 2 , Nathan D Olson 1 , Lesley Chapman 1 , James C Mullikin 2 , Chunlin Xiao 3 , Stephen Sherry 3 , Sergey Koren 2 , Adam M Phillippy 2 , Paul C Boutros 4 , Sayed Mohammad E Sahraeian 5 , Vincent Huang 6 , Alexandre Rouette 7 , Noah Alexander 8 , Christopher E Mason 9, 10, 11, 12 , Iman Hajirasouliha 9 , Camir Ricketts 9 , Joyce Lee 13 , Rick Tearle 14 , Ian T Fiddes 15 , Alvaro Martinez Barrio 15 , Jeremiah Wala 16 , Andrew Carroll 17 , Noushin Ghaffari 18 , Oscar L Rodriguez 19 , Ali Bashir 19 , Shaun Jackman 20 , John J Farrell 21 , Aaron M Wenger 22 , Can Alkan 23 , Arda Soylev 24 , Michael C Schatz 25 , Shilpa Garg 26 , George Church 26 , Tobias Marschall 27 , Ken Chen 28 , Xian Fan 29 , Adam C English 30 , Jeffrey A Rosenfeld 31, 32 , Weichen Zhou 33 , Ryan E Mills 33 , Jay M Sage 34 , Jennifer R Davis 34 , Michael D Kaiser 34 , John S Oliver 34 , Anthony P Catalano 34 , Mark J P Chaisson 35 , Noah Spies 36 , Fritz J Sedlazeck 37 , Marc Salit 36

New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution and comprehensiveness. To help translate these methods to routine research and clinical practice, we developed a sequence-resolved benchmark set for identification of both false-negative and false-positive germline large insertions and deletions. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle Consortium integrated 19 sequence-resolved variant calling methods from diverse technologies. The final benchmark set contains 12,745 isolated, sequence-resolved insertion (7,281) and deletion (5,464) calls ≥50 base pairs (bp). The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.51 Gbp and 5,262 insertions and 4,095 deletions supported by ≥1 diploid assembly. We demonstrate that the benchmark set reliably identifies false negatives and false positives in high-quality SV callsets from short-, linked- and long-read sequencing and optical mapping.
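As a rough illustration of how such a benchmark is typically used, the sketch below tallies precision and recall from true-positive, false-positive and false-negative counts. The `ComparisonCounts` class, the helper functions and the example tallies are hypothetical and are not taken from the paper or its tooling; only the insertion, deletion and total counts come from the abstract.

```python
from dataclasses import dataclass

# Counts stated in the abstract.
BENCHMARK_INSERTIONS = 7_281
BENCHMARK_DELETIONS = 5_464
BENCHMARK_TOTAL = 12_745


@dataclass
class ComparisonCounts:
    """Hypothetical tallies from comparing a callset with the benchmark."""
    true_positives: int   # calls that match a benchmark SV
    false_positives: int  # extra calls inside the benchmark regions
    false_negatives: int  # benchmark SVs the callset missed


def precision(c: ComparisonCounts) -> float:
    """Fraction of the callset's calls that match the benchmark."""
    called = c.true_positives + c.false_positives
    return c.true_positives / called if called else 0.0


def recall(c: ComparisonCounts) -> float:
    """Fraction of benchmark SVs that the callset recovered."""
    expected = c.true_positives + c.false_negatives
    return c.true_positives / expected if expected else 0.0


# The insertion and deletion counts sum to the stated total.
assert BENCHMARK_INSERTIONS + BENCHMARK_DELETIONS == BENCHMARK_TOTAL

# Made-up tallies, for illustration only.
example = ComparisonCounts(true_positives=900, false_positives=100, false_negatives=300)
print(f"precision={precision(example):.2f} recall={recall(example):.2f}")
```

How a call is judged to match a benchmark SV (position, size and sequence tolerances) is left out here, since the abstract does not specify it.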




Updated: 2020-06-15