Insights into dispersed duplications and complex structural mutations from whole genome sequencing 706 families
bioRxiv - Genomics. Pub Date: 2020-08-10, DOI: 10.1101/2020.08.03.235358
Christopher W. Whelan, Robert E. Handsaker, Giulio Genovese, Seva Kashin, Monkol Lek, Jason Hughes, Joshua McElwee, Michael Lenardo, Daniel MacArthur, Steven A. McCarroll

Two intriguing forms of genome structural variation (SV), dispersed duplications and de novo rearrangements of complex, multi-allelic loci, have long escaped genomic analysis. We describe a new way to find and characterize such variation by combining identity-by-descent (IBD) relationships between siblings with high-precision measurements of segmental copy number. Analyzing whole-genome sequence data from 706 families, we find hundreds of "IBD-discordant" (IBDD) CNVs: loci at which siblings' CNV measurements and IBD states are mathematically inconsistent. We found that commonly IBDD CNVs identify dispersed duplications; we mapped 95 of these common dispersed duplications to their true genomic locations through family-based linkage and population linkage disequilibrium (LD), and found several to be in strong LD with genome-wide association (GWAS) signals for common diseases or gene expression variation at their revealed genomic locations. Other CNVs that were IBDD in a single family appear to involve de novo mutations in complex and multi-allelic loci; we identified 26 de novo structural mutations that had not been detected in earlier analyses of the same families by diverse SV analysis methods. These included a de novo mutation of the amylase gene locus and multiple de novo mutations at chromosome 15q14. Combining these complex mutations with more conventional CNVs, we estimate that segmental mutations larger than 1 kb arise in about one in 22 human meioses. These methods complement previous techniques in that they interrogate genomic regions that are home to segmental duplication, high CNV allele frequencies, and multi-allelic CNVs.
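
The notion of an "IBD-discordant" CNV can be illustrated with a minimal sketch. It is not the authors' method, which the abstract does not spell out. It assumes integer copy-number calls and a known IBD state per sibling pair, and it checks only the simplest inconsistency: siblings who share both parental haplotypes at a locus (IBD2) should carry the same copy number there. A duplicated segment that actually lies elsewhere in the genome segregates independently of the local IBD state, so it can break this rule. The names `SiblingPairCall`, `is_ibd_discordant`, and `discordant_pairs_per_locus` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SiblingPairCall:
    """Hypothetical record for one sibling pair at one locus."""
    locus: str
    ibd_state: int      # haplotypes shared IBD at the locus: 0, 1, or 2
    copy_number_a: int  # integer copy number called for sibling A
    copy_number_b: int  # integer copy number called for sibling B


def is_ibd_discordant(call: SiblingPairCall) -> bool:
    """Return True when the copy numbers contradict the IBD state.

    Assumption: only the IBD2 case is tested. Without parental data,
    any pair of copy numbers is compatible with IBD0 or IBD1, so those
    states are never flagged by this simplified check.
    """
    if call.ibd_state not in (0, 1, 2):
        raise ValueError(f"invalid IBD state: {call.ibd_state}")
    return call.ibd_state == 2 and call.copy_number_a != call.copy_number_b


def discordant_pairs_per_locus(calls: list[SiblingPairCall]) -> Counter:
    """Count discordant sibling pairs at each locus."""
    return Counter(c.locus for c in calls if is_ibd_discordant(c))


if __name__ == "__main__":
    example_calls = [
        SiblingPairCall("locus1", 2, 3, 3),  # IBD2, equal copy number
        SiblingPairCall("locus1", 2, 2, 3),  # IBD2, unequal copy number
        SiblingPairCall("locus2", 1, 2, 4),  # IBD1, not testable here
    ]
    print(discordant_pairs_per_locus(example_calls))
```

Under these assumptions, a locus flagged in many families would correspond to the "commonly IBDD" pattern the abstract associates with dispersed duplications, while a locus flagged in a single family would be a candidate for a de novo event.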

Updated: 2020-08-11