A survey of localized sequence rearrangements in human DNA
Nucleic Acids Research (IF 16.6), Pub Date: 2017-12-19, DOI: 10.1093/nar/gkx1266
Martin C Frith, Sofia Khan

Genomes mutate and evolve in ways simple (substitution or deletion of bases) and complex (e.g. chromosome shattering). We do not fully understand what types of complex mutation occur, and we cannot routinely characterize arbitrarily-complex mutations in a high-throughput, genome-wide manner. Long-read DNA sequencing methods (e.g. PacBio, nanopore) are promising for this task, because one read may encompass a whole complex mutation. We describe an analysis pipeline to characterize arbitrarily-complex ‘local’ mutations, i.e. intrachromosomal mutations encompassed by one DNA read. We apply it to nanopore and PacBio reads from one human cell line (NA12878), and survey sequence rearrangements, both real and artifactual. Almost all the real rearrangements belong to recurring patterns or motifs: the most common is tandem multiplication (e.g. heptuplication), but there are also complex patterns such as localized shattering, which resembles DNA damage by radiation. Gene conversions are identified, including one between hemoglobin gamma genes. This study demonstrates a way to find intricate rearrangements with any number of duplications, deletions, and repositionings. It demonstrates a probability-based method to resolve ambiguous rearrangements involving highly similar sequences, as occurs in gene conversion. We present a catalog of local rearrangements in one human cell line, and show which rearrangement patterns occur.

Updated: 2017-12-19