Abstract
Structural variant (SV) differences between human genomes can cause germline and mosaic disease and contribute to inter-individual variation. Deregulation of accurate DNA repair and genomic surveillance mechanisms results in a large number of SVs in cancer. Analysis of the DNA sequences at SV breakpoints can help identify pathways of mutagenesis and regions of the genome that are more susceptible to rearrangement. Over the past decade, high-throughput genome-wide sequencing of human genomes has enabled large-scale SV analyses. These studies have shed light on the mechanisms and prevalence of complex genomic rearrangements. Recent advances in sequencing and other mapping technologies, together with improved calling algorithms for detecting genomic rearrangements, have helped propel SV detection into population-scale studies and have begun to elucidate previously inaccessible regions of the genome. Here, we discuss the genomic organization of simple and complex SVs, the molecular mechanisms of their formation, and various ways to detect them. We also introduce methods for characterizing SVs and their consequences on human genomes.
Abbreviations
- GRCh38: Genome Reference Consortium Human Build 38
- HGR: human genome reference
- HTS: high-throughput sequencing
- WGS: whole-genome sequencing
- SNV: single-nucleotide variant (1 bp)
- InDel: insertion and deletion (1–49 bp)
- SV: structural variant (≥ 50 bp)
- DSB: double-strand break
- CNV: copy number variant
- bp: base pair
- DEL: deletion
- INS: insertion
- MEI: mobile element insertion
- DUP: duplication
- TRP: triplication
- INV: inversion
- TRA: translocation
- CGR: complex genomic rearrangement
- LOH: loss of heterozygosity
- SNP: single-nucleotide polymorphism
- NHEJ: non-homologous end joining
- NAHR: non-allelic homologous recombination
- FoSTeS: fork stalling and template switching
- MMBIR: microhomology-mediated break-induced replication
- SSA: single-strand annealing
- FISH: fluorescence in situ hybridization
- aCGH: array comparative genomic hybridization
- SCE: sister chromatid exchange
- PacBio: Pacific Biosciences
- CCS: circular consensus sequencing
- CLR: continuous long read
- ONT: Oxford Nanopore Technologies
- SMRT: single-molecule real time
- BrdU: bromodeoxyuridine
- GEM: gel-bead in emulsion
- IGV: Integrative Genomics Viewer
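The size thresholds given in the list above (SNV: 1 bp; InDel: 1–49 bp; SV: ≥ 50 bp) can be illustrated with a minimal Python sketch. It is not taken from the article: the function name `size_class`, the constant `SV_MIN_BP`, and the use of explicit reference and alternate allele strings are assumptions made for illustration. The sketch measures only the change in allele length, so balanced events such as inversions and translocations, which need not change length, fall outside it.

```python
SV_MIN_BP = 50  # SV: >= 50 bp, per the abbreviation list (constant name is hypothetical)


def size_class(ref: str, alt: str) -> str:
    """Return 'SNV', 'InDel', 'SV', or 'other' for a pair of allele strings.

    Assumption: alleles are given as explicit sequences, and the size of an
    insertion or deletion is the absolute difference in allele length.
    """
    # A single-base substitution is an SNV (1 bp).
    if len(ref) == 1 and len(alt) == 1 and ref != alt:
        return "SNV"
    length_change = abs(len(alt) - len(ref))
    if length_change == 0:
        # Identical alleles or multi-base substitutions: not covered by the list.
        return "other"
    if length_change < SV_MIN_BP:
        return "InDel"  # 1-49 bp
    return "SV"  # >= 50 bp


if __name__ == "__main__":
    print(size_class("A", "G"))             # single-base substitution
    print(size_class("A", "A" + "C" * 49))  # 49 bp insertion
    print(size_class("A", "A" + "C" * 50))  # 50 bp insertion
```

Classifying on sequence length alone keeps the example short; a real pipeline would also use the variant type reported by the caller (for example DEL, INS, DUP, INV, TRA) rather than inferring everything from allele strings.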
Acknowledgements
We thank the members of the Beck lab for reading and editing the review, in particular Alex V. Nesta. This work was supported in part by the National Institute of General Medical Sciences grants R00GM120453 and R35GM133600 and startup funds from the University of Connecticut Health and the Jackson Laboratory to Christine R. Beck.
Responsible Editor: Beth Sullivan
Cite this article
Balachandran, P., Beck, C.R. Structural variant identification and characterization. Chromosome Res 28, 31–47 (2020). https://doi.org/10.1007/s10577-019-09623-z
DOI: https://doi.org/10.1007/s10577-019-09623-z