Reconstructing double-stranded DNA fragments on a single-molecule level reveals patterns of degradation in ancient samples

  1. Matthias Meyer
  1. Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany
  • 1 Present address: Francis Crick Institute, London NW1 1AT, United Kingdom

  • Corresponding authors: lukas_bokelmann@eva.mpg.de, mmeyer@eva.mpg.de
    Abstract

    Extensive manipulations involved in the preparation of DNA samples for sequencing have hitherto made it impossible to determine the precise structure of double-stranded DNA fragments being sequenced, such as the presence of blunt ends, single-stranded overhangs, or single-strand breaks. We here describe MatchSeq, a method that combines single-stranded DNA library preparation from diluted DNA samples with computational sequence matching, allowing the reconstruction of double-stranded DNA fragments on a single-molecule level. The application of MatchSeq to Neanderthal DNA, a particularly complex source of degraded DNA, reveals that 1- or 2-nt overhangs and blunt ends dominate the ends of ancient DNA molecules and that short gaps exist, which are predominantly caused by the loss of individual purines. We further show that deamination of cytosine to uracil occurs in both single- and double-stranded contexts close to the ends of molecules, and that single-stranded parts of DNA fragments are enriched in pyrimidines. MatchSeq provides unprecedented resolution for interrogating the structures of fragmented double-stranded DNA and can be applied to fragmented double-stranded DNA isolated from any biological source. The method relies on well-established laboratory techniques and can easily be integrated into routine data generation. This possibility is shown by the successful reconstruction of double-stranded DNA fragments from previously published single-stranded sequence data, allowing a more comprehensive characterization of the biochemical properties not only of ancient DNA but also of cell-free DNA from human blood plasma, a clinically relevant marker for the diagnosis and monitoring of disease.
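The sequence-matching idea described in the abstract can be illustrated with a minimal sketch. This is not the published MatchSeq implementation. It assumes that each single-stranded read has already been mapped to a reference and reduced to a start coordinate, an end coordinate, and a strand, and it treats a forward-strand and a reverse-strand read as the two strands of one molecule only when each overlaps exactly one read on the opposite strand. The names `Read`, `match_reads`, and `reconstruct` are hypothetical and chosen for illustration. Single-strand breaks and gaps, where two reads on one strand match a single read on the other, are deliberately not handled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Read:
    """A mapped single-stranded read (hypothetical minimal representation)."""
    start: int   # 0-based leftmost reference position
    end: int     # exclusive rightmost reference position
    strand: str  # "+" or "-"


def overlap(a: Read, b: Read) -> int:
    """Number of reference positions covered by both reads (<= 0 if none)."""
    return min(a.end, b.end) - max(a.start, b.start)


def match_reads(reads: List[Read]) -> List[Tuple[Read, Read]]:
    """Pair each "+" read with a "-" read when the overlap is mutually unique.

    Assumption: the library was made from a diluted sample, so opposite-strand
    reads that overlap are unlikely to come from different molecules.
    """
    plus = [r for r in reads if r.strand == "+"]
    minus = [r for r in reads if r.strand == "-"]
    pairs = []
    for p in plus:
        candidates = [m for m in minus if overlap(p, m) > 0]
        if len(candidates) != 1:
            continue  # unmatched or ambiguous
        m = candidates[0]
        if sum(1 for q in plus if overlap(q, m) > 0) == 1:
            pairs.append((p, m))
    return pairs


def describe_end(five_prime_extension: int) -> Tuple[str, int]:
    """Classify one terminus from how far the 5'-ending strand protrudes."""
    if five_prime_extension == 0:
        return ("blunt", 0)
    if five_prime_extension > 0:
        return ("5' overhang", five_prime_extension)
    return ("3' overhang", -five_prime_extension)


def reconstruct(plus: Read, minus: Read) -> Optional[Dict[str, object]]:
    """Infer the end structure of the duplex formed by two matched reads."""
    duplex = overlap(plus, minus)
    if duplex <= 0:
        return None
    return {
        # Left terminus: the "+" strand carries the 5' end.
        "left": describe_end(minus.start - plus.start),
        # Right terminus: the "-" strand carries the 5' end.
        "right": describe_end(minus.end - plus.end),
        "duplex_length": duplex,
    }


reads = [Read(100, 150, "+"), Read(102, 150, "-"), Read(500, 540, "+")]
pairs = match_reads(reads)
structures = [reconstruct(p, m) for p, m in pairs]
```

In this toy input, the first two reads are matched into one fragment with a 2-nt 5' overhang on the left and a blunt right end, while the third read stays unmatched. A real analysis would also need to track the reference sequence name, mapping quality, and the base composition of the single-stranded parts.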

    Footnotes

    • Received March 24, 2020.
    • Accepted August 7, 2020.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
