Quantifying molecular bias in DNA data storage.
Nature Communications (IF 16.6), Pub Date: 2020-06-29, DOI: 10.1038/s41467-020-16958-3
Yuan-Jyue Chen 1 , Christopher N Takahashi 2 , Lee Organick 2 , Callista Bee 2 , Siena Dumas Ang 1 , Patrick Weiss 3 , Bill Peck 3 , Georg Seelig 2, 4 , Luis Ceze 2 , Karin Strauss 1

DNA has recently emerged as an attractive medium for archival data storage. Recent work has demonstrated proof-of-principle prototype systems; however, very uneven (biased) sequencing coverage has been reported, which indicates inefficiencies in the storage process. Deviations from the average coverage in the sequence copy distribution can cause either wasteful provisioning in sequencing or an excessive number of missing sequences. Here, we use millions of unique sequences from a DNA-based digital data archival system to study the oligonucleotide copy unevenness problem and show that the two paramount sources of bias are the synthesis and amplification (PCR) processes. Based on these findings, we develop a statistical model for each molecular process as well as for the overall process. We further use our model to explore the trade-offs between synthesis bias, physical storage density, logical redundancy, and sequencing redundancy, providing insights for engineering efficient, robust DNA data storage systems.
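
The link between uneven copy distribution, sequencing redundancy, and missing sequences can be illustrated with a small simulation. The following is a minimal sketch and not the authors' statistical model: it assumes, purely for illustration, that relative sequence abundance is lognormal with log-scale spread `sigma` and that reads per sequence are Poisson-distributed around the mean coverage. The function name and all parameters are hypothetical.

```python
import math
import random


def expected_missing_fraction(sigma, mean_coverage, n_sequences=20000, seed=0):
    """Estimate the expected fraction of sequences that receive zero reads.

    Illustrative assumptions (not taken from the paper):
    - each sequence's relative abundance is lognormal with log-std `sigma`
    - reads per sequence are Poisson with rate proportional to abundance
    """
    rng = random.Random(seed)
    abundances = [rng.lognormvariate(0.0, sigma) for _ in range(n_sequences)]
    mean_abundance = sum(abundances) / n_sequences
    # For a Poisson variable with rate lam, P(zero reads) = exp(-lam).
    p_zero = [
        math.exp(-mean_coverage * a / mean_abundance) for a in abundances
    ]
    return sum(p_zero) / n_sequences


if __name__ == "__main__":
    # Compare an even pool (sigma = 0) with increasingly biased pools.
    for sigma in (0.0, 0.5, 1.0):
        for coverage in (5, 10, 20):
            frac = expected_missing_fraction(sigma, coverage)
            print(f"sigma={sigma:.1f} coverage={coverage:>2} missing={frac:.2e}")
```

Under these assumptions, a more biased pool (larger `sigma`) loses more sequences at the same mean coverage, so it needs either more sequencing redundancy or more logical redundancy to recover the data.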




Updated: 2020-06-29