Efficient DNA-based data storage using shortmer combinatorial encoding
bioRxiv - Synthetic Biology Pub Date : 2024-01-02 , DOI: 10.1101/2021.08.01.454622
Inbal Preuss , Zohar Yakhini , Leon Anavy

With the world generating digital data at an exponential rate, DNA has emerged as a promising archival medium. It offers a more efficient and long-lasting digital storage solution due to its durability, physical density, and high information capacity. Research in the field includes the development of encoding schemes that are compatible with existing DNA synthesis and sequencing technologies. Recent studies suggest leveraging the inherent information redundancy of these technologies by using composite DNA alphabets. A major challenge in this approach is the noisy inference process, which has prevented the use of large composite alphabets. This paper introduces a novel approach for DNA-based data storage, offering a 6.5-fold increase in logical density over standard DNA-based storage systems, with near-zero reconstruction error. Combinatorial DNA encoding uses a set of clearly distinguishable DNA shortmers to construct large combinatorial alphabets, where each letter represents a subset of shortmers. The nature of these combinatorial alphabets minimizes mix-up errors, while also ensuring the robustness of the system.
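
The idea of a combinatorial alphabet, where each letter is a subset of shortmers, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the parameters `n_shortmers=16` and `k=5`, the integer indices standing in for actual shortmer sequences, and the function names `build_alphabet`, `encode`, and `decode` are all assumptions chosen only for illustration and are not taken from the abstract.

```python
from itertools import combinations
from math import floor, log2


def build_alphabet(n_shortmers: int, k: int) -> list:
    """Return every k-subset of shortmer indices; each subset is one letter.

    Shortmers are represented by integer indices 0..n_shortmers-1
    (hypothetical stand-ins for real DNA shortmer sequences).
    """
    return list(combinations(range(n_shortmers), k))


# Illustrative parameters (assumptions, not values stated in the abstract).
alphabet = build_alphabet(n_shortmers=16, k=5)
letter_index = {letter: i for i, letter in enumerate(alphabet)}

# Whole bits that a single combinatorial letter can carry.
bits_per_letter = floor(log2(len(alphabet)))


def encode(value: int) -> tuple:
    """Map an integer of bits_per_letter bits to a subset of shortmers."""
    if not 0 <= value < 2 ** bits_per_letter:
        raise ValueError("value does not fit in one letter")
    return alphabet[value]


def decode(observed_shortmers) -> int:
    """Recover the integer from the set of shortmers observed at a position.

    The order in which shortmers are observed does not matter.
    """
    return letter_index[tuple(sorted(set(observed_shortmers)))]


if __name__ == "__main__":
    letter = encode(1234)
    print(len(alphabet), bits_per_letter, letter, decode(letter))
```

Under these assumed parameters a single position carries several times more bits than a single nucleotide (2 bits), which is the kind of logical-density gain the abstract describes. A real decoder would also have to infer the subset from noisy sequencing reads, which this sketch does not model.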

Updated: 2024-01-02