High capacity DNA data storage with variable-length Oligonucleotides using repeat accumulate code and hybrid mapping.
Journal of Biological Engineering ( IF 5.7 ) Pub Date : 2019-11-21 , DOI: 10.1186/s13036-019-0211-2
Yixin Wang 1 , Md Noor-A-Rahim 2 , Jingyun Zhang 3, 4 , Erry Gunawan 1 , Yong Liang Guan 1 , Chueh Loo Poh 3, 4

Background: With its inherently high density and durability, DNA has recently been recognized as an outstanding medium for storing enormous amounts of data over millennia. To overcome the limitations of a recently reported high-capacity DNA data storage system while achieving a competitive information capacity, we explore a new coding system that facilitates the practical implementation of high-capacity DNA data storage.

Results: In this work, we devised and implemented a DNA data storage scheme with variable-length oligonucleotides (oligos), introducing a hybrid DNA mapping scheme that converts digital data to DNA records. The encoded DNA oligos store 1.98 bits per nucleotide (bits/nt) on average, approaching the upper bound of 2 bits/nt, while conforming to the biochemical constraints. In addition, an oligo-level repeat-accumulate coding scheme addresses data loss and corruption in the biochemical processes. In a wet-lab experiment, 379.1 KB of data was retrieved error-free with a minimum coverage of 10x, validating the error resilience of the proposed coding scheme. Theoretical analysis further shows that the proposed scheme exhibits a net information density (user bits per nucleotide) of 1.67 bits/nt, achieving 91% of the information capacity.

Conclusion: To advance towards practical implementations of DNA storage, we proposed and tested a DNA data storage system that combines a high-potential mapping (bits-to-nucleotide conversion) scheme with a low-redundancy yet highly efficient error-correction code design. The reported advance moves us closer to a practical high-capacity DNA data storage system.
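The densities quoted in the abstract can be related with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the net density equals the mapping density multiplied by a single overall rate (lumping together error-correction parity, indexing, and any other overhead, which the abstract does not break down), and the function names are hypothetical. The fixed two-bit mapping is included solely to show where the 2 bits/nt upper bound comes from; it is not the paper's hybrid mapping and ignores the biochemical constraints.

```python
import math

# Figures quoted in the abstract.
MAPPING_DENSITY = 1.98     # bits/nt stored by the encoded oligos, on average
NET_DENSITY = 1.67         # bits/nt of user data after all redundancy
CAPACITY_FRACTION = 0.91   # net density as a fraction of the information capacity

# Upper bound for a four-letter alphabet {A, C, G, T}: log2(4) = 2 bits/nt.
UPPER_BOUND = math.log2(4)


def implied_overall_rate(net: float, mapping: float) -> float:
    """Share of stored bits that are user bits, ASSUMING net = mapping * rate."""
    return net / mapping


def implied_capacity(net: float, fraction: float) -> float:
    """Capacity in bits/nt implied by 'net density is `fraction` of capacity'."""
    return net / fraction


def naive_two_bit_map(bits: str) -> str:
    """Fixed 2-bit-per-nucleotide mapping (illustration only, NOT the hybrid scheme)."""
    if len(bits) % 2 != 0 or set(bits) - {"0", "1"}:
        raise ValueError("expected an even-length string of 0s and 1s")
    table = {"00": "A", "01": "C", "10": "G", "11": "T"}
    return "".join(table[bits[i:i + 2]] for i in range(0, len(bits), 2))


if __name__ == "__main__":
    print(f"upper bound:          {UPPER_BOUND:.2f} bits/nt")
    print(f"implied overall rate: {implied_overall_rate(NET_DENSITY, MAPPING_DENSITY):.3f}")
    print(f"implied capacity:     {implied_capacity(NET_DENSITY, CAPACITY_FRACTION):.3f} bits/nt")
    print(f"naive mapping:        {naive_two_bit_map('00011011')}")
```

Under these assumptions, roughly 84% of the stored bits carry user data, and the quoted 91% figure corresponds to an information capacity of about 1.84 bits/nt.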

Updated: 2020-04-22