Connection Science (IF 3.2), Pub Date: 2020-12-21, DOI: 10.1080/09540091.2020.1862058
Xiaotong Wu, Jiaquan Gao, Genlin Ji, Taotao Wu, Yuan Tian, Najla Al-Nabhan
ABSTRACT
With the rapid development of various computing paradigms, the amount of data is growing quickly, which incurs substantial storage overhead. However, existing data deduplication techniques do not make full use of similarity detection to improve storage efficiency and data transmission rates. In this paper, we study how to use duplicate and resemblance detection techniques to compress data further. We first present a framework of
Title: A feature-based intelligent deduplication compression system with extreme resemblance detection