SimpleSync: A parallel delta synchronization method based on Flink
Concurrency and Computation: Practice and Experience (IF 1.5), Pub Date: 2021-07-22, DOI: 10.1002/cpe.6327
Changjian Zhang 1, Deyu Qi 1, Wenlin Li 1, Wenhao Huang 1, Xinyang Wang 2

Cloud storage services are now widely deployed in industry, yet delta synchronization, one of their key technologies, has not seen a major breakthrough. Almost all existing research builds on the synchronization process proposed by the Rsync algorithm and adds optimizations to it, without fully considering the particular characteristics of cloud storage services. This paper proposes SimpleSync, a new delta synchronization method. It exploits the fact that the server in a cloud storage service does not actively modify backup files, removes the redundant steps in Rsync, and lets the client and the server synchronize in a single communication. In addition, based on the server-side logic for processing synchronization requests, this paper proposes, to the best of our knowledge for the first time, a design that processes them in parallel with the Flink framework. After the server receives a synchronization request, SimpleSync first buffers it in Kafka and then uses Flink to process it in parallel. Extensive experiments compare SimpleSync with other delta synchronization algorithms. The results show that SimpleSync has a clear advantage in synchronization performance and that it synchronizes files correctly.
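The single-communication idea can be illustrated with a minimal sketch. The abstract does not describe SimpleSync's actual mechanism, so everything here is an assumption made for illustration: the client is assumed to cache block digests of the last synced version (safe only because the server never changes its copy), and the names `block_hashes`, `make_delta`, `apply_delta`, and the `BLOCK` size are hypothetical. A real implementation would likely use a rolling checksum instead of hashing every window, and the Kafka and Flink stages are not modeled.

```python
import hashlib

BLOCK = 4  # tiny block size, chosen only to keep the example readable


def block_hashes(data: bytes) -> dict:
    """Map the digest of each full block to its index in the synced version."""
    table = {}
    for i in range(0, len(data) - BLOCK + 1, BLOCK):
        digest = hashlib.md5(data[i:i + BLOCK]).hexdigest()
        table.setdefault(digest, i // BLOCK)
    return table


def make_delta(cached: dict, new: bytes) -> list:
    """Client side: build the delta from locally cached digests.

    No request to the server is needed first, because (by assumption)
    the server copy still equals the version the digests were taken from.
    """
    delta, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + BLOCK]
        idx = cached.get(hashlib.md5(window).hexdigest())
        if idx is not None and len(window) == BLOCK:
            if literal:  # flush pending unmatched bytes first
                delta.append(("data", bytes(literal)))
                literal = bytearray()
            delta.append(("copy", idx))  # reuse a block the server already has
            pos += BLOCK
        else:
            literal.append(new[pos])  # unmatched byte, sent literally
            pos += 1
    if literal:
        delta.append(("data", bytes(literal)))
    return delta


def apply_delta(old: bytes, delta: list) -> bytes:
    """Server side: rebuild the new file from its old copy and the delta."""
    out = bytearray()
    for op, arg in delta:
        if op == "copy":
            out += old[arg * BLOCK:(arg + 1) * BLOCK]
        else:
            out += arg
    return bytes(out)


old = b"abcdefghijkl"          # version held by both sides after the last sync
new = b"abcdXXefghijklYY"      # client's edited version
cached = block_hashes(old)     # kept on the client since the last sync
delta = make_delta(cached, new)  # the only message sent to the server
restored = apply_delta(old, delta)
```

In this sketch the delta is the one message that crosses the network; the client never has to fetch checksums from the server before computing it.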

Updated: 2021-09-22