Design and implementation of an efficient and parallel bad block checker for parallelism of storage devices, Cluster Computing



Design and implementation of an efficient and parallel bad block checker for parallelism of storage devices
Cluster Computing (IF 3.6), Pub Date: 2021-07-02, DOI: 10.1007/s10586-021-03353-w
Jaehyun Han, Guangyu Zhu, Eunseo Lee, Yongseok Son


Cloud computing is being widely adopted in industry because it provides more computation power and improves resource utilization. In cloud computing systems, many users execute various types of applications that produce a large amount of data. To handle this data, cloud computing systems provide diverse, high-performance, large-scale clustered storage devices. At such capacities, improving the performance of storage maintenance is an important issue, because a large capacity can significantly increase the time the system is suspended during maintenance. As a storage maintenance technique, checking for bad blocks, in which data can no longer be accessed, prevents I/O failures in applications. However, an existing bad block checker (e.g., badblocks in Linux) takes a long time, even when the storage devices provide parallelism (e.g., multiple disks or a multi-channel SSD). This is because the existing bad block checker performs I/O and check operations in a serialized manner. To reduce the checking time, we propose an efficient and parallel bad block checker that exploits the parallelism of storage devices. In our scheme, we enable parallel I/O and check operations for bad blocks instead of serialized operations. To do this, we first divide a series of check operations into parallel tasks (i.e., independent tasks). Second, we create a thread pool in which multiple workers fetch their tasks concurrently. Finally, we enable each checker to perform its own check and I/O operations in parallel. We implement and evaluate our checker on a 32-core machine with a disk array and an NVMe SSD. The experimental results show that our proposed bad block checker improves performance by up to 3.7× on the disk array and 7.8× on the NVMe SSD, compared with the existing bad block checker.
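The three steps named in the abstract (split the checks into independent tasks, let a pool of workers fetch them, and have each worker do its own I/O and checking) can be illustrated with a minimal read-only sketch in Python. This is not the authors' implementation: the names `split_ranges`, `check_range`, and `parallel_check`, the 4096-byte block size, and the task granularity are all assumptions made for illustration. The sketch also reads through the page cache, whereas a real checker would more likely bypass it (for example with direct I/O), and it only covers a read test, not the write-pattern tests a tool like badblocks can run.

```python
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4096  # assumed block size in bytes (hypothetical default)


def split_ranges(total_blocks, blocks_per_task):
    """Step 1: divide [0, total_blocks) into independent (start, end) ranges."""
    return [(start, min(start + blocks_per_task, total_blocks))
            for start in range(0, total_blocks, blocks_per_task)]


def check_range(path, start, end, block_size=BLOCK_SIZE):
    """Step 3: read every block in [start, end) and collect the ones that fail."""
    bad = []
    fd = os.open(path, os.O_RDONLY)  # each task uses its own file descriptor
    try:
        for block in range(start, end):
            try:
                data = os.pread(fd, block_size, block * block_size)
            except OSError:
                bad.append(block)  # the read itself failed
                continue
            if len(data) != block_size:
                bad.append(block)  # short read: treat the block as unreadable
    finally:
        os.close(fd)
    return bad


def parallel_check(path, workers=4, blocks_per_task=1024, block_size=BLOCK_SIZE):
    """Step 2: a thread pool whose workers pick up the range tasks concurrently."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)  # size of a file or block device
    finally:
        os.close(fd)
    total_blocks = size // block_size  # whole blocks only
    tasks = split_ranges(total_blocks, blocks_per_task)
    bad = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(check_range, path, s, e, block_size)
                   for s, e in tasks]
        for future in futures:
            bad.extend(future.result())
    return sorted(bad)
```

Calling `parallel_check("/path/to/device", workers=8)` (path hypothetical) returns a sorted list of block numbers that could not be read. Because the ranges do not overlap, the workers share no state and need no locking; whether more workers actually shortens the check depends on how much parallelism the underlying device offers.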


Updated: 2021-07-02
