ExaHDF5: Delivering Efficient Parallel I/O on Exascale Computing Systems, Journal of Computer Science and Technology



ExaHDF5: Delivering Efficient Parallel I/O on Exascale Computing Systems
Journal of Computer Science and Technology (IF 1.2), Pub Date: 2020-01-01, DOI: 10.1007/s11390-020-9822-9
Suren Byna, M. Scot Breitenfeld, Bin Dong, Quincey Koziol, Elena Pourmal, Dana Robinson, Jerome Soumagne, Houjun Tang, Venkatram Vishwanath, Richard Warren

Scientific applications at exascale generate and analyze massive amounts of data. A critical requirement of these applications is the capability to access and manage this data efficiently on exascale systems. Parallel I/O, the key technology that enables moving data between compute nodes and storage, faces monumental challenges from the new applications, memory, and storage architectures considered in the designs of exascale systems. As the storage hierarchy expands to include node-local persistent memory, burst buffers, etc., as well as disk-based storage, data movement among these layers must be efficient. Parallel I/O libraries of the future should be capable of handling file sizes of many terabytes and beyond. In this paper, we describe new capabilities we have developed in Hierarchical Data Format version 5 (HDF5), the most popular parallel I/O library for scientific applications. HDF5 is one of the most used libraries at the leadership computing facilities for performing parallel I/O on existing HPC systems. The state-of-the-art features we describe include: Virtual Object Layer (VOL), Data Elevator, asynchronous I/O, full-featured single-writer and multiple-reader (Full SWMR), and parallel querying. We introduce these features, their implementations, and the performance and feature benefits to applications and other libraries.


Updated: 2020-01-01
