Design and Implementation of the Tianhe-2 Data Storage and Management System
Journal of Computer Science and Technology (IF 1.9), Pub Date: 2020-01-01, DOI: 10.1007/s11390-020-9799-4
Yu-Tong Lu, Peng Cheng, Zhi-Guang Chen

With the convergence of high-performance computing (HPC), big data, and artificial intelligence (AI), the HPC community is pushing for “triple use” systems to expedite scientific discoveries. However, supporting these converged applications on HPC systems presents formidable storage and data management challenges, owing to the explosive growth of scientific data and the fundamental differences in I/O characteristics among HPC, big data, and AI workloads. In this paper, we discuss the driving force behind the converging trend, highlight three data management challenges, and summarize our efforts to address them on a typical HPC system at the parallel file system, data management middleware, and user application levels. As HPC systems approach the border of exascale computing, this paper sheds light on how to enable application-driven data management as a preliminary step toward the deep convergence of exascale computing ecosystems, big data, and AI.

Updated: 2020-01-01