Applying neural networks to predict HPC-I/O bandwidth over seismic data on lustre file system for ExSeisDat
Cluster Computing ( IF 4.4 ) Pub Date : 2021-07-02 , DOI: 10.1007/s10586-021-03347-8
Abdul Jabbar Saeed Tipu 1, 2 , Padraig Ó Conbhuí 2 , Enda Howley 1

HPC (supercomputing) clusters are designed to execute computationally intensive workloads that typically involve large-scale I/O operations, most commonly through a standard MPI library implemented in C/C++. MPI-I/O performance in HPC clusters tends to vary significantly across a range of configuration parameters that are generally not taken into account by the algorithm. It is commonly left to individual practitioners to optimise I/O on a case-by-case basis at code level, which can often lead to a range of unforeseen outcomes. The ExSeisDat utility is built on top of the native MPI-I/O library and comprises Parallel I/O and Workflow Libraries for processing seismic data encapsulated in the SEG-Y file format. The SEG-Y file data structure is complex because trace headers and trace data alternate throughout the file. Its size scales to petabytes, and the chances of I/O performance degradation are further increased by ExSeisDat. This research paper presents a novel study of how I/O performance, measured as bandwidth, changes with various MPI-I/O, Lustre (parallel) file system and SEG-Y file parameters, visualised using parallel plots. Another novel aspect of this research is the predictive modelling of MPI-I/O behaviour over SEG-Y file benchmarks using Artificial Neural Networks (ANNs). The accuracy ranges from 62.5% to 96.5% over the set of trained ANN models. The computed Mean Square Error (MSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) values further support the generalisation of the prediction models. This paper demonstrates that, by using our ANN prediction technique, the configurations can be tuned beforehand to avoid poor I/O performance.
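The abstract names three error metrics (MSE, MAE and MAPE) without defining them. As a minimal sketch of their standard definitions, the Python below computes them for predicted versus measured bandwidth. The function names and the sample values are hypothetical illustrations and are not taken from the paper, whose models and data are not reproduced here.

```python
def mse(actual, predicted):
    """Mean Square Error: average of the squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes every actual value is non-zero, since each error is
    divided by the corresponding actual value.
    """
    ratios = (abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * sum(ratios) / len(actual)


# Hypothetical bandwidth values (e.g. in MB/s), made up for illustration.
measured_bw = [100.0, 200.0, 400.0, 800.0]
predicted_bw = [110.0, 190.0, 420.0, 760.0]

print("MSE :", mse(measured_bw, predicted_bw))
print("MAE :", mae(measured_bw, predicted_bw))
print("MAPE:", mape(measured_bw, predicted_bw))
```

MSE penalises large misses more heavily than MAE, while MAPE is scale-free, which makes it easier to compare runs whose absolute bandwidth differs widely.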




Updated: 2021-07-02