Like a rainbow in the dark: metadata annotation for HPC applications in the age of dark data
The Journal of Supercomputing (IF 3.3) Pub Date: 2021-02-01, DOI: 10.1007/s11227-020-03602-6
Björn Schembera

The deluge of dark data is about to happen. Lacking data management capabilities, especially in the field of supercomputing, and missing data documentation (i.e., missing metadata annotation) constitute a major source of dark data. The present work contributes to addressing this challenge by presenting ExtractIng, a generic automated metadata extraction toolkit. Existing metadata of simulation output files, scattered throughout the file system, can be aggregated, parsed and converted to the EngMeta metadata model. Use cases from computational engineering are considered to demonstrate the viability of ExtractIng. The evaluation results show that the metadata extraction is simulation-code independent in the sense that it can handle data outputs from various fields of science, is easy to integrate into simulation workflows, and is compatible with a multitude of computational environments.




Updated: 2021-02-01