Efficient Data Management in Neutron Scattering Data Reduction Workflows at ORNL
arXiv - CS - Databases, Pub Date: 2021-01-05, DOI: arxiv-2101.02591
William F Godoy, Peter F Peterson, Steven E Hahn, Jay J Billings

Oak Ridge National Laboratory (ORNL) experimental neutron science facilities produce 1.2 TB a day of raw event-based data, which is stored using the standard metadata-rich NeXus schema built on top of the HDF5 file format. The performance of several data reduction workflows is largely determined by the time spent in the loading and processing algorithms of Mantid, an open-source data analysis framework used across several neutron science facilities around the world. The present work introduces new data management algorithms to address identified input/output (I/O) bottlenecks in Mantid. First, we introduce an in-memory binary-tree metadata index that resembles NeXus data access patterns to provide a scalable search and extraction mechanism. Second, data encapsulation in Mantid algorithms is optimally redesigned to reduce the total compute and memory runtime footprint associated with metadata I/O reconstruction tasks. Results from this work show speedups in wall-clock time on ORNL data reduction workflows ranging from 11% to 30%, depending on the complexity of the targeted instrument-specific data. Nevertheless, we highlight the need for more research to address reduction challenges as experimental data volumes increase.
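
The in-memory metadata index mentioned in the abstract can be pictured with a minimal sketch. This is not the Mantid implementation: the class name `MetadataIndex`, its methods, and the example paths are hypothetical, and a sorted Python list with binary search stands in for an ordered binary tree. The idea it illustrates is that entry paths are collected once, grouped by NeXus class, and later lookups are answered from memory instead of by re-reading the file's metadata.

```python
from bisect import bisect_left
from collections import defaultdict


class MetadataIndex:
    """Hypothetical index: NeXus class name -> sorted list of entry paths."""

    def __init__(self):
        # Assumption: a sorted list per class approximates an ordered tree.
        self._paths_by_class = defaultdict(list)

    def add(self, nx_class, path):
        """Insert a path in sorted position, ignoring duplicates."""
        paths = self._paths_by_class[nx_class]
        pos = bisect_left(paths, path)
        if pos == len(paths) or paths[pos] != path:
            paths.insert(pos, path)

    def contains(self, nx_class, path):
        """Exact lookup by binary search, without touching the file."""
        paths = self._paths_by_class.get(nx_class, [])
        pos = bisect_left(paths, path)
        return pos < len(paths) and paths[pos] == path

    def under(self, nx_class, prefix):
        """Return every path of the given class that starts with prefix."""
        paths = self._paths_by_class.get(nx_class, [])
        pos = bisect_left(paths, prefix)
        found = []
        # Paths sharing a prefix are contiguous in sorted order.
        while pos < len(paths) and paths[pos].startswith(prefix):
            found.append(paths[pos])
            pos += 1
        return found


# Illustrative entries only; real files define their own layout.
index = MetadataIndex()
index.add("NXevent_data", "/entry/bank2_events")
index.add("NXevent_data", "/entry/bank1_events")
index.add("NXlog", "/entry/DASlogs/proton_charge")
print(index.under("NXevent_data", "/entry/bank"))
```

In a sketch like this the index would be filled in a single pass over the file when it is opened, so each later existence check or prefix query costs a binary search rather than another metadata read.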

Updated: 2021-01-08