ProvLet: A Provenance Management Service for Long Tail Microscopy Data
arXiv - CS - Systems and Control, Pub Date: 2021-09-22, DOI: arxiv-2109.10897
Hessam Moeini, Todd Nicholson, Klara Nahrstedt, Gianni Pezzarossi

Provenance management is needed to strengthen the security and reliability of long-tail microscopy (LTM) data management systems, but domains with LTM data pose challenges for provenance. Provenance data must be collected more frequently, which increases system overhead (in computation and storage) and causes scalability issues. Moreover, in most scientific application domains a provenance solution must also consider network-related events. Provenance data in LTM data management systems are therefore highly diverse and must be organized and processed carefully. In this paper, we introduce a novel provenance service, called ProvLet, to collect, distribute, analyze, and visualize provenance data in LTM data management systems. Specifically, (1) we address how to filter and store the desired transactions on disk; (2) we consider a data organization model at higher-level data abstractions, such as datasets and collections, that suits step-by-step scientific experiments, and we develop provenance algorithms over these abstractions rather than over low-level abstractions such as files and folders; and (3) we use ProvLet's log files to visualize provenance information for further forensic exploration. Validation of ProvLet with actual long-tail microscopy data, collected over a period of six years, shows a provenance service that yields low system overhead and enables scalability.

Updated: 2021-09-23