Optimised Lambda Architecture for monitoring scientific infrastructure
IEEE Transactions on Parallel and Distributed Systems (IF 5.3), Pub Date: 2021-06-01, DOI: 10.1109/tpds.2017.2772241
Uthayanath Suthakar, Luca Magnoni, David Ryan Smith, Akram Khan

Within scientific infrastructure, scientists execute millions of computational jobs daily, resulting in the movement of petabytes of data over heterogeneous infrastructure. Monitoring the computing and user activities over such a complex infrastructure is highly demanding. Whereas present solutions are traditionally based on a Relational Database Management System (RDBMS) for data storage and processing, recent studies have evaluated the Lambda Architecture (LA), in particular its data storage and batch processing of large-scale monitoring datasets using Hadoop and its MapReduce framework. Although the LA performed better than the RDBMS in those evaluations, it was fairly complex to implement and maintain. This paper presents an Optimised Lambda Architecture (OLA) using the Apache Spark ecosystem, which models an efficient way of joining batch computation and real-time computation transparently without adding complexity. Three models were explored: pure streaming, pure batch computation, and a combination of both. The OLA is evaluated for the WLCG Data acTivities (WDT) monitoring use case on both the Worldwide LHC Computing Grid (WLCG) Hadoop cluster and the public Amazon cloud infrastructure, demonstrating how the new architecture can offer benefits by combining batch and real-time processing to compensate for batch-processing latency.
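
The idea of combining batch and real-time processing to compensate for batch-processing latency can be illustrated with a minimal sketch. This is not the paper's Spark implementation: it is a plain-Python illustration of the general Lambda Architecture pattern, and the `Transfer` record, its fields, and the `batch_cutoff` parameter are hypothetical names introduced here. Events older than the cutoff are assumed to be covered by the last completed batch run; newer events are covered by the streaming layer, and a query merges the two views.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """Hypothetical monitoring event; field names are assumptions."""
    timestamp: int    # seconds since some epoch
    site: str         # site that moved the data
    bytes_moved: int  # volume transferred


def aggregate(events):
    """Sum transferred bytes per site (same logic for batch and streaming)."""
    totals = {}
    for event in events:
        totals[event.site] = totals.get(event.site, 0) + event.bytes_moved
    return totals


def merge_views(batch_view, speed_view):
    """Combine the batch view with the real-time view into one result."""
    merged = dict(batch_view)
    for site, total in speed_view.items():
        merged[site] = merged.get(site, 0) + total
    return merged


def query(events, batch_cutoff):
    """Answer a query using batch results up to the cutoff, streaming after."""
    batch_view = aggregate(e for e in events if e.timestamp < batch_cutoff)
    speed_view = aggregate(e for e in events if e.timestamp >= batch_cutoff)
    return merge_views(batch_view, speed_view)


events = [
    Transfer(10, "CERN", 100),
    Transfer(20, "FNAL", 50),
    Transfer(30, "CERN", 25),  # arrived after the last batch run
]
result = query(events, batch_cutoff=25)
```

Because both layers share one `aggregate` function in this sketch, the merged result matches what a single pass over all events would produce, which is the property that lets the real-time layer fill the gap left by batch latency.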

Updated: 2021-06-01