Towards autonomic data management for staging-based coupled scientific workflows,Journal of Parallel and Distributed Computing


Journal of Parallel and Distributed Computing (IF 3.8), Pub Date: 2020-07-24, DOI: 10.1016/j.jpdc.2020.07.002
Tong Jin, Fan Zhang, Qian Sun, Melissa Romanus, Hoang Bui, Manish Parashar

Emerging scientific workflows running at extreme scale are composed of multiple applications that interact and exchange data at runtime. Staging-based approaches such as in-situ and in-transit processing are promising, but dynamic behaviors in the coupled applications (e.g., data volumes and distributions) and varying resource constraints at runtime make it challenging to use these techniques efficiently. Addressing these challenges requires fundamental changes in the way workflows are executed at runtime. Specifically, it requires monitoring the operating environment and the running applications, and then adapting and tuning application behaviors and resource allocations at runtime while meeting data management requirements and constraints. In this paper, we propose a policy-based autonomic data management (ADM) approach that adaptively responds at runtime to dynamic data management requirements. We first formulate the schematic abstraction of the ADM approach, including its conceptual model and system elements. We then explore the realization of the ADM runtime and demonstrate how to achieve cross-layer adaptations with pre-defined autonomic policies. We also prototype the ADM approach and evaluate its performance on the Intrepid IBM-BlueGene and Titan Cray-XK7 systems using Chombo-based AMR applications and a visualization application. The experimental results demonstrate its effectiveness in meeting user-defined objectives and accelerating overall scientific discovery.
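The monitor-then-adapt idea described in the abstract can be illustrated with a toy policy loop. This is only a minimal sketch under stated assumptions, not the paper's ADM runtime: the `Policy` class, the `adapt` function, the state keys `data_mb` and `staging_free_mb`, and the two placement rules are all hypothetical names and rules introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical snapshot of monitored runtime metrics (assumed keys, not from the paper).
State = Dict[str, float]


@dataclass
class Policy:
    """A pre-defined rule: if `condition` holds for the state, apply `action`."""
    name: str
    condition: Callable[[State], bool]
    action: Callable[[State], str]


def adapt(state: State, policies: List[Policy], default: str = "no-op") -> str:
    """Return the decision of the first policy whose condition matches the state."""
    for policy in policies:
        if policy.condition(state):
            return policy.action(state)
    return default


# Illustrative policies: choose where to process data based on free staging memory.
POLICIES = [
    Policy(
        name="fits-in-staging",
        condition=lambda s: s["data_mb"] <= s["staging_free_mb"],
        action=lambda s: "in-transit",
    ),
    Policy(
        name="staging-full",
        condition=lambda s: s["data_mb"] > s["staging_free_mb"],
        action=lambda s: "in-situ",
    ),
]

if __name__ == "__main__":
    # Made-up monitoring samples for three time steps of a coupled workflow.
    samples = [
        {"data_mb": 100.0, "staging_free_mb": 500.0},
        {"data_mb": 800.0, "staging_free_mb": 500.0},
        {"data_mb": 300.0, "staging_free_mb": 300.0},
    ]
    for step, state in enumerate(samples):
        print(step, adapt(state, POLICIES))
```

Evaluating the policies in order and taking the first match keeps the decision logic easy to read. A real runtime would presumably combine many more metrics and coordinate adaptations across layers, as the abstract indicates.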


Updated: 2020-08-14
