HEP-Frame: Improving the efficiency of pipelined data transformation & filtering for scientific analyses
Computer Physics Communications (IF 7.2) Pub Date: 2021-01-28, DOI: 10.1016/j.cpc.2021.107844
André Pereira , Alberto Proença

Software to analyse very large sets of experimental data often relies on a pipeline of irregular computational tasks with decisions to remove irrelevant data from further processing. A user-centred framework was designed and deployed, HEP-Frame, which aids domain experts to develop applications for scientific data analyses and to monitor and control their efficient execution. The key feature of HEP-Frame is the performance portability of the code across different heterogeneous platforms, due to a novel adaptive multi-layer scheduler, seamlessly integrated into the tool, an approach not available in competing frameworks.

The multi-layer scheduler transparently allocates parallel data/tasks across the available heterogeneous resources, dynamically balances threads between data input and computational tasks, adaptively reorders at run time the parallel execution of the pipeline stages for each data stream while respecting data dependencies, and efficiently manages the execution of library functions on accelerators. Each layer implements a specific scheduling strategy: one balances the execution of the computational stages of the pipeline, distributing the stages of the same or different dataset elements among the available computing threads; another controls the order in which the pipeline stages execute, so that most data is filtered out early and the later stages run the computationally heavy tasks; yet another adaptively balances the automatically created threads between data input and the computational tasks, taking into account the requirements of each application.
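
To make the reordering layer concrete, the following minimal C++ sketch (not HEP-Frame code; the Proposition struct, the priority heuristic, and the reorder function are illustrative assumptions) shows one way such a layer could rank propositions once per-stage cost and pass rate have been measured: cheap filters that reject most data are scheduled first, and a proposition is never scheduled before its dependencies.

// Minimal sketch (not HEP-Frame's actual code) of the reordering idea used by
// one scheduler layer: run cheap propositions that reject most data first,
// while never scheduling a proposition before the ones it depends on.
// Assumes the dependency graph is acyclic.
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

struct Proposition {
    std::string name;
    double avgCost;                 // measured average runtime per element (arbitrary units)
    double passRate;                // measured fraction of elements that survive this filter
    std::vector<std::size_t> deps;  // indices of propositions that must run earlier
};

// Priority heuristic: filters that discard more data per unit of cost come first.
static double priority(const Proposition& p) {
    return (1.0 - p.passRate) / p.avgCost;
}

// Greedy dependency-respecting reorder: repeatedly pick the ready proposition
// (all dependencies already scheduled) with the highest priority.
std::vector<std::size_t> reorder(const std::vector<Proposition>& props) {
    std::vector<bool> done(props.size(), false);
    std::vector<std::size_t> order;
    while (order.size() < props.size()) {
        std::size_t best = props.size();
        for (std::size_t i = 0; i < props.size(); ++i) {
            if (done[i]) continue;
            bool ready = true;
            for (std::size_t d : props[i].deps) ready = ready && done[d];
            if (!ready) continue;
            if (best == props.size() || priority(props[i]) > priority(props[best]))
                best = i;
        }
        done[best] = true;
        order.push_back(best);
    }
    return order;
}

int main() {
    std::vector<Proposition> props = {
        {"heavyCut",  9.0, 0.70, {}},
        {"cheapCut",  1.0, 0.20, {}},
        {"dependent", 2.0, 0.50, {0}},  // must run after heavyCut
    };
    for (std::size_t i : reorder(props)) std::cout << props[i].name << '\n';
}

In this sketch, cheapCut runs first because it discards 80% of the data at a fraction of the cost of heavyCut, while dependent is held back until heavyCut has executed.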

Simulated data analyses from sensors in the ATLAS Experiment at CERN evaluated the scheduler efficiency, on dual multicore Xeon servers with and without accelerators, and on servers with the many-core Intel KNL. Experimental results show significantly improved performance of these data analyses due to the HEP-Frame features, and the codes scaled well across multiple servers. Results also show that the HEP-Frame scheduler outperforms its key competitor, the HEFT list scheduler.

The best overall performance improvement over a real, fine-tuned sequential data analysis was impressive on homogeneous and heterogeneous multicore servers as well as on many-core servers: 81x faster on the homogeneous 24+24-core Skylake server, 86x faster on the heterogeneous 12+12-core Ivy Bridge server with a Kepler GPU, and 252x faster on the 64-core KNL server.

Program summary

Program Title: HEP-Frame

CPC Library link to program files: http://dx.doi.org/10.17632/m2jwxshtfz.1

Licensing provisions: GPLv3

Programming language: C++.

Supplementary material: The current HEP-Frame public release available at https://bitbucket.org/ampereira/hep-frame/wiki/Home.

Nature of problem: Scientific data analysis applications are often developed to process large amounts of data obtained through experimental measurements or Monte Carlo simulations, aiming to identify patterns in the data or to test and/or validate theories. These large inputs are usually processed by a pipeline of computational tasks that may filter out irrelevant data (a task and its filter are referred to as a proposition in this communication), preventing it from being processed by subsequent tasks in the pipeline.
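
As a purely illustrative sketch (the Event type and the runPipeline helper are hypothetical names, not part of the HEP-Frame API), a proposition can be viewed as a callable that performs its computation on a dataset element and returns whether later stages should still process it:

// Illustration only, NOT the HEP-Frame interface: a proposition is modelled as
// a callable that both works on a dataset element and decides its fate.
#include <functional>
#include <iostream>
#include <vector>

struct Event { /* a dataset element: sensor readings, reconstructed objects, ... */ };

// A proposition: a computational task plus its filter decision.
using Proposition = std::function<bool(Event&)>;

// Process one dataset element; stop at the first proposition that rejects it.
bool runPipeline(Event& e, const std::vector<Proposition>& pipeline) {
    for (const auto& prop : pipeline)
        if (!prop(e)) return false;   // filtered out of further processing
    return true;                      // survived all propositions
}

int main() {
    std::vector<Proposition> pipeline = {
        [](Event&) { return true;  },  // a task whose filter keeps the element
        [](Event&) { return false; },  // a task whose filter rejects it
    };
    Event e;
    std::cout << (runPipeline(e, pipeline) ? "kept" : "filtered out") << '\n';
}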

This data filtering, coupled with the fact that propositions may have different computational intensities, contributes to the irregularity of the pipeline execution. Depending on the implemented algorithms and input data, scientific data analyses may therefore become I/O-, memory-, or compute-bound. To allow scientists to process more data and obtain more accurate results, their code and data structures should be optimized for the computing resources they can access. Since the main goal of most scientists is to obtain results relevant to their scientific fields, often within strict deadlines, optimizing the performance of their applications is very time-consuming and usually overlooked. Scientists require a software framework to aid the design and development of efficient applications and to control their parallel execution on distinct computing platforms.

Solution method: This work proposes HEP-Frame, a framework to aid the development and efficient execution of pipelined scientific analysis applications on homogeneous and heterogeneous servers. HEP-Frame is a user-centred framework that helps scientists develop applications to analyse data from a large number of dataset elements through a flexible pipeline of propositions. It not only stresses the interface to domain experts, so that code is more robust and developed faster, but also aims at performance portability across different types of parallel computing platforms and at desirable sustainability features. The framework provides efficient parallel code execution without requiring user expertise in parallel computing.
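
A hypothetical usage sketch, assuming names that do not belong to the actual HEP-Frame interface (PipelineSketch, addProposition, run), conveys the intended division of labour: the domain expert writes per-element propositions as plain sequential code, while the framework decides how to distribute their execution across threads.

// Hypothetical usage sketch, NOT the real HEP-Frame API: the user registers
// per-element propositions; the framework object runs them in parallel over
// the dataset elements and applies the early filtering.
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <iostream>
#include <thread>
#include <vector>

struct Event { double missingEnergy = 0.0; int nJets = 0; };
using Proposition = std::function<bool(Event&)>;

class PipelineSketch {
public:
    void addProposition(Proposition p) { props_.push_back(std::move(p)); }

    // Run the whole pipeline over all events, splitting events among threads.
    std::size_t run(std::vector<Event>& events, unsigned nThreads) {
        std::atomic<std::size_t> accepted{0};
        std::vector<std::thread> workers;
        std::size_t chunk = (events.size() + nThreads - 1) / nThreads;
        for (unsigned t = 0; t < nThreads; ++t) {
            workers.emplace_back([&, t] {
                std::size_t begin = t * chunk;
                std::size_t end = std::min(events.size(), begin + chunk);
                for (std::size_t i = begin; i < end; ++i) {
                    bool keep = true;
                    for (const auto& prop : props_)
                        if (!(keep = prop(events[i]))) break;  // early filtering
                    if (keep) ++accepted;
                }
            });
        }
        for (auto& w : workers) w.join();
        return accepted;
    }

private:
    std::vector<Proposition> props_;
};

int main() {
    std::vector<Event> events(1000);
    for (std::size_t i = 0; i < events.size(); ++i)
        events[i] = {double(i % 200), int(i % 8)};

    PipelineSketch pipeline;
    pipeline.addProposition([](Event& e) { return e.nJets >= 4; });           // cheap filter
    pipeline.addProposition([](Event& e) { return e.missingEnergy > 50.0; }); // heavier cut
    std::cout << pipeline.run(events, 4) << " events accepted\n";
}

In the real framework this distribution is handled by the multi-layer scheduler described above, rather than by the fixed chunking used in this sketch.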

Frameworks to aid the design and deployment of scientific code usually fall into two categories: (i) resource-centred, closer to the computing platforms, where execution efficiency and performance portability are the main goals, but which force developers to adapt their code to strict framework constraints; (ii) user-centred, which stress the interface to domain experts to improve code development speed and robustness, aiming to provide desirable sustainability features but disregarding execution performance. There is also a set of frameworks for scientific computing that merge these two categories (Liu et al., 2015 [1]; Deelman et al., 2015 [2]). While they do not have steep learning curves, concessions have to be made to their ease of use to allow for their broader scope of targeted applications. HEP-Frame attempts to bridge this gap, placing itself between a fully user-centred and a fully resource-centred framework, so that users develop code quickly and do not have to worry about its computational efficiency. It addresses (i) by ensuring efficient execution of applications, according to their computational requirements and the available resources on the server, through a multi-layer scheduler, while (ii) is addressed by automatically generating code skeletons, transparently managing the data structure, and automating repetitive tasks.

Additional comments: An early-stage proof of concept was published in conference proceedings (Pereira et al., 2015). However, the HEP-Frame version presented in this communication shares only a very small portion of that code, related to skeleton generation (less than 5% of the overall code), while the rest of the user interface, the multi-layer scheduler, and the parallelization strategies were completely redesigned and re-implemented.



