Dynamic power management for value-oriented schedulers in power-constrained HPC system☆
Introduction
Improving the productivity of a high performance computing (HPC) system has been a non-trivial challenge since the inception of petascale computing. As modern HPC systems head towards the exascale era, the design complexity of solutions to the HPC productivity challenge is compounded by an additional constraint on system-wide power consumption [1]. Traditionally, the performance of an HPC system is defined in terms of floating-point operations per second (FLOPS). However, numerous studies have shown that FLOPS is an inadequate metric for defining HPC productivity [2], [3], [4], [5], [6], [7]. As an alternative, these studies define HPC productivity in terms of the importance, or value, of the outputs generated on the system at the time of completion of each application. To measure the importance of an application output, a time-dependent value (utility) function is assigned to each job [2], [3], [4]. Several researchers have adapted this time-dependent value function definition and proposed value-based resource management heuristics for an oversubscribed system to maximize HPC productivity [6], [7], [8], [9], [10], [11], [12], [13]. However, none of these approaches considers system-wide power as a constrained resource.
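As a minimal sketch of what a time-dependent value function can look like, the Python snippet below assumes a simple shape that is not taken from this paper: a job earns its full value if it completes by a soft deadline, the value then decays linearly, and it reaches zero at a hard deadline. The names `job_value`, `productivity`, `max_value`, `soft_deadline`, and `hard_deadline` are hypothetical, and summing the earned values to obtain productivity is likewise an assumption made for illustration.

```python
def job_value(completion_time: float, max_value: float,
              soft_deadline: float, hard_deadline: float) -> float:
    """Value earned by a job that completes at `completion_time`.

    Assumed shape (illustrative only): full value up to the soft deadline,
    linear decay afterwards, zero at or after the hard deadline.
    """
    if hard_deadline <= soft_deadline:
        raise ValueError("hard_deadline must be later than soft_deadline")
    if completion_time <= soft_deadline:
        return max_value
    if completion_time >= hard_deadline:
        return 0.0
    # Fraction of the decay window that is still left at completion.
    remaining = (hard_deadline - completion_time) / (hard_deadline - soft_deadline)
    return max_value * remaining


def productivity(completed_jobs):
    """Sum of earned values over (completion_time, max_value, soft, hard) tuples."""
    return sum(job_value(*job) for job in completed_jobs)
```

Under this assumed shape, a scheduler that finishes a high-value job before its soft deadline earns more than one that finishes the same job late, which is the behavior a value-based heuristic tries to exploit.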
The U.S. Department of Energy has mandated that a future exascale system operate under a strict power budget of 20 MW to 30 MW to support efficient electricity generation and distribution, and to keep the operational cost of an exascale computing system manageable [1]. This creates a need for new value-based resource management algorithms that address the productivity challenge for a power-constrained system. In our earlier work, we explored various static power allocation strategies for two value-based algorithms: value-per-time (VPT) [14], [15] and value-per-energy (VPE) [12]. Static power allocation refers to a fixed allocation of the power budget at the start of a job’s execution on the HPC system. To the best of our knowledge, these are the only studies that address the productivity challenge in a power-constrained environment. However, static power allocation strategies suffer from resource under-utilization because, after a job completes, the freed resources are left unused in the absence of a new job. Furthermore, static policies use resources inefficiently: once a low-value job is scheduled, a newly arrived high-value job may starve if the system does not have sufficient idle nodes or power to execute it. To overcome these disadvantages, in this work we propose a novel dynamic power reallocation strategy for the value-based algorithms. The term dynamic refers to changes in a job’s power budget during its execution on the HPC system. The contributions of this study are:
- We propose a novel strategy to dynamically modify the power budget of the running jobs to maximize the productivity of the value-based algorithms.
- We propose modifications to the traditional static VPE and VPT algorithms by adopting our dynamic strategy and demonstrate improvement in HPC productivity compared to their static variants.
- We propose a low-overhead offline-modeling approach to create power-execution time models for HPC applications. These models are necessary to implement our power-aware value-based algorithms.
- We expose the advantages and disadvantages of the static power-aware VPE and VPT algorithms under different system-wide power constraints.
- We demonstrate the superiority of our dynamic strategy in improving productivity, resource utilization, and job completion rate against the state-of-the-art static algorithms based on real HPC workload traces.
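To make the contrast between static and dynamic allocation concrete, the following toy Python sketch hands the power freed by a completed job to the jobs that are still running, instead of leaving it idle. It is not the strategy proposed in this paper, whose details are not given in this section. The highest-value-first policy, the function name `redistribute_freed_power`, and the dictionary keys `power`, `max_power`, and `value` are all assumptions made for illustration.

```python
def redistribute_freed_power(running, freed_watts):
    """Hand freed power to running jobs, highest value first (toy policy).

    `running` maps a job name to a dict with keys 'power', 'max_power',
    and 'value'. Job power caps are raised in place. Returns the watts
    left over once every job has reached its maximum power.
    """
    by_value = sorted(running, key=lambda name: running[name]["value"], reverse=True)
    for name in by_value:
        if freed_watts <= 0:
            break
        job = running[name]
        headroom = job["max_power"] - job["power"]
        grant = min(headroom, freed_watts)
        job["power"] += grant
        freed_watts -= grant
    return freed_watts


# Hypothetical usage: a finished job releases 60 W.
running_jobs = {
    "a": {"power": 100, "max_power": 150, "value": 5},
    "b": {"power": 100, "max_power": 120, "value": 9},
}
leftover = redistribute_freed_power(running_jobs, 60)
```

Under a static policy the 60 W in this example would stay unused until a new job arrives; in the sketch, job `b` is raised to its 120 W maximum and job `a` receives the remaining 40 W.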
A preliminary version of this work appeared at the “Parallel and Distributed Computing, Applications, and Technologies (PDCAT 2019)” conference [16]. In that version, we introduced the concept of a dynamic power allocation strategy for the VPT algorithm and demonstrated its strength in improving system productivity and resource utilization compared to the static power-aware VPT algorithms [15]. Our evaluation was limited to synthetic workloads. In this paper, we extend the preliminary version with the following contributions:
- We introduce the first dynamic implementation of the VPE algorithm based on our dynamic strategy.
- We expand our analysis to include dynamic-VPE with respect to dynamic-VPT and state-of-the-art static algorithms. We show that not only the VPT algorithm but also the VPE algorithm benefits from our dynamic strategy as power becomes a limited resource.
- We mathematically formalize the decisions of all the static and dynamic algorithms used in this study.
- We enhance our simulation environment to include real HPC workloads and validate the findings of our preliminary work based on synthetic workloads.
- We perform an in-depth analysis of our algorithms and their impact on different performance metrics such as productivity, resource utilization, scheduling overhead, and job completion rate.
- We evaluate the impact of different power-allocation strategies on the individual job execution time and energy consumption under different system-wide power constraints.
The rest of this paper is organized as follows. We provide an overview of our job-value function and present a literature review in Section 2. We present the details of our target HPC environment and mathematically describe the objective function in Section 3. Next, we present the steps for creating power-execution time models for the target environment in Section 4. We describe and formulate the static power-aware value-based algorithms and introduce our novel dynamic power allocation strategy for the VPT and VPE algorithms in Section 5. We present our experimental setup and performance evaluation in Section 6. Finally, we conclude our study and discuss planned future work in Section 7.
Section snippets
Background and related work
A large number of users are migrating to HPC systems as the computation requirements of applications increase with the explosion in data. In a traditional HPC environment, a user submits an application as a job with a fixed priority and then waits for the job to complete. In this traditional approach to job submission, it is common for a user-submitted job to wait in the resource allocation queue for a duration that is significantly longer than its actual execution time
HPC environment model
In this work, we model an HPC system composed of homogeneous nodes, and each node is composed of one or more computing units (multi-core CPU), memory, secondary storage, and network interface card. Each compute unit has instrumentation to monitor and control its power consumption. The power consumption of the remaining components is excluded from the system-wide power budget. Based on the system-wide power budget determined by the system administrator, the resource manager relies on the
Power-execution time model
Predicting the execution time of an application under given resource constraints is a critical step in value-based algorithms to make informed scheduling decisions. In the literature, such predictions are made either using historical data [6], [12], [18] or application performance models [14], [23], [33], [34]. In this work, we choose to create application-specific performance models for two reasons. First, the workload analysis by Antypas et al. on the Hopper production system at the National
Overview
In our previous work [14], [15], [39], we used the value-per-time (VPT) [6] algorithm to explore various power allocation strategies. We refer to this algorithm as the baseline value-per-time (or baseline-VPT). In the baseline-VPT, the scheduling decisions for the waiting jobs are made at the occurrence of a mapping event. In value-based algorithms, a mapping event corresponds to the instance when scheduling decisions are made on the waiting or newly arrived jobs. A mapping event triggers the
Overview
We utilize an in-house simulation environment to simulate an HPC system composed of 2048 nodes and conduct our evaluations based on a realistic workload trace of jobs generated on the BlueGene/L system [41], where we simulate job arrivals in the system. Each node in the system contains two compute units (Intel Xeon E5-2695 v2). We limit the minimum and maximum power consumption for each compute unit to 60 and 115 W, respectively, as per the specification of the selected CPU. We create 30
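The node count and per-unit power limits quoted above bound the total compute-unit power of the simulated system. The short Python calculation below works this out under two simplifying assumptions that are not stated in the text: every compute unit is powered on, and an inactive unit draws nothing. The helper names `system_power_bounds` and `max_active_units` are hypothetical.

```python
NODES = 2048                 # simulated nodes (from the text)
UNITS_PER_NODE = 2           # compute units per node (from the text)
P_MIN_W, P_MAX_W = 60, 115   # per-unit power limits in watts (from the text)


def system_power_bounds(nodes=NODES, units=UNITS_PER_NODE,
                        p_min=P_MIN_W, p_max=P_MAX_W):
    """Return (lowest, highest) total compute-unit power in watts,
    assuming every unit is active."""
    total_units = nodes * units
    return total_units * p_min, total_units * p_max


def max_active_units(budget_w, p_min=P_MIN_W,
                     total_units=NODES * UNITS_PER_NODE):
    """Most units that fit in `budget_w` when each runs at the minimum cap
    (hypothetical helper; assumes inactive units draw no power)."""
    return min(total_units, budget_w // p_min)


low, high = system_power_bounds()
print(low, high)  # 245760 471040
```

With 4096 compute units, a system-wide budget below about 471 kW cannot run every unit at its maximum cap, and a budget below about 246 kW cannot keep every unit active even at the minimum cap. That range is where the choice of power allocation strategy matters.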
Conclusion and future work
In this study, we introduce a novel dynamic power allocation strategy that rearranges the distribution of the system-wide power among the jobs to improve productivity, resource utilization, and job completion rate in a power-constrained HPC system. Using real HPC workloads, we demonstrate that the algorithms based on the dynamic strategy are consistently superior to their static variants under different power constraints.
In our simulations, we represent
CRediT authorship contribution statement
Nirmal Kumbhare: Conceptualization, Methodology, Software, Validation, Investigation, Visualization, Writing - original draft. Ali Akoglu: Conceptualization, Investigation, Validation, Writing - original draft, Supervision. Aniruddha Marathe: Formal analysis, Writing - review & editing. Salim Hariri: Writing - review & editing. Ghaleb Abdulla: Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (41)
- et al. Value of service based resource management for large-scale computing systems. Clust. Comput. (2017)
- X. Xu, W. Dou, X. Zhang, J. Chen, EnReal: an energy-aware resource allocation method for scientific workflow executions...
- et al. Practical resource management in power-constrained, high performance computing. 24th Int. Symposium on High-Performance Parallel and Distributed Computing (HPDC) (2015)
- R. Lucas, J. Ang, K. Bergman, S. Borkar, W. Carlson, L. Carrington, G. Chiu, R. Colwell, W. Dally, J. Dongarra, Top ten...
- et al. A framework for measuring supercomputer productivity. Int. J. High Perform. Comput. Appl. (2004)
- et al. Measuring high performance computing productivity. Int. J. High Perform. Comput. Appl. (2004)
- High performance computing productivity model synthesis. Int. J. High Perform. Comput. Appl. (2004)
- Productivity metrics and models for high performance computing. Int. J. High Perform. Comput. Appl. (2004)
- et al. Utility functions and resource management in an oversubscribed heterogeneous computing environment. IEEE Trans. Comput. (2015)
- et al. On recent advances in time/utility function real-time scheduling and resource management. 8th IEEE Int. Symposium on Object-Oriented Real-Time Distributed Computing (ISORC) (2005)
- A time-driven scheduling model for real-time operating systems. 6th IEEE Real-Time Systems Symposium (RTSS)
- A scheduling algorithm for tasks described by time value function. Real-Time Syst.
- Performance optimization based on analytical modeling in a real-time system with constrained time/utility functions. IEEE Trans. Comput.
- Precise and realistic utility functions for user-centric performance analysis of schedulers. 16th Int. Symposium on High Performance Distributed Computing (HPDC)
- Utility-based resource management in an oversubscribed energy-constrained heterogeneous environment executing parallel applications. Parallel Comput.
- A value-oriented job scheduling approach for power-constrained and oversubscribed HPC systems. IEEE Trans. Parallel Distrib. Syst.
- Value based scheduling for oversubscribed power-constrained homogeneous HPC systems. International Conference on Cloud and Autonomic Computing (ICCAC)
- Adaptive power reallocation for value-oriented schedulers in power-constrained HPC. Parallel and Distributed Computing, Applications and Technologies (PDCAT)
- What’s working in HPC: Investigating HPC user behavior and productivity
- Value-based resource management in high-performance computing systems. 7th Workshop on Scientific Cloud Computing
☆ This work is partly supported by the National Science Foundation (NSF) research project NSF CNS-1624668. A part of this work is also performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 (LLNL-JRNL-780060).