Dynamic Power Management Techniques for Modern Heterogeneous HPC Systems

Abstract

The widespread adoption of heterogeneous architectures in High Performance Computing (HPC) brings many challenges and opportunities for power delivery and management. In this talk, we explore recent innovations, co-designed across the hardware and software layers, that help characterize workload behavior and perform automated, nuanced optimizations of the underlying system for maximum performance and efficiency. This work empirically demonstrates job energy savings with negligible performance degradation on a variety of customer-relevant workloads. It combines software and hardware telemetry to feed adaptive heuristics that dynamically modulate power and frequency limits across the most power-hungry compute elements of the target systems, including CPUs and GPUs. This is done using GEOPM, an open-source runtime framework focused on facilitating the adoption and exploration of power management techniques and related hardware features. This work also demonstrates practical HPC applications of Intel® Speed Select Technology (SST) on 3rd Gen Intel® Xeon® Scalable processors to increase the performance of workloads that exhibit some form of load imbalance, even when running at full TDP (Thermal Design Power). Finally, we discuss software architecture considerations of the GEOPM framework designed to facilitate collaboration and integration with third-party tools and to support a variety of hardware platforms.

Speaker

Federico Ardanaz, Power Architect and Principal Engineer, GEOPM, Intel Corporation