The parallelized large-eddy simulation model PALM, which runs at the Zuse Institute Berlin, is used by researchers to simulate the urban climate with the goal of improving air quality in the German capital. For fine-grained resolutions of the application's compute domain, the memory footprint easily exceeds the main memory capacity of a typical cluster node. Using Optane memory, the number of cluster nodes required for a particular simulation can be reduced. While Optane's Memory Mode allows seamless use of the large capacity with only a small performance loss, the AppDirect Mode enables further optimizations. We show that partitioning the application's data between DRAM and Optane memory can speed up the essential multigrid solver part of the application by up to 8% compared to Memory Mode. We also note that the gained speedup is lost when data must be transferred between the two memory types. The use case reveals the challenge of partitioning data and optimizing data transfers between different memory types on heterogeneous memory systems.
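The trade-off between the solver speedup and the transfer cost can be made concrete with a back-of-the-envelope model. It is an illustration under stated assumptions, not a result from the study: the symbols $T_{\mathrm{MM}}$ (solver time in Memory Mode), $T_{\mathrm{AD}}$ (solver time with the DRAM/Optane partitioning in AppDirect Mode), and $T_{\mathrm{x}}$ (time spent moving data between the two memory types) are introduced here, and the 8% figure is read as the best-case ratio $T_{\mathrm{MM}} / T_{\mathrm{AD}} = 1.08$.

```latex
\begin{align}
S_{\text{net}} &= \frac{T_{\mathrm{MM}}}{T_{\mathrm{AD}} + T_{\mathrm{x}}}, \\
S_{\text{net}} > 1 &\iff T_{\mathrm{x}} < T_{\mathrm{MM}} - T_{\mathrm{AD}}
  = T_{\mathrm{MM}} \left( 1 - \frac{1}{1.08} \right)
  \approx 0.074 \, T_{\mathrm{MM}}.
\end{align}
```

Under these assumptions, transfers that cost more than roughly 7% of the Memory Mode solver time would cancel even the best-case gain.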
Steffen Christgau is an HPC consultant at the Supercomputing Department of the Zuse Institute Berlin (Germany). His current research interests are the efficient use of persistent memory for HPC applications and the optimization of such applications for new heterogeneous hardware platforms with established and new programming environments. He received his Ph.D. and M.Sc. degrees in computer science from the University of Potsdam, Germany. While working in the Operating Systems and Distributed Systems group, his research focused on designing and optimizing MPI implementations for an experimental, non-cache-coherent many-core processor. Before joining ZIB, he also worked as a lecturer for parallel computing and in industry on compiler implementations and robotic systems.