Abstract
Optimal utilization of power is a major concern for HPC and one of the focus points on the path towards exascale; approaches range from chip-level to facility-wide solutions. To evaluate the implications of these approaches and their impact on future system design, we need to understand how they interact with applications and how they affect performance. In this work we describe the GREMLIN framework, a general framework for emulating system changes on existing platforms through resource restriction or event injection. We use this framework to understand the behavior of applications executed on power-limited systems and to evaluate a solution to one of the problems that arise from operating under a power limit: the translation of manufacturing variability into heterogeneous performance, as observed in power-limited HPC environments. We show that in a power-limited environment manufacturing variability is a key source of performance imbalance and thus of non-optimal execution. We propose a Power Balancer that redistributes unused power and show performance gains of up to 1.5% at small to medium node counts.
Item Type: | Conference or Workshop Item (Report) |
---|---|
Faculties: | Mathematics, Computer Science and Statistics > Computer Science |
Subjects: | 000 Computer science, information and general works > 004 Data processing computer science |
ISSN: | 2164-7062 |
Language: | English |
Item ID: | 47379 |
Date Deposited: | 27. Apr 2018, 08:12 |
Last Modified: | 13. Aug 2024, 12:54 |