Abstract
Optimal utilization of power is a major concern for HPC and one of the focus points on the path towards exascale; approaches range from chip-level to facility-wide solutions. To evaluate the implications of these approaches and their impact on future system design, we need to understand both their interaction with applications and their performance impact. In this work we describe the GREMLIN framework, a general framework for emulating system changes on existing platforms through resource restriction or event injection. We use this framework to understand the behavior of applications executed on power-limited systems and to evaluate a solution for one of the problems that arise when operating under a power limit: the translation of manufacturing variability into heterogeneous performance, as observed in power-limited HPC environments. We show that in a power-limited environment, manufacturing variability is a key source of performance imbalance and thus of non-optimal execution. We propose a Power Balancer that redistributes unused power and show performance gains of up to 1.5% at small to medium node counts.
Document type: | Conference contribution (Report) |
---|---|
Faculty: | Mathematics, Computer Science and Statistics > Computer Science |
Subjects: | 000 Computer science, information science, general works > 004 Computer science |
ISSN: | 2164-7062 |
Language: | English |
Document ID: | 47379 |
Date of publication on Open Access LMU: | 27 Apr 2018, 08:12 |
Last modified: | 13 Aug 2024, 12:54 |