Kowalewski, Roger; Jungblut, Pascal; Fuerlinger, Karl (2019): Engineering a Distributed Histogram Sort. In: 2019 IEEE International Conference on Cluster Computing (CLUSTER): pp. 81-91
Full text not available from 'Open Access LMU'.


Sorting is one of the most critical non-numerical algorithms and covers use cases in a wide spectrum of scientific applications. Although we can build upon excellent research over the last decades, scaling to thousands of processing units on modern many-core architectures reveals a gap between theory and practice. We adopt ideas of the well-known quickselect and sample sort algorithms to minimize data movement. Our evaluation demonstrates that we can keep up with recently proposed distribution sort algorithms in large-scale experiments, without any assumptions on the input keys. Additionally, our implementation outperforms an efficient multi-threaded merge sort on a single node. Our implementation is based on a C++ PGAS approach with an STL-like interface and can easily be integrated into many application codes. As part of the presented experiments, we further reveal challenges with multi-threaded MPI and one-sided communication.
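To make the splitter-selection idea behind histogram-style distribution sorts concrete, the following is a minimal single-process sketch. It is not the authors' implementation. Each inner vector stands in for the locally sorted keys of one processing unit, and the names `Partitions`, `count_not_greater`, `find_splitter`, and `find_splitters` are hypothetical. The sketch assumes 64-bit integer keys and finds splitters by bisecting the key range against a global count, so only counts would need to be exchanged before any keys move. It does not break ties among equal keys, so heavily duplicated inputs can still produce unbalanced buckets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Assumption: each inner vector is the already sorted local data of one unit.
using Partitions = std::vector<std::vector<std::int64_t>>;

// Global histogram query: number of keys <= candidate across all partitions.
// In a distributed run this would be local counts followed by a reduction.
std::size_t count_not_greater(const Partitions& parts, std::int64_t candidate) {
    std::size_t total = 0;
    for (const auto& p : parts) {
        total += static_cast<std::size_t>(
            std::upper_bound(p.begin(), p.end(), candidate) - p.begin());
    }
    return total;
}

// Smallest key value v such that at least target_rank keys are <= v.
std::int64_t find_splitter(const Partitions& parts, std::size_t target_rank) {
    std::int64_t lo = std::numeric_limits<std::int64_t>::max();
    std::int64_t hi = std::numeric_limits<std::int64_t>::min();
    for (const auto& p : parts) {
        if (p.empty()) continue;
        lo = std::min(lo, p.front());
        hi = std::max(hi, p.back());
    }
    assert(lo <= hi);  // at least one partition must hold data
    assert(target_rank >= 1 && target_rank <= count_not_greater(parts, hi));

    // Bisect the key range; each step costs one global count.
    while (lo < hi) {
        // Midpoint via unsigned arithmetic to avoid signed overflow.
        const std::uint64_t span =
            static_cast<std::uint64_t>(hi) - static_cast<std::uint64_t>(lo);
        const std::int64_t mid = lo + static_cast<std::int64_t>(span / 2);
        if (count_not_greater(parts, mid) >= target_rank) {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
    return lo;
}

// num_buckets - 1 splitters that cut the global key set into near-equal parts.
std::vector<std::int64_t> find_splitters(const Partitions& parts,
                                         std::size_t num_buckets) {
    std::size_t n = 0;
    for (const auto& p : parts) n += p.size();
    assert(num_buckets >= 1 && n >= num_buckets);

    std::vector<std::int64_t> splitters;
    for (std::size_t k = 1; k < num_buckets; ++k) {
        splitters.push_back(find_splitter(parts, k * n / num_buckets));
    }
    return splitters;
}
```

Calling `find_splitters(parts, parts.size())` yields one bucket per unit; each unit could then locate its bucket boundaries with `std::upper_bound` and send every key at most once to its final owner.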