Logo Logo
Hilfe
Hilfe
Switch Language to English

Perdacher, Martin; Plant, Claudia und Boehm, Christian (2019): Cache-oblivious High-performance Similarity Join. In: Sigmod '19: Proceedings of the 2019 International Conference on Management of Data: S. 87-104

Volltext auf 'Open Access LMU' nicht verfügbar.

Abstract

A similarity join combines vectors based on a distance condition. Typically, such algorithms apply a filter step (by indexing or sorting) and then refine pairs of candidate vectors. In this paper, we propose to refine the pairs in an order defined by a space-filling curve which dramatically improves data locality. Modern multi-core processors are supported by a deep memory hierarchy including RAM, various levels of cache, and registers. The space-filling curve makes our proposed algorithm cache-oblivious to fully exploit this memory hierarchy. Our novel Fast General Form (FGF) Hilbert-curve solves a number of limitations of well-known space-filling curves: it is non-recursive, not restricted to traverse squares, and has a constant time and space complexity per loop iteration. As we demonstrate the easy transformation from conventional into cache-oblivious loops we believe that many complex database operators can be transformed systematically into cache-oblivious parallel algorithms.

Dokument bearbeiten Dokument bearbeiten