Abstract
A similarity join combines vectors based on a distance condition. Typically, such algorithms apply a filter step (by indexing or sorting) and then refine pairs of candidate vectors. In this paper, we propose to refine the pairs in an order defined by a space-filling curve which dramatically improves data locality. Modern multi-core processors are supported by a deep memory hierarchy including RAM, various levels of cache, and registers. The space-filling curve makes our proposed algorithm cache-oblivious to fully exploit this memory hierarchy. Our novel Fast General Form (FGF) Hilbert-curve solves a number of limitations of well-known space-filling curves: it is non-recursive, not restricted to traverse squares, and has a constant time and space complexity per loop iteration. As we demonstrate the easy transformation from conventional into cache-oblivious loops we believe that many complex database operators can be transformed systematically into cache-oblivious parallel algorithms.
Item Type: | Journal article |
---|---|
Faculties: | Mathematics, Computer Science and Statistics > Computer Science |
Subjects: | 000 Computer science, information and general works > 004 Data processing computer science |
ISSN: | 0730-8078 |
Language: | English |
Item ID: | 82286 |
Date Deposited: | 15. Dec 2021 15:01 |
Last Modified: | 15. Dec 2021 15:01 |