Abstract

Recent developments have shown that classic database query execution techniques, such as the iterator model, are no longer optimal for leveraging the features of modern hardware architectures. This is especially true for massively parallel architectures, such as many-core processors and GPUs. Here, processing a single tuple per step does not provide enough work to utilize the hardware resources and the cache efficiently, nor to justify the overhead introduced by iterators. To overcome these problems, we use just-in-time compilation to execute whole OLAP queries on the GPU, minimizing the overhead for transfer and synchronization. We describe several patterns that can be used to build efficient execution plans and achieve the necessary parallelism. Furthermore, we show that similar processing models (and even the same source code) can be used on GPUs and modern CPU architectures, but we also point out some differences and limitations for query execution on GPUs. Results from our experimental evaluation using a TPC-H subset show that, using these patterns, we can achieve a speedup of up to a factor of 5 compared to a CPU implementation.
