U. Bawidamann and M. Nehmeier. Expression Templates and OpenCL. In Parallel Processing and Applied Mathematics, volume 7204 of Lecture Notes in Computer Science, pages 71-80. Springer Berlin / Heidelberg, 2012. doi:10.1007/978-3-642-31500-8_8.
Abstract
In this paper we discuss the interaction of expression templates with OpenCL devices. We show how the expression tree of expression templates can be used to generate problem specific OpenCL kernels. In a second approach we use expression templates to optimize the data transfer between the host and the device which leads to a measurable performance increase in a domain specific language approach. We tested the functionality, correctness and performance for both implementations in a case study for vector and matrix operations.
%0 Book Section
%1 nehmeierPPAM2011
%A Bawidamann, Uwe
%A Nehmeier, Marco
%B Parallel Processing and Applied Mathematics
%D 2012
%E Wyrzykowski, Roman
%E Dongarra, Jack
%E Karczewski, Konrad
%E Wasniewski, Jerzy
%I Springer Berlin / Heidelberg
%K bookPub info2 rari
%P 71-80
%T Expression Templates and OpenCL
%U http://dx.doi.org/10.1007/978-3-642-31500-8_8
%V 7204
%X In this paper we discuss the interaction of expression templates with OpenCL devices. We show how the expression tree of expression templates can be used to generate problem specific OpenCL kernels. In a second approach we use expression templates to optimize the data transfer between the host and the device which leads to a measurable performance increase in a domain specific language approach. We tested the functionality, correctness and performance for both implementations in a case study for vector and matrix operations.
%@ 978-3-642-31499-5
@incollection{nehmeierPPAM2011,
abstract = {In this paper we discuss the interaction of expression templates with OpenCL devices. We show how the expression tree of expression templates can be used to generate problem specific OpenCL kernels. In a second approach we use expression templates to optimize the data transfer between the host and the device which leads to a measurable performance increase in a domain specific language approach. We tested the functionality, correctness and performance for both implementations in a case study for vector and matrix operations.},
added-at = {2012-07-06T10:19:04.000+0200},
affiliation = {Institute of Computer Science, University of Würzburg Am Hubland, D 97074 Würzburg, Germany},
author = {Bawidamann, Uwe and Nehmeier, Marco},
biburl = {https://www.bibsonomy.org/bibtex/204e4bc1c615f1507f9761f0a4528e678/nehmeier},
booktitle = {Parallel Processing and Applied Mathematics},
editor = {Wyrzykowski, Roman and Dongarra, Jack and Karczewski, Konrad and Wasniewski, Jerzy},
interhash = {474af7f8b57ea26d3349b54298d454df},
intrahash = {04e4bc1c615f1507f9761f0a4528e678},
isbn = {978-3-642-31499-5},
keyword = {Computer Science},
keywords = {bookPub info2 rari},
doi = {10.1007/978-3-642-31500-8_8},
pages = {71--80},
publisher = {Springer Berlin / Heidelberg},
series = {Lecture Notes in Computer Science},
timestamp = {2014-03-24T14:44:13.000+0100},
title = {Expression Templates and OpenCL},
url = {http://dx.doi.org/10.1007/978-3-642-31500-8_8},
volume = 7204,
year = 2012
}