Abstract
High-performance computing has recently seen a surge of
interest in heterogeneous systems, with an emphasis on modern
Graphics Processing Units (GPUs). These devices offer
tremendous potential for performance and efficiency in
important large-scale applications of computational
science. However, exploiting this potential can be
challenging, as one must adapt to the specialized and rapidly
evolving computing environment currently exhibited by GPUs.
One way of addressing this challenge is to embrace better
techniques and develop tools tailored to GPUs' needs. This
article presents one simple technique, GPU run-time code
generation (RTCG), along with PyCUDA and PyOpenCL, two
open-source toolkits that support this technique. In
introducing PyCUDA and PyOpenCL, this article proposes the
combination of a dynamic, high-level scripting language with
the massive performance of a GPU as a compelling two-tiered
computing platform, potentially offering significant
performance and productivity advantages over conventional
single-tier, static systems. The concept of RTCG is simple
and easily implemented using existing, robust
infrastructure. Nonetheless, it is powerful enough to support
(and encourage) the creation of custom application-specific
tools by its users. The premise of the article is illustrated by
a wide range of examples where the technique has been applied
with considerable success.
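
As a rough illustration of the RTCG idea described above, the following Python sketch builds CUDA C source text at run time and, where a GPU is available, hands it to PyCUDA for compilation. It is a minimal sketch under stated assumptions, not the article's own code: the kernel name `scale`, the template, and the helpers `generate_kernel_source` and `run_on_gpu` are hypothetical names chosen for this example. Only `run_on_gpu` needs PyCUDA and a CUDA-capable device; the source-generation step is plain Python.

```python
from string import Template

# Hypothetical kernel template. The element type and the scale factor are
# filled in at run time, before the source text reaches a compiler.
KERNEL_TEMPLATE = Template("""
__global__ void scale(${ctype} *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= (${ctype}) ${factor};
}
""")


def generate_kernel_source(ctype, factor):
    """Return CUDA C source specialized for one element type and factor."""
    return KERNEL_TEMPLATE.substitute(ctype=ctype, factor=repr(factor))


def run_on_gpu(host_array, factor):
    """Compile the generated source with PyCUDA and scale the array in place.

    Assumes PyCUDA is installed and a CUDA-capable GPU is present.
    """
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    # Choose the C type that matches the NumPy array's element type.
    ctype = {"float32": "float", "float64": "double"}[host_array.dtype.name]

    # Run-time code generation: build the source, then compile it.
    module = SourceModule(generate_kernel_source(ctype, factor))
    scale = module.get_function("scale")

    block = 256
    grid = (host_array.size + block - 1) // block
    scale(drv.InOut(host_array), np.int32(host_array.size),
          block=(block, 1, 1), grid=(grid, 1))
    return host_array


# The generation step alone runs anywhere, with or without a GPU.
source = generate_kernel_source("float", 2.0)
```

Because the kernel is ordinary text until it is compiled, the scripting layer can specialize it for the data at hand (here, the element type and a constant factor) instead of relying on one statically compiled, general-purpose kernel.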