Abstract
Modern multicore architectures have become popular because of the limitations of deep pipelines
and because of heat and power concerns. Some of these multicore architectures, such as the Intel Xeon, have
the ability to run several threads on a single core. The OpenMP standard for compiler-directive-based
shared-memory programming gives the developer an easy path to writing multithreaded programs and
is a natural fit for multicore architectures. The OpenMP standard uses loop parallelism as a basis for
work division among multiple threads. These loops usually use arrays in their computation with different
data distributions and access patterns. The performance of accesses to these arrays may depend
on the underlying page size, according to the frequency and strides of these accesses. In this paper, we
discuss the issues and potential benefits of using large pages for OpenMP applications. We design
an OpenMP implementation capable of using large pages and evaluate the impact of using large page
support available in most modern processors on the performance and scalability of parallel OpenMP
applications. Results show a performance improvement of up to 25% for some applications. Large pages also
help improve the scalability of these applications.
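
The dependence of array access cost on page size and stride can be illustrated with a small back-of-the-envelope sketch. It is not taken from the paper: the helper `pages_touched` is hypothetical, and the 4 KiB and 2 MiB page sizes, the 8-byte element size, and the array length are assumptions chosen as typical values. It counts how many distinct pages (and so, roughly, how many address translations) a strided sweep over a contiguous, page-aligned array would touch.

```python
def pages_touched(n_elements, elem_size, stride, page_size):
    """Count the distinct pages touched when reading every `stride`-th
    element of a contiguous array assumed to start on a page boundary."""
    pages = set()
    for i in range(0, n_elements, stride):
        pages.add((i * elem_size) // page_size)
    return len(pages)


# Assumed page sizes (typical values, not stated in the abstract).
SMALL_PAGE = 4 * 1024          # 4 KiB
LARGE_PAGE = 2 * 1024 * 1024   # 2 MiB

if __name__ == "__main__":
    n = 1 << 20  # assumed array of 2**20 eight-byte elements (8 MiB)
    for stride in (1, 512, 4096):
        small = pages_touched(n, 8, stride, SMALL_PAGE)
        large = pages_touched(n, 8, stride, LARGE_PAGE)
        print(f"stride={stride}: small pages={small}, large pages={large}")
```

Under these assumptions, a stride of 512 elements lands on a different small page at every access, while the whole array still fits in a handful of large pages. This is the kind of access pattern for which a larger page size could plausibly reduce translation overhead.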