Scalability Evaluation of Barrier Algorithms for OpenMP

Evolving OpenMP in an Age of Extreme Parallelism, volume 5568 of Lecture Notes in Computer Science, pages 42--52. Springer, (June 2009)

Abstract

OpenMP relies heavily on barrier synchronization to coordinate the work of threads that are performing the computations in a parallel region. A good implementation of barriers is thus an important part of any implementation of this API. As the number of cores in shared and distributed shared memory machines continues to grow, the quality of the barrier implementation is critical for application scalability. There are a number of known algorithms for providing barriers in software. In this paper, we consider some of the most widely used approaches for implementing barriers on large-scale shared-memory multiprocessor systems: a "blocking" implementation that de-schedules a waiting thread, a "centralized" busy-wait implementation, and three forms of distributed busy-wait implementations. We have implemented the barrier algorithms in the runtime library associated with a research compiler, OpenUH. We first compare the impact of these algorithms on the overheads incurred for OpenMP constructs that involve a barrier, possibly implicitly. We then show how the different barrier implementations influence the performance of two different OpenMP application codes.
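
To make the "centralized" busy-wait approach named in the abstract concrete, the following is a minimal sketch of a sense-reversing centralized barrier. It is not the OpenUH runtime's implementation; the names `CentralizedBarrier` and `run_demo` are hypothetical, Python is used only for illustration, and a lock stands in for the atomic fetch-and-decrement a native runtime would likely use.

```python
import threading
import time


class CentralizedBarrier:
    """Sense-reversing centralized barrier (illustrative sketch only)."""

    def __init__(self, num_threads):
        self.num_threads = num_threads
        self.count = num_threads        # threads still expected in this phase
        self.sense = False              # shared flag, flipped once per phase
        self.lock = threading.Lock()    # assumption: stands in for an atomic decrement
        self.local = threading.local()  # holds each thread's private sense

    def wait(self):
        # Each thread flips its private sense on entry to the barrier.
        my_sense = not getattr(self.local, "sense", False)
        self.local.sense = my_sense
        with self.lock:
            self.count -= 1
            last = self.count == 0
            if last:
                # The last arriver resets the counter, then releases the others.
                self.count = self.num_threads
                self.sense = my_sense
        if not last:
            # Busy-wait on the single shared flag (the "centralized" part).
            while self.sense != my_sense:
                time.sleep(0)  # yield so other Python threads can make progress


def run_demo(num_threads=4, phases=3):
    """Return one boolean per (thread, phase): had all threads arrived?"""
    barrier = CentralizedBarrier(num_threads)
    arrived = [0] * phases
    checks = []
    lock = threading.Lock()

    def worker():
        for phase in range(phases):
            with lock:
                arrived[phase] += 1
            barrier.wait()
            # Past the barrier, every thread must have registered for this phase.
            with lock:
                checks.append(arrived[phase] == num_threads)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return checks
```

Because every waiting thread spins on the same shared flag, contention on that one location grows with the thread count, which is the usual motivation for the distributed busy-wait variants the abstract mentions.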
