Abstract
Increasingly, multi-core processors, multi-processor
nodes and multi-core, multi-processor nodes are finding
their way into compute clusters. Clusters built using such
nodes are already quite common and, inevitably, will become
more so over time. As with any new technology, however,
the potential benefits are seldom as easy to attain as we
expect them to be. In this paper, we explore three fundamental
issues related to the use of multi-core, multi-processor
nodes in compute clusters using MPI: inter-communication
(messaging) efficiency, cache effects (in particular processor
affinity) and initial process distribution. Based on some
initial experiments using a subset of the NAS Parallel Benchmarks
running on a small-scale cluster with dual-core, dual-processor
nodes, we report results on the impact of these
issues. From these results we attempt to extrapolate some
simple guidelines that are likely to be generally applicable
for optimizing MPI code running on clusters with multi-core,
multi-processor nodes.