Abstract
Big data presents new challenges to both cluster infrastructure software
and parallel application design. We present GrayWulf, a set of software
services and design principles for data-intensive computing with petabyte
data sets. These services are intended for deployment
on a cluster of commodity servers similar to the well-known Beowulf
clusters. We use the Pan-STARRS system, currently under development,
as an example of the architecture and principles in action.