GrayWulf: Scalable Software Architecture for Data Intensive Computing

Abstract

Big data presents new challenges to both cluster infrastructure software and parallel application design. We present GrayWulf, a set of software services and design principles for data intensive computing with petabyte data sets. These services are intended for deployment on a cluster of commodity servers similar to the well-known Beowulf clusters. We use the Pan-STARRS system, currently under development, as an example of the architecture and principles in action.

BibTeX key: Simmhan:msrtr:2008
entry type: techreport
year: 2008
month: September
institution: Microsoft Research
number: MSR-TR-2008-186
owner: Simmhan
url: http://research.microsoft.com/apps/pubs/default.aspx?id=79430
note: Extended version of HICSS 2009
