The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing, including:
* Hadoop Core, our flagship sub-project, provides a distributed filesystem (HDFS) and support for the MapReduce distributed computing framework.
* HBase builds on Hadoop Core to provide a scalable, distributed database.
* Pig is a high-level data-flow language and execution framework for parallel computation. It is built on top of Hadoop Core.
* ZooKeeper is a highly available and reliable coordination system. Distributed applications use ZooKeeper to store and mediate updates for critical shared state.
* Hive is a data warehouse infrastructure built on Hadoop Core that provides data summarization, ad hoc querying, and analysis of datasets.
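
To make the MapReduce model mentioned above more concrete, here is a minimal in-memory sketch of its three stages (map, shuffle, reduce) applied to word counting. It does not use Hadoop or any Hadoop API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical and chosen only for illustration. In a real cluster the framework distributes these stages across machines, which this single-process example does not attempt.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Emit a (word, 1) pair for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group all values by key, standing in for the framework's shuffle step."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Collapse the grouped values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence within a single process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["hadoop stores data", "hadoop processes data"]))
```

Because each call to `map_phase` depends only on its own input line, and each call to `reduce_phase` only on one key's values, both stages can in principle run in parallel; that independence is what allows a framework to scale the same logic across many nodes.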