Peregrine is a MapReduce framework designed for running iterative jobs across partitions of data. It aims to execute MapReduce jobs quickly by supporting a number of optimizations and features not present in other MapReduce frameworks.
MRQL (the Map-Reduce Query Language) is an SQL-like query language for map-reduce computations, implemented on top of Apache Hadoop. MRQL is powerful enough to express most common data analysis tasks over many different kinds of raw data, including hierarchical data and nested collections such as XML data. It is more expressive than other current languages, such as Hive and Pig Latin, because it can operate on more complex data and supports more powerful query constructs, which eliminates the need for explicit map-reduce code.
A list of group papers on MapReduce applications. Articles include: 'Nephele: Genotyping via Complete Composition Vectors and MapReduce' by Marc E Colosimo, Matthew W Peterson, Scott Mardis et al., 'Clustering Very Large Multi-dimensional Datasets with MapReduce' by Robson L F Cordeiro, Julio López, Christos Faloutsos, and 'Yahoo! Research Small World Experiment' by Yahoo!, Facebook.
D. Hiemstra and C. Hauff. Multilingual and Multimodal Information Access Evaluation, volume 6360 of Lecture Notes in Computer Science, pages 64--69. Berlin, Springer Verlag, (2010)
P. Pantel, E. Crestan, A. Borkovsky, A. Popescu, and V. Vyas. EMNLP '09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 938--947. Morristown, NJ, USA, Association for Computational Linguistics, (2009)
H.-C. Yang, A. Dasdan, R. Hsiao, and D. Parker. SIGMOD '07: Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, pages 1029--1040. New York, NY, USA, ACM, (2007)
F. Chierichetti, R. Kumar, and A. Tomkins. WWW '10: Proceedings of the 19th International Conference on World Wide Web, pages 231--240. New York, NY, USA, ACM, (2010)
P. Ravindra, V. Deshpande, and K. Anyanwu. MDAC '10: Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud, pages 1--6. New York, NY, USA, ACM, (2010)
T. Sandholm and K. Lai. SIGMETRICS '09: Proceedings of the Eleventh International Joint Conference on Measurement and Modeling of Computer Systems, pages 299--310. New York, NY, USA, ACM, (2009)
J. Lin. SIGIR '09: Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 155--162. New York, NY, USA, ACM, (2009)