Disco is an open-source implementation of the Map-Reduce framework for distributed computing. Like the original framework, Disco supports parallel computations over large data sets on unreliable clusters of computers.
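The map-reduce model that Disco implements can be illustrated with a minimal single-process sketch in plain Python. It does not use Disco's API; the names `map_words`, `reduce_counts`, and `run_job` are hypothetical, and a real framework would distribute the map and reduce phases across cluster nodes and re-run tasks that fail.

```python
from collections import defaultdict


def map_words(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce phase: sum all counts emitted for a single word."""
    return word, sum(counts)


def run_job(lines, mapper, reducer):
    """Run mapper and reducer locally (hypothetical stand-in for a cluster)."""
    # Shuffle step: group every value emitted by the mapper under its key.
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    # Apply the reducer once per distinct key.
    return dict(reducer(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(run_job(["a rose is a rose", "is it"], map_words, reduce_counts))
```

Because each call to the mapper sees only one line and each call to the reducer sees only one key, both phases can in principle run in parallel on separate machines.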
Sqoop is a tool designed to import data from relational databases into Hadoop. Sqoop uses JDBC to connect to a database. It examines each table’s schema and automatically generates the classes needed to import the data into the Hadoop Distributed File System (HDFS). Sqoop then creates and launches a MapReduce job that reads tables from the database via DBInputFormat, the JDBC-based InputFormat. Tables are read into a set of files in HDFS. Sqoop supports both SequenceFile and text-based targets and includes performance enhancements for loading data from MySQL.
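The import steps described above (inspect the schema, generate a record class, write rows out as text) can be sketched in a few lines. This is a loose analogy, not Sqoop's implementation: it uses Python's built-in `sqlite3` in place of JDBC, builds the record class at run time with `dataclasses.make_dataclass` rather than generating Java source, and returns comma-separated lines instead of writing files to HDFS. The `employees` table and both function names are hypothetical.

```python
import sqlite3
from dataclasses import astuple, make_dataclass


def generate_record_class(conn, table):
    """Inspect the table's schema and build a matching record class."""
    # PRAGMA table_info returns one row per column: (cid, name, type, ...).
    # The table name is interpolated directly, so pass trusted names only.
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    field_names = [col[1] for col in columns]
    return make_dataclass(table.capitalize(), field_names)


def import_table(conn, table):
    """Read every row into a record object and render it as a text line."""
    record_cls = generate_record_class(conn, table)
    rows = conn.execute(f"SELECT * FROM {table}")
    records = [record_cls(*row) for row in rows]
    return [",".join(str(value) for value in astuple(rec)) for rec in records]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?)", [(1, "Ada"), (2, "Lin")]
    )
    for line in import_table(conn, "employees"):
        print(line)
```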
MRQL (the Map-Reduce Query Language) is an SQL-like query language for map-reduce computations. It is implemented on top of Apache Hadoop. MRQL is powerful enough to express most common data analysis tasks over many different kinds of raw data, including hierarchical data and nested collections such as XML. It is more powerful than other current languages, such as Hive and Pig Latin, because it can operate on more complex data and supports more powerful query constructs, which eliminates the need for explicit map-reduce code.
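To make concrete what a declarative query language saves, the sketch below shows the hand-written map and reduce functions that a single group-by aggregation stands in for. The query in the comment is ordinary SQL shown for comparison, not MRQL syntax, and the record layout and function names are hypothetical.

```python
from collections import defaultdict

# Declarative form (plain SQL, shown only for comparison):
#   SELECT dept, AVG(salary) FROM employees GROUP BY dept


def map_record(record):
    """Map: key each record by its grouping column."""
    yield record["dept"], record["salary"]


def reduce_group(dept, salaries):
    """Reduce: aggregate all values that share one key."""
    return dept, sum(salaries) / len(salaries)


def average_salary_by_dept(records):
    """Explicit map-reduce equivalent of the query above, run locally."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_record(record):
            groups[key].append(value)
    return dict(reduce_group(key, values) for key, values in groups.items())
```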
C. Bellettini, M. Camilli, L. Capra, and M. Monga. Reachability Problems, Volume 8169 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, (2013)
K. Rohloff and R. Schantz. Proceedings of the fourth international workshop on Data-intensive distributed computing, pp. 35--44. New York, NY, USA, ACM, (2011)
A. Ghoting, P. Kambadur, E. Pednault, and R. Kannan. Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, August 21-24, 2011, pp. 334-342. (2011)
T. Sandholm and K. Lai. SIGMETRICS '09: Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems, pp. 299--310. New York, NY, USA, ACM, (2009)
F. Chierichetti, R. Kumar, and A. Tomkins. WWW '10: Proceedings of the 19th international conference on World wide web, pp. 231--240. New York, NY, USA, ACM, (2010)
J. Urbani, S. Kotoulas, E. Oren, and F. van Harmelen. International Semantic Web Conference, Volume 5823 of Lecture Notes in Computer Science, pp. 634-649. Springer, (2009)
H. chih Yang, A. Dasdan, R. Hsiao, and D. Parker. SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data, pp. 1029--1040. New York, NY, USA, ACM, (2007)
P. Pantel, E. Crestan, A. Borkovsky, A. Popescu, and V. Vyas. EMNLP '09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pp. 938--947. Morristown, NJ, USA, Association for Computational Linguistics, (2009)
J. Lin. SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pp. 155--162. New York, NY, USA, ACM, (2009)
P. Ravindra, V. Deshpande, and K. Anyanwu. MDAC '10: Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud, pp. 1--6. New York, NY, USA, ACM, (2010)
G. Sadasivam and G. Baktavatchalam. MDAC '10: Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud, pp. 1--7. New York, NY, USA, ACM, (2010)
R. Cordeiro, C. Jr., A. Traina, J. López, U. Kang, and C. Faloutsos. Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, August 21-24, 2011, pp. 690-698. ACM, (2011)
M. Becker, H. Mewes, A. Hotho, D. Dimitrov, F. Lemmerich, and M. Strohmaier. International Conference Companion on World Wide Web, pp. 17--18. Republic and Canton of Geneva, Switzerland, International World Wide Web Conferences Steering Committee, (2016)
J. Dean and S. Ghemawat. OSDI'04: Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation, pp. 10--10. Berkeley, CA, USA, USENIX Association, (2004)