The main use cases for Spark are iterative machine learning algorithms and interactive analytics. From the ML side: Most ML algorithms ru...
9781449327279 - Hadoop Operations - If you’ve been tasked with the job of maintaining large and complex Hadoop clusters, or are about to be, this book is a must. You’ll learn the particulars of Hadoop operations, from planning, installing, and configuring the system to providing ongoing maintenance.
Apache Sqoop(TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.
Event-Detection - DBSCAN algorithm expressed as MapReduce jobs, implemented with Hadoop and MongoDB, to analyze tweets and photos and create geolocated events
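
For readers unfamiliar with DBSCAN, the density-based clustering named in the last entry can be sketched on a single machine as follows. This is a minimal illustration, not the repository's MapReduce implementation: the function names `region_query` and `dbscan`, the use of planar Euclidean distance, and the sample coordinates are all assumptions made for the example.

```python
from math import hypot

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i."""
    xi, yi = points[i]
    return [j for j, (x, y) in enumerate(points)
            if hypot(x - xi, y - yi) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical coordinates: two dense groups and one isolated point.
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(sample, eps=1.5, min_pts=3))
```

Each dense group receives its own label and the isolated point is marked as noise. A distributed version would need to partition the neighborhood queries and merge clusters across partitions, which this sketch does not attempt.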