tag :: hadoop mapreduce apache

bookmarks (7)

Apache Spark™ - Lightning-Fast Cluster Computing
https://spark.incubator.apache.org/
11 years ago by @nosebrain
tags :: apache cluster computing hadoop mapreduce spark

Welcome to Pig!
Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets. At the present time, Pig's infrastructure layer consists of a compiler that produces sequences of Map-Reduce programs, for which large-scale parallel implementations already exist. Pig's language layer currently consists of a textual language called Pig Latin, which has the following key properties:
* Ease of programming. It is trivial to achieve parallel execution of simple, "embarrassingly parallel" data analysis tasks.
* Optimization opportunities. The way in which tasks are encoded permits the system to optimize their execution automatically.
* Extensibility.
13 years ago by @draganigajic
tags :: apache datamining hadoop java mapreduce

Sqoop « Cloudera » Apache Hadoop for the Enterprise
Sqoop is a tool designed to import data from relational databases into Hadoop. Sqoop uses JDBC to connect to a database. It examines each table’s schema and automatically generates the necessary classes to import data into the Hadoop Distributed File System (HDFS). Sqoop then creates and launches a MapReduce job to read tables from the database via DBInputFormat, the JDBC-based InputFormat. Tables are read into a set of files in HDFS. Sqoop supports both SequenceFile and text-based targets and includes performance enhancements for loading data from MySQL.
15 years ago by @gresch
tags :: apache db dbms hadoop hdfs java mapreduce software sql

Apache Mahout - Overview
http://lucene.apache.org/mahout/
15 years ago by @dolefulrabbit
tags :: analytics apache clustering datamining distributed hadoop java library lucene machinelearning mapreduce recommendation scalable software

Welcome to Apache Hadoop!
The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing, including:
16 years ago by @carlfischer
tags :: apache cluster distributed grid hadoop java mapreduce opensource

Amazon Web Services Developer Community : Running Hadoop MapReduce on Amazon EC2 and Amazon S3
Apache's Hadoop project aims to solve these problems by providing a framework for running large data processing applications on clusters of commodity hardware. Combined with Amazon EC2 for running the application, and Amazon S3 for storing the data, we can run large jobs very economically. This paper describes how to use Amazon Web Services and Hadoop to run an ad hoc analysis on a large collection of web access logs that otherwise would have cost a prohibitive amount in either time or money.
16 years ago by @carlfischer
tags :: amazon apache cluster distributed ec2 hadoop mapreduce programming

HadoopMapReduce - Hadoop Wiki
Introduction: This document describes how Map and Reduce operations are carried out in Hadoop. If you are not familiar with the Google MapReduce programming model, you should get acquainted with it first.
16 years ago by @carlfischer
tags :: apache cluster google hadoop mapreduce programming


publications

No matching posts.


BibSonomy
