Article

Improving Map Reduce Performance in Heterogeneous Distributed System using HDFS Environment: A Review

International Journal on Recent and Innovation Trends in Computing and Communication, 3 (3): 903--910 (March 2015)
DOI: 10.17762/ijritcc2321-8169.150301

Abstract

Hadoop is a Java-based programming framework that supports storing and processing big data in a distributed computing environment. It uses HDFS to store data and Map Reduce to process it. Map Reduce has become an important distributed processing model for large-scale, data-intensive applications such as data mining and web indexing, and it is widely used for short jobs that require low response times. The current Hadoop implementation assumes that the computing nodes in a cluster are homogeneous. Unfortunately, neither the homogeneity assumption nor the data locality assumption holds in virtualized data centers, so Hadoop's scheduler can cause severe performance degradation in heterogeneous environments. We observe that the Longest Approximate Time to End (LATE) scheduling algorithm is highly robust to heterogeneity. LATE can improve Hadoop response times by a factor of 2 in clusters.
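
The abstract names LATE without describing it. As a rough illustration only, the core idea usually associated with LATE is to estimate each running task's remaining time from its progress rate and to speculatively re-execute the task expected to finish last. The sketch below assumes that reading; the `Task` class and the function names are hypothetical, not Hadoop APIs, and it omits the thresholds and caps a real scheduler would apply.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    """Hypothetical record of a running task (not a Hadoop class)."""
    task_id: str
    progress: float  # progress score in [0, 1]
    elapsed: float   # seconds since the task started


def estimated_time_left(task: Task) -> float:
    """Estimate remaining seconds as (1 - progress) / progress_rate,
    where progress_rate = progress / elapsed."""
    rate = task.progress / task.elapsed
    return (1.0 - task.progress) / rate


def pick_speculation_candidate(tasks: List[Task]) -> Optional[Task]:
    """Return the unfinished task with the longest estimated time to end.

    Tasks with no measurable progress yet are skipped, because their
    rate (and therefore their remaining time) cannot be estimated.
    """
    candidates = [
        t for t in tasks
        if 0.0 < t.progress < 1.0 and t.elapsed > 0.0
    ]
    if not candidates:
        return None
    return max(candidates, key=estimated_time_left)


if __name__ == "__main__":
    running = [
        Task("map_0", progress=0.9, elapsed=90.0),   # fast node
        Task("map_1", progress=0.2, elapsed=100.0),  # slow node
        Task("map_2", progress=0.5, elapsed=50.0),
    ]
    slowest = pick_speculation_candidate(running)
    print(slowest.task_id if slowest else "nothing to speculate")
```

Ranking by estimated time to end, rather than by raw progress, is what lets a heuristic of this kind cope with nodes of different speeds: a task at 20% progress on a slow node is a better speculation target than one at 50% on a fast node.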
