A Study Of Traffic Aware Partition And Aggregation In MapReduce For Big Data Applications

Lade Praveenkumar, Gugulothu Praveen

Abstract


The MapReduce programming framework processes large amounts of data by exploiting parallel Map and Reduce tasks. Conceptually, MapReduce has two phases, Map and Reduce. In practice, implementations include a third phase, Shuffle, in which intermediate data is transferred from map tasks to reduce tasks. Conventionally, the Shuffle phase partitions data with a hash function, which handles traffic inefficiently and can create a network bottleneck. Reducing network traffic in the Shuffle phase is therefore important for improving MapReduce performance. This study pursues that goal through traffic-aware data partition and aggregation, with a scheme designed to minimize the network traffic cost of MapReduce jobs. The aggregator placement problem is also considered, in which each aggregator can reduce the combined traffic from multiple map tasks. A decomposition-based distributed algorithm is proposed to handle the large-scale optimization problem for big data applications, and an online algorithm is designed to adjust data partition and aggregation dynamically.
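To make the effect of aggregation on Shuffle traffic concrete, the following is a minimal toy sketch in Python. It is not the decomposition-based or online algorithm of the paper. It only counts how many key-value records cross the network under plain hash partitioning, with and without merging same-key records at each map task before sending. The function names (`hash_partition`, `shuffle_records`), the sample data, and the use of record count as the traffic measure are all assumptions made for illustration.

```python
import zlib
from collections import Counter


def hash_partition(key: str, num_reducers: int) -> int:
    """Conventional hash partitioning: choose a reducer from the key alone.

    crc32 is used instead of the built-in hash() so that the assignment
    is reproducible across runs (an illustrative choice).
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle_records(map_outputs, num_reducers, aggregate):
    """Return how many records each reducer receives in the Shuffle phase.

    map_outputs: one list of (key, value) pairs per map task.
    aggregate:   if True, each map task first merges records that share
                 a key, so at most one record per key leaves that task.
    """
    received = [0] * num_reducers
    for records in map_outputs:
        if aggregate:
            combined = Counter()
            for key, value in records:
                combined[key] += value
            records = list(combined.items())
        for key, _value in records:
            received[hash_partition(key, num_reducers)] += 1
    return received


# Hypothetical output of two map tasks in a word-count style job.
map_outputs = [
    [("a", 1), ("b", 1), ("a", 1), ("a", 1)],
    [("b", 1), ("b", 1), ("c", 1)],
]

plain = shuffle_records(map_outputs, num_reducers=2, aggregate=False)
combined = shuffle_records(map_outputs, num_reducers=2, aggregate=True)

# Total records sent over the network without and with aggregation.
print(sum(plain), sum(combined))  # prints "7 4"
```

In this sketch aggregation cuts the shuffled records from 7 to 4, but the hash function still ignores where map and reduce tasks sit in the network. A traffic-aware scheme of the kind the abstract describes would additionally weigh the transfer cost between nodes when choosing partitions and aggregator locations.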



Copyright (c) 2017 Edupedia Publications Pvt Ltd

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

 

All published Articles are Open Access at  https://journals.pen2print.org/index.php/ijr/ 

