A Traffic Minimization Approach for Big Data in Map Reduce Job by Intermediate Data Partition Technique

K. Mounika, J. Rajashekar

Abstract


MapReduce is a programming model and an associated implementation for processing and generating large datasets, and it is amenable to a broad variety of real-world tasks. Scheduling map tasks to improve data locality is crucial to the performance of MapReduce, and many works have been devoted to increasing data locality for better efficiency. However, to the best of our knowledge, the fundamental limits of MapReduce computing clusters with data locality, including the capacity region and theoretical bounds on delay performance, have not been studied. We propose a traffic-aware partition and aggregation approach that reduces the network cost of MapReduce jobs by designing an intermediate data partition scheme. We also jointly consider the aggregator placement problem, where each aggregator can reduce the merged traffic from multiple map tasks. A decomposition-based distributed algorithm is proposed to handle the large-scale optimization problem for big data applications, and an online algorithm is designed to adjust data partition and aggregation dynamically.
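
The idea of a traffic-aware intermediate data partition can be illustrated with a small sketch. This is not the algorithm from the paper, which the abstract does not specify; it is a simple greedy heuristic written under stated assumptions. The names `volumes`, `distance`, `hash_partition`, `traffic_aware_partition`, and `traffic_cost` are hypothetical, the network cost is assumed to be data size times a mapper-to-reducer distance, and reducer load balancing and aggregator placement are ignored.

```python
from collections import defaultdict


def hash_partition(keys, num_reducers):
    """Baseline: assign each key to a reducer by a deterministic hash."""
    return {k: sum(map(ord, k)) % num_reducers for k in keys}


def traffic_cost(volumes, distance, assignment):
    """Total network cost, assumed to be size * distance(mapper, reducer).

    volumes[(mapper, key)] is the intermediate data size a mapper emits for a key.
    distance[mapper][reducer] is an assumed per-unit transfer cost.
    """
    return sum(
        size * distance[mapper][assignment[key]]
        for (mapper, key), size in volumes.items()
    )


def traffic_aware_partition(volumes, distance, num_reducers):
    """Greedy sketch: send each key to the reducer that is cheapest for it.

    Assumption: keys are placed independently, with no reducer capacity limit.
    """
    # Group the emitted sizes by key, then by mapper.
    per_key = defaultdict(dict)
    for (mapper, key), size in volumes.items():
        per_key[key][mapper] = per_key[key].get(mapper, 0) + size

    assignment = {}
    for key, by_mapper in per_key.items():
        # Pick the reducer with the lowest total transfer cost for this key.
        assignment[key] = min(
            range(num_reducers),
            key=lambda r: sum(s * distance[m][r] for m, s in by_mapper.items()),
        )
    return assignment
```

With two mappers that are each co-located with one reducer, this heuristic places a key on the reducer nearest to the mapper that emits most of that key's data, whereas the hash baseline ignores where the data originates. A realistic scheme would also need to bound reducer load, which this sketch leaves out.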






Copyright (c) 2018 Edupedia Publications Pvt Ltd

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

 

All published Articles are Open Access at  https://journals.pen2print.org/index.php/ijr/ 

