AN EFFICIENT TRAFFIC-AWARE SEPARATION AND AGGREGATION USING MAPREDUCE FOR BIG DATA APPLICATIONS
Abstract
The MapReduce programming model simplifies the processing of large datasets. MapReduce is typically used for distributed computing on a cluster of computers, exploiting parallel map and reduce tasks. Although many efforts have been made to improve the performance of MapReduce jobs, they ignore the network traffic generated in the shuffle phase, which plays a critical role in performance enhancement. Traditionally, a hash function is used to separate intermediate data among reduce tasks; however, this is not traffic-efficient. In this paper, we study how to reduce the network traffic cost of a MapReduce job by designing a novel intermediate data separation scheme. A decomposition-based distributed algorithm is proposed to deal with the large-scale optimization problem for big data applications, and an online algorithm is also designed to adjust data separation and aggregation in a dynamic manner. Finally, extensive simulation results demonstrate that our proposals can significantly reduce network traffic cost.
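To make the contrast between hash-based and traffic-aware separation concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a hypothetical cost model in which moving data from a mapper to a reducer costs the data volume multiplied by a network distance. The names `hash_partition`, `greedy_partition`, `traffic_cost`, `volume`, and `distance` are illustrative assumptions. The greedy rule ignores reducer load balancing and aggregation, both of which a full scheme would have to handle.

```python
import zlib
from collections import defaultdict


def hash_partition(key, num_reducers):
    """Hash-style separation: the reducer depends only on the key's hash."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def traffic_cost(assignment, volume, distance):
    """Assumed cost model: sum of volume * distance over all (mapper, key) pairs."""
    return sum(
        size * distance[mapper][assignment[key]]
        for (mapper, key), size in volume.items()
    )


def greedy_partition(volume, distance, num_reducers):
    """Send each key to the reducer that minimizes that key's own traffic cost.

    Illustrative only: no load balancing, no in-network aggregation.
    """
    per_key = defaultdict(dict)
    for (mapper, key), size in volume.items():
        per_key[key][mapper] = size

    assignment = {}
    for k, sources in per_key.items():
        assignment[k] = min(
            range(num_reducers),
            key=lambda r: sum(size * distance[m][r] for m, size in sources.items()),
        )
    return assignment


# Hypothetical toy input: volume[(mapper, key)] is the intermediate data size,
# distance[mapper][reducer] is the network distance between the two nodes.
volume = {(0, "a"): 10, (1, "a"): 1, (0, "b"): 2, (1, "b"): 8}
distance = [[1, 3], [3, 1]]

hashed = {k: hash_partition(k, 2) for k in ("a", "b")}
greedy = greedy_partition(volume, distance, 2)

print("hash cost:  ", traffic_cost(hashed, volume, distance))
print("greedy cost:", traffic_cost(greedy, volume, distance))
```

Because the assumed cost is separable per key and no capacity limit is imposed, the greedy assignment can never cost more than the hash assignment on the same input. Adding reducer capacity constraints is what turns this into a genuine optimization problem.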
Copyright (c) 2017 Edupedia Publications Pvt Ltd
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
All published Articles are Open Access at https://journals.pen2print.org/index.php/ijr/
Paper submission: ijr@pen2print.org