PERFORMANCE ANALYSIS OF MAPREDUCE WITH LARGE DATASETS USING HADOOP

V. Sri Divya, M. Tejaswini, Sk Parveen, B. Triveni

Abstract


Big data [1] refers to volumes of data too large to be managed by traditional data management systems. Hadoop [2] is a tool used to handle such big data: the Hadoop Distributed File System (HDFS) [4] is used for storing it, and MapReduce [3] for processing and retrieving it. With these techniques, even terabytes or petabytes of data can be stored and retrieved easily. This paper provides an introduction to Hadoop, HDFS, and MapReduce. We use large datasets to analyse the performance of the MapReduce technique, observing the number of bytes read and written while a MapReduce task runs on the given input. We analyse the behaviour of the MapReduce task as the amount of input is varied, along with the corresponding pattern in the number of bytes read and written.
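The MapReduce model the paper evaluates can be illustrated with the classic word-count job. Below is a minimal sketch of its three phases (map, shuffle/sort, reduce) in plain Python; the function names and inputs here are illustrative stand-ins, not Hadoop API calls, and a real Hadoop job would distribute these phases across HDFS blocks on many nodes:

```python
from itertools import groupby
from operator import itemgetter

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every input record."""
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle/sort: group intermediate pairs by key, as the framework
    does between the map and reduce phases."""
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, [count for _, count in group]

def reduce_phase(grouped):
    """Reduce: sum the per-word counts to produce the final totals."""
    return {word: sum(counts) for word, counts in grouped}

# Toy input standing in for the large datasets used in the experiments.
docs = ["hadoop stores big data", "mapreduce processes big data"]
counts = reduce_phase(shuffle(map_phase(docs)))
```

Varying the size of `docs` and measuring the bytes flowing through each phase mirrors, in miniature, the kind of input-scaling measurements reported in this paper.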






Copyright (c) 2017 Edupedia Publications Pvt Ltd

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

 

All published Articles are Open Access at  https://journals.pen2print.org/index.php/ijr/ 


Paper submission: ijr@pen2print.org