ANALYSIS OF BIG DATA PROCESSING BY DISTINCT USE OF HADOOP’S MAPREDUCE

I. Geervani, S. Kavya, K. Abdul Hannan, N. Venkatadri

Abstract


Data has become an indispensable part of every economy, industry, organization, business function, and individual, and datasets that grow beyond the size traditional databases can handle are termed Big Data. Even sufficiently large data warehouses are unable to satisfy such storage needs, so companies today use a tool called Hadoop. Hadoop is open-source software that supports parallel and distributed data processing: it stores large datasets reliably in HDFS, processes them with MapReduce, and provides fault tolerance through replication. In this paper, we introduce HDFS and MapReduce and survey the performance of processing sufficiently large datasets with the MapReduce technique. We propose that the performance of dataset processing can be optimized by leveraging MapReduce in different ways, and we analyse this by varying the approach used to process the datasets.
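For concreteness, below is a minimal sketch of a Hadoop MapReduce job in Java: the classic word count over files stored in HDFS. This is an illustrative example, not the workload studied in the paper; the class names (WordCount, TokenizerMapper, IntSumReducer) are ours. Setting a combiner, here the reducer class reused for local aggregation on each mapper node, is one example of leveraging MapReduce in a different way, since it shrinks the data shuffled between the map and reduce phases.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in its input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sums the counts for each word; also reusable as a combiner.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local aggregation before the shuffle
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Removing the setCombinerClass line yields the baseline variant of the same job, which is the kind of single-change comparison that varying the processing approach refers to.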






Copyright (c) 2017 Edupedia Publications Pvt Ltd

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

