A Novel Approach for Progressive of Duplicate Detection

Konga Santhoshkumar, M. Omprakash

Abstract


The presence of duplicate records is a major data quality concern in large databases. To detect duplicates, entity resolution, also known as duplicate detection or record linkage, is used as part of the data cleansing process to identify records that potentially refer to the same real-world entity. To identify duplicates with less execution time and without degrading the quality of the dataset, methods such as progressive blocking and the progressive sorted neighborhood method are used. The progressive sorted neighborhood method (PSNM) is used in this model to detect duplicates in a parallel manner. The progressive blocking algorithm targets large datasets, where finding duplicates requires a great deal of time. Together, these algorithms are used to improve the duplicate detection process. Using them, efficiency can be doubled compared with traditional duplicate detection.
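
The idea behind the progressive sorted neighborhood method can be illustrated with a minimal sketch. It is not the authors' implementation: it runs sequentially rather than in parallel, and the sort key, the `similar` function, its 0.85 threshold, and the `max_window` parameter are all assumptions chosen for illustration. Records are sorted by a key, and pairs are then compared at increasing rank distance, so the most likely duplicates (direct neighbors in the sort order) are reported first.

```python
from difflib import SequenceMatcher


def progressive_sorted_neighborhood(records, key, is_duplicate, max_window=4):
    """Yield likely duplicate pairs, closest sort-order neighbors first."""
    # Sort record indices by the (assumed) sorting key.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    # Rank distance 1 compares direct neighbors, distance 2 the next ring,
    # and so on up to max_window - 1.
    for distance in range(1, max_window):
        for pos in range(len(order) - distance):
            a, b = order[pos], order[pos + distance]
            if is_duplicate(records[a], records[b]):
                # Report each pair as original indices, smaller index first.
                yield (min(a, b), max(a, b))


def similar(a, b, threshold=0.85):
    # Hypothetical similarity test: normalized string similarity.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


records = ["John Smith", "Alice Brown", "Jon Smith", "Alice Browne", "Bob Stone"]
pairs = list(progressive_sorted_neighborhood(records, str.lower, similar))
```

Because the function is a generator, a caller can stop consuming pairs once a time budget is exhausted and still keep the duplicates found so far, which is the point of a progressive approach.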


Keywords


Data Duplicity Detection, Progressive deduplication, PSNM, Data Mining

Copyright (c) 2016 Edupedia Publications Pvt Ltd

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

 

All published articles are open access at https://journals.pen2print.org/index.php/ijr/

