A Smart Web Crawler: Efficient Harvesting of Deep-Web Interfaces Using Site Ranker and Adaptive Learning

Sreenivasa M, Jagadish R. M

Abstract


The web is a vast collection of billions of pages containing terabytes of information, spread across thousands of servers and written in HTML. The sheer size of this collection makes it difficult to retrieve the required, relevant information, which has made search engines an essential part of our lives. Search engines strive to retrieve information that is as useful as possible, and one of their building blocks is the web crawler. This paper proposes a framework for efficiently harvesting deep-web interfaces using a site ranker and adaptive learning: a two-stage Smart Web Crawler. In the first stage, the Smart Web Crawler performs site-based searching for center pages with the help of search engines, which avoids visiting a large number of pages. To achieve more accurate results for a focused crawl, it ranks websites so that highly relevant ones for a given topic are prioritized. In the second stage, the Smart Web Crawler achieves fast in-site searching by excavating the most relevant links with adaptive link ranking.
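
The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring function, the in-memory link graph, the `form_pages` set, and all names (`score`, `rank_sites`, `crawl_site`) are hypothetical and chosen only for illustration. Stage one orders candidate sites by a toy relevance score; stage two runs a best-first search inside each site and, as a simple stand-in for adaptive learning, raises the weight of anchor-text terms that led to a searchable form.

```python
import heapq
from collections import Counter

def score(text, weights):
    """Toy relevance score: sum of the weights of the distinct words in text."""
    return sum(weights[w] for w in set(text.lower().split()))

def rank_sites(site_descriptions, weights):
    """Stage 1: order candidate sites from most to least relevant."""
    return sorted(site_descriptions,
                  key=lambda s: score(site_descriptions[s], weights),
                  reverse=True)

def crawl_site(home, links, form_pages, weights, budget=10):
    """Stage 2: best-first search within one site, visiting at most `budget` pages."""
    frontier = [(0, home, "")]  # heapq is a min-heap, so scores are negated
    seen = {home}
    found = []
    while frontier and budget > 0:
        _, page, anchor = heapq.heappop(frontier)
        budget -= 1
        if page in form_pages:
            found.append(page)
            # Adaptive step (assumption): reward the anchor terms that led here.
            # Only links pushed after this point see the updated weights.
            weights.update(set(anchor.lower().split()))
        for anchor_text, target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                heapq.heappush(frontier,
                               (-score(anchor_text, weights), target, anchor_text))
    return found

# Hypothetical toy data standing in for real pages and search-engine results.
weights = Counter({"book": 1, "search": 1})
sites = {
    "news.example": "daily news and weather",
    "books.example": "online book store with advanced search",
}
links = {
    "books.example": [("about us", "books.example/about"),
                      ("advanced book search", "books.example/search")],
    "news.example": [("weather", "news.example/weather")],
}
form_pages = {"books.example/search"}

ranked = rank_sites(sites, weights)
results = [p for site in ranked
           for p in crawl_site(site, links, form_pages, weights)]
print(ranked, results)
```

In this toy run the book site is ranked first and its search page is reached before the less relevant "about" link, because the frontier is ordered by anchor-text score rather than by discovery order.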

Keywords: Adaptive learning; best first search; deep web; feature selection; ranking; two stage crawler

Copyright (c) 2016 Sreenivasa M, Jagadish R. M

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

All published articles are Open Access at https://journals.pen2print.org/index.php/ijr/