Nearest Keyword Set Search in Multidimensional Datasets

SHAIK NURJAHA; K.SANDHYA RANI

Abstract

Keyword-based search in text-rich multi-dimensional datasets facilitates many novel applications and tools. In this paper, we consider objects that are tagged with keywords and are embedded in a vector space. For these datasets, we study queries that ask for the tightest groups of points satisfying a given set of keywords. We propose a novel method called ProMiSH (Projection and Multi Scale Hashing) that uses random projection and hash-based index structures, and achieves high scalability and speedup. We present an exact and an approximate version of the algorithm. Our experimental results on real and synthetic datasets show that ProMiSH achieves up to a 60-fold speedup over state-of-the-art tree-based techniques.
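
To make the query concrete, the following is a minimal brute-force sketch of a nearest keyword set query. It is not ProMiSH and uses neither random projection nor hashing; it only illustrates what "tightest group of points satisfying a given set of keywords" can mean. It assumes that tightness is measured by the group's diameter (the largest pairwise Euclidean distance), and the function name nearest_keyword_set and the (coordinates, keyword set) data layout are hypothetical choices made for illustration.

```python
from itertools import product
from math import dist


def nearest_keyword_set(points, query):
    """Brute-force baseline (illustrative only, not ProMiSH).

    points: list of (coords, keywords) pairs, where coords is a tuple of
            floats and keywords is a set of strings.
    query:  list of keywords that the returned group must cover together.

    Returns (indices, diameter). The diameter is ASSUMED here to be the
    largest pairwise Euclidean distance within the group.
    """
    # For each query keyword, collect the indices of points tagged with it.
    groups = [
        [i for i, (_, tags) in enumerate(points) if kw in tags]
        for kw in query
    ]
    if any(not g for g in groups):
        return None, float("inf")  # some keyword is not present at all

    best, best_diam = None, float("inf")
    # Try every way of picking one point per keyword (exponential cost).
    for combo in product(*groups):
        chosen = sorted(set(combo))  # one point may cover several keywords
        diam = max(
            (dist(points[a][0], points[b][0]) for a in chosen for b in chosen),
            default=0.0,
        )
        if diam < best_diam:
            best, best_diam = chosen, diam
    return best, best_diam


points = [
    ((0.0, 0.0), {"a"}),
    ((1.0, 0.0), {"b"}),
    ((10.0, 10.0), {"a", "b"}),
    ((0.0, 1.0), {"c"}),
]
group, diameter = nearest_keyword_set(points, ["a", "b", "c"])
```

The enumeration grows with the product of the per-keyword candidate counts, which is the kind of cost that index structures such as those described in the abstract are meant to avoid.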

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

All published Articles are Open Access at https://journals.pen2print.org/index.php/ijr/

International Journal of Research
