Fast Data Collection for High Dimensional Data in Data Mining

Mr. Raju, Pooja Srivastvas

Abstract


In machine learning, feature selection is a preprocessing step that can effectively reduce the dimensionality of data, remove irrelevant data, increase learning accuracy, and improve result comprehensibility. The high dimensionality of the data makes both efficiency and effectiveness important concerns for a feature selection algorithm: efficiency refers to the time required to find a subset of features, and effectiveness refers to the quality of that subset. High-dimensional data typically contains many irrelevant and redundant features. Irrelevant features provide no useful information in any context, and redundant features provide no information beyond that already carried by the selected features. A good feature subset therefore contains features that are highly predictive of (correlated with) the class, yet not predictive of (uncorrelated with) each other. Feature selection identifies a subset of useful features that produces results compatible with those of the original feature set.
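As a minimal illustration of the filter idea described above (not the algorithm proposed in this paper), the sketch below scores each feature by its correlation with the class and then discards a feature that is more correlated with an already-selected feature than with the class. The function name, the use of Pearson correlation as the relevance measure, and the threshold value are all assumptions made for the example.

import numpy as np

def correlation_filter(X, y, relevance_threshold=0.1):
    """Illustrative correlation-based filter: keep features correlated with
    the class, then drop features that are more correlated with an already
    selected feature than with the class (a rough redundancy check)."""
    n_features = X.shape[1]
    # Relevance: absolute Pearson correlation between each feature and the class.
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )

    # Consider features in order of decreasing relevance, discarding irrelevant ones.
    candidates = [j for j in np.argsort(-relevance) if relevance[j] >= relevance_threshold]

    selected = []
    for j in candidates:
        # Redundancy check: skip a feature that correlates more strongly with a
        # selected feature than it does with the class.
        redundant = any(
            abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) >= relevance[j] for k in selected
        )
        if not redundant:
            selected.append(j)
    return selected

if __name__ == "__main__":
    # Synthetic data: one relevant feature, one redundant copy, three noise features.
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=200).astype(float)
    informative = y + rng.normal(0, 0.5, size=200)
    duplicate = informative + rng.normal(0, 0.05, size=200)
    noise = rng.normal(size=(200, 3))
    X = np.column_stack([informative, duplicate, noise])
    print(correlation_filter(X, y))  # typically keeps only feature 0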

Keywords


Feature subset selection; graph-theoretic clustering; filter method

Copyright (c) 2015 Mr. Raju, Pooja Srivastvas

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
