A Comprehensive Study of Clustering Algorithms to Analyze Medical Datasets

Madhuri Potnuru, G Lavanya Devi, N Naresh


The objective of this research work is the ethical creation of clusters from heart disease data and an analysis of the performance of partition-based algorithms. This work is intended to help doctors identify the stages of heart disease and to improve medical care. Diagnosing a disease is one of the most difficult tasks in medicine. Recognizing heart disease from diverse features or signs is a major problem: it is prone to false presumptions and is often accompanied by unpredictable effects. Unfortunately, the huge amount of heart disease data collected by the health care industry is not, in its raw form, useful for effective diagnosis. The growth of this data is an incentive for researchers to mine medical databases for useful information. As the volume of stored data increases, data mining techniques can be used to find patterns and extract knowledge that support better patient care and more effective diagnosis. Predictions for heart disease frequently go wrong because of missing data, which distorts the statistics and yields approximate results that are ineffective in diagnostic procedures. Imputation, which replaces missing values among the 13 medical attributes taken from the Cleveland heart disease database, is one solution to this problem. Most researchers have analyzed the heart disease dataset using algorithms that find clusters among the small cell or non-small cell heart disease in various stages. Two widely used clustering algorithms, K-Means and hierarchical clustering, are implemented. A comparative analysis of clustering algorithms is also carried out using two different datasets. Data clustering is the process of collecting similar data items into groups. A clustering algorithm divides a data set into several groups such that the similarity within a group is larger than the similarity between groups. This work analyzes three clustering techniques: k-means, hierarchical, and DBSCAN. The performance and various other attributes of the three techniques are presented and compared.
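The two steps the abstract names, imputing missing attribute values and then partitioning the records with k-means, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which imputation method was used, so column-mean imputation is an assumption here, and the function names (impute_mean, kmeans) and the toy records are hypothetical, not taken from the Cleveland database.

```python
import random


def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column.

    Assumption: mean imputation; the source does not name its method.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd-style k-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Invented toy records with two numeric attributes and some missing values.
records = [[29, 180], [31, None], [30, 190],
           [62, 290], [None, 300], [64, 310]]
complete = impute_mean(records)
labels, centroids = kmeans(complete, k=2)
print(labels)
```

In practice the attributes would normally be scaled before clustering, since unscaled features with large ranges dominate the Euclidean distance; the sketch omits this for brevity.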

Copyright (c) 2018 Edupedia Publications Pvt Ltd

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


All published articles are Open Access at https://journals.pen2print.org/index.php/ijr/

Paper submission: ijr@pen2print.org