Data summarization using clustering and classification : spectral clustering combined with k-means using NFPH

Niroj Sapkota, Abeer Alsadoon, P. W. C. Prasad, A. Elchouemi, Ashutosh Kumar Singh

Research output: Chapter in Book / Conference Paper › Conference Paper › peer-review

11 Citations (Scopus)

Abstract

Clustering has been very helpful in knowledge discovery. Data miners focus on creating quality clusters with reduced time complexity to extract the most significant information. This paper analyses existing clustering techniques used in data mining and looks for ways to maximize clustering accuracy. Its purpose is to improve an existing clustering algorithm: it introduces a novel algorithm that combines spectral clustering with k-means using NFPH. The proposed system replaces the initialization method for cluster centroids in the classical k-means algorithm, which should address some of the limitations of k-means. We aim to select the most appropriate first centroid rather than selecting it randomly. Test data sets from the medical domain that are available for research purposes are used to train the model, and an open source data mining application called WEKA is used for testing. From tests carried out on 10 different UCI data sets using the proposed solution, we found that the clustering error was reduced by up to 2 percent while the processing time increased from 45 seconds. The increase in processing time is caused by the replacement of the initialization method of k-means. The proposed system reduced the clustering error of the spectral clustering algorithm. This system improved levels of accuracy but the processing time increased to 4 seconds.
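The abstract's central idea, replacing the random choice of starting centroids in k-means with a deterministic one, can be illustrated with a small sketch. The abstract does not say how NFPH selects the first centroid, so the rule below is an assumption used as a stand-in: the first centroid is the data point closest to the overall mean, and each further centroid is the point farthest from those already chosen. The function names `deterministic_init` and `kmeans` are hypothetical, and the spectral embedding step that would precede k-means in the proposed system is omitted.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def deterministic_init(points, k):
    """Pick k starting centroids without any random draw.

    Assumption (not taken from the paper): the first centroid is the data
    point closest to the overall mean; each later centroid is the point
    farthest from the centroids chosen so far.
    """
    center = mean_point(points)
    centroids = [min(points, key=lambda p: sq_dist(p, center))]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's k-means iterations started from deterministic_init."""
    centroids = deterministic_init(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centroids.append(mean_point(members) if members else centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids
```

Because nothing in this sketch is drawn at random, repeated runs on the same data return the same clusters. The price is an extra pass over the data for every centroid selected, which is in line with the abstract's remark that the new initialization method increases processing time.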
Original language: English
Title of host publication: Proceedings of the International Conference on Machine Learning, Big Data, Cloud and Parallel Computing: Trends, Prespectives and Prospects (COMITCon 2019), 14th-16th February, 2019, India
Publisher: IEEE
Pages: 146-151
Number of pages: 6
ISBN (Print): 9781728102115
Publication status: Published - 2019
Event: International Conference on Machine Learning, Big Data, Cloud and Parallel Computing
Duration: 14 Feb 2019 → …

Conference

Conference: International Conference on Machine Learning, Big Data, Cloud and Parallel Computing
Period: 14/02/19 → …
