Estimation of the number of clusters using multiple clustering validity indices

Krzysztof Kryszczuk, Paul Hurley

Research output: Chapter in Book / Conference Paper; Conference Paper; peer-reviewed

62 Citations (Scopus)

Abstract

One of the challenges in unsupervised machine learning is finding the number of clusters in a dataset. Clustering Validity Indices (CVIs) are popular tools used to address this problem. A large number of CVIs have been proposed, and reports that compare different CVIs suggest that no single CVI always outperforms the others. Following suggestions found in prior art, in this paper we formalize the concept of using multiple CVIs for cluster number estimation in the framework of multi-classifier fusion. Using a large number of datasets, we show that decision-level fusion of multiple CVIs can lead to significant gains in accuracy in estimating the number of clusters, in particular for high-dimensional datasets with a large number of clusters.
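The idea of decision-level fusion can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say which fusion rule or which indices the paper uses, so the majority vote, the tie-breaking rule, the function names (best_k, fuse_votes) and the toy score values below are all assumptions made for illustration. Each index first picks its own preferred number of clusters from its score curve, and only those decisions are combined.

```python
from collections import Counter
from statistics import median_low


def best_k(scores, higher_is_better=True):
    """Return the candidate k at which a single index's score is optimal.

    `scores` maps each candidate number of clusters to that index's value.
    Some indices are maximized and others minimized, hence the flag.
    """
    pick = max if higher_is_better else min
    return pick(scores, key=scores.get)


def fuse_votes(estimates):
    """Fuse per-index estimates of k by majority vote (assumed rule).

    Ties are broken by taking the lower median of the tied candidates.
    """
    counts = Counter(estimates)
    top = max(counts.values())
    tied = sorted(k for k, c in counts.items() if c == top)
    return median_low(tied)


# Toy, made-up score curves for three hypothetical indices over k = 2..4.
index_a = {2: 0.41, 3: 0.58, 4: 0.55}   # higher is better
index_b = {2: 1.20, 3: 0.70, 4: 0.90}   # lower is better
index_c = {2: 10.0, 3: 12.0, 4: 15.0}   # higher is better

votes = [
    best_k(index_a),
    best_k(index_b, higher_is_better=False),
    best_k(index_c),
]
fused_k = fuse_votes(votes)
print(votes, fused_k)
```

In this toy case two of the three indices agree, so the vote overrides the one index that disagrees. Other decision-level rules (for example the mean or median of the individual estimates) fit the same interface.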

Original language: English
Title of host publication: Multiple Classifier Systems - 9th International Workshop, MCS 2010, Proceedings
Publisher: Springer Verlag
Pages: 114-123
Number of pages: 10
ISBN (Print): 3642121268, 9783642121265
Publication status: Published - 2010
Externally published: Yes
Event: 9th International Workshop on Multiple Classifier Systems, MCS 2010 - Cairo, Egypt
Duration: 7 Apr 2010 - 9 Apr 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5997 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th International Workshop on Multiple Classifier Systems, MCS 2010
Country/Territory: Egypt
City: Cairo
Period: 7/04/10 - 9/04/10

Keywords

  • Clustering
  • Clustering validity indices
  • Multiple classifier
