Fast and robust general purpose clustering algorithms

Vladimir Estivill-Castro, Jianhua Yang

Research output: Contribution to journal › Article › peer-review

61 Citations (Scopus)

Abstract

General purpose and highly applicable clustering methods are required for knowledge discovery. k-MEANS has been adopted as the prototype of iterative model-based clustering because of its speed, simplicity and capability to work within the format of very large databases. However, k-MEANS has several disadvantages derived from its statistical simplicity. We propose algorithms that remain very efficient, generally applicable, multidimensional but are more robust to noise and outliers. We achieve this by using medians rather than means as estimators of centers of clusters. Comparison with k-MEANS, EM and GIBBS sampling demonstrates the advantages of our algorithms.
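The robustness claim can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say which multidimensional median is used, so the coordinate-wise median here is an assumption, and the data and the `center` helper are hypothetical. The sketch compares the two center estimators on one cluster that contains a single outlier.

```python
from statistics import mean, median


def center(points, estimator):
    """Apply a 1-D estimator to each coordinate of a list of points.

    Hypothetical helper: using the coordinate-wise median as the
    multidimensional median is an assumption made for illustration.
    """
    return tuple(estimator(coord) for coord in zip(*points))


# Made-up 2-D cluster: four points near (1, 1) plus one distant outlier.
cluster = [(0.9, 1.0), (1.0, 1.1), (1.1, 0.9), (1.0, 1.0), (50.0, 50.0)]

mean_center = center(cluster, mean)      # dragged toward the outlier
median_center = center(cluster, median)  # stays with the bulk of the points

print("mean center:  ", mean_center)
print("median center:", median_center)
```

In this toy example the single outlier moves the mean far from the four clustered points, while the median stays at (1.0, 1.0). This is the kind of sensitivity to noise and outliers that the abstract attributes to k-MEANS.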

Original language: English
Pages (from-to): 208-218
Number of pages: 11
Journal: Agents for Games and Simulations II
Volume: 1886
Publication status: Published - 2000
Externally published: Yes
