Categorizing visitors dynamically by fast and robust clustering of access logs

Vladimir Estivill-Castro, Jianhua Yang

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Clustering plays a central role in market segmentation. Identifying categories of visitors to a Web site is very useful for improving Web applications. However, the large volume of data involved in mining visitation paths demands efficient clustering algorithms that are also resistant to noise and outliers. Moreover, measuring dissimilarity between visitation paths requires sophisticated evaluation and results in attribute vectors of high dimension. We present a randomized, iterative algorithm (à la Expectation Maximization or k-means) that is based on discrete medoids. We prove that our algorithm converges and that it has subquadratic complexity. We compare it with an implementation of the fastest version of matrix-based clustering for visitor paths and show that our algorithm dramatically outperforms matrix-based methods.
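As a rough illustration of the general idea of a randomized, iterative, medoid-based clustering loop, the following minimal Python sketch may help. It is not the authors' algorithm: the function names, the Jaccard-style dissimilarity between paths, and the `sample_size` parameter are all assumptions made for this example. Each cluster's medoid is only compared against a small random sample of its members, so one pass costs roughly n × (k + sample_size) dissimilarity evaluations rather than n².

```python
import random


def path_dissimilarity(a, b):
    """Jaccard distance between the sets of pages in two paths.

    Assumed stand-in for illustration; the paper's measure is not given here.
    """
    sa, sb = set(a), set(b)
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def assign(paths, medoids):
    """Label each path with the index of its nearest medoid."""
    return [
        min(range(len(medoids)),
            key=lambda c: path_dissimilarity(p, paths[medoids[c]]))
        for p in paths
    ]


def sampled_medoid_clustering(paths, k, sample_size=8, max_iter=20, seed=0):
    """Hypothetical k-means-like loop with discrete medoids.

    Medoids are indices into `paths`. Candidates for a new medoid are a
    random sample of the cluster's members (an assumption of this sketch).
    """
    rng = random.Random(seed)
    medoids = rng.sample(range(len(paths)), k)
    for _ in range(max_iter):
        labels = assign(paths, medoids)
        changed = False
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                continue

            def cost(m):
                # Total dissimilarity from candidate m to the cluster members.
                return sum(path_dissimilarity(paths[m], paths[i])
                           for i in members)

            candidates = rng.sample(members, min(sample_size, len(members)))
            best = min(candidates, key=cost)
            # Accept only strict improvements, so the total cost never rises.
            if cost(best) < cost(medoids[c]):
                medoids[c] = best
                changed = True
        if not changed:
            break
    return medoids, assign(paths, medoids)
```

Because a medoid is replaced only when the cluster's total dissimilarity strictly decreases, the loop in this sketch cannot cycle and stops after finitely many updates; medoids, unlike means, are also less sensitive to outlying paths.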

Original language: English
Pages (from-to): 498-507
Number of pages: 10
Journal: Agents for Games and Simulations II
Volume: 2198
Publication status: Published - 2001
Externally published: Yes
