Approximate document outlier detection using random spectral projection

Research output: Contribution to journal › Article › peer-review

5 Citations (Scopus)

Abstract

Outlier detection is an important process for text document collections, but as the collection grows, detection becomes computationally expensive. Random projection has been shown to provide a good, fast approximation of sparse data, such as document vectors, for outlier detection. Random samples of the Fourier and cosine spectra have been shown to provide good approximations of sparse data when performing document clustering. In this article, we investigate the utility of these random Fourier and cosine spectral projections for document outlier detection. We show that using random samples of the Fourier spectrum for outlier detection provides better accuracy and requires less storage than random projection. We also show that using random samples of the cosine spectrum provides similar accuracy and computational time to random projection, but requires much less storage.
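To make the idea of a random cosine spectral projection concrete, the following is a minimal sketch and not the article's implementation. The abstract does not specify the sampling scheme, the scaling, or the outlier score, so the names `sample_frequencies`, `cosine_projection`, and `knn_outlier_scores`, the uniform sampling of DCT-II indices, and the nearest-neighbour distance score are all assumptions made for illustration, as is the toy data.

```python
import math
import random


def sample_frequencies(dim, n_samples, seed=0):
    """Draw n_samples distinct cosine-spectrum indices from range(dim)."""
    return random.Random(seed).sample(range(dim), n_samples)


def cosine_projection(doc, dim, freqs):
    """Evaluate the sampled DCT-II coefficients of a sparse document.

    doc maps term index -> weight, so only nonzero terms are visited.
    """
    m = len(freqs)
    coeffs = []
    for k in freqs:
        # Orthonormal DCT-II scaling, rescaled by sqrt(dim / m) for the subsample.
        scale = math.sqrt((1.0 if k == 0 else 2.0) / m)
        total = sum(w * math.cos(math.pi * (i + 0.5) * k / dim)
                    for i, w in doc.items())
        coeffs.append(scale * total)
    return coeffs


def knn_outlier_scores(points, k=1):
    """Assumed outlier score: distance to the k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical toy collection: five near-identical documents and one
# document (index 5) that shares no terms with the others.
DIM = 1000
docs = [{3: 1.0 + 0.01 * j, 17: 2.0, 42: 1.5 - 0.01 * j} for j in range(5)]
docs.append({900: 3.0, 950: 2.0, 990: 2.5})

freqs = sample_frequencies(DIM, 64)
projected = [cosine_projection(d, DIM, freqs) for d in docs]
scores = knn_outlier_scores(projected)
most_outlying = max(range(len(scores)), key=scores.__getitem__)
print(most_outlying, [round(s, 3) for s in scores])
```

In this sketch the projection is defined entirely by the 64 sampled indices in `freqs`, whereas a dense random projection of the same size would need a 1000 × 64 matrix. That difference is one plausible reading of the storage saving the abstract reports.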
Original language: English
Pages (from-to): 579-590
Number of pages: 12
Journal: Lecture Notes in Computer Science
Volume: 7691
Publication status: Published - 2012

Keywords

  • Fourier transformations
  • documents
  • outlier detection
  • random projections
  • spectral projections