A novel Web text mining method using the discrete cosine transform

Laurence A.F. Park, Marimuthu Palaniswami, Kotagiri Ramamohanarao

Research output: Chapter in Book / Conference Paper › Conference Paper › peer-review

11 Citations (Scopus)

Abstract

Fourier Domain Scoring (FDS) has been shown to give a 60% improvement in precision over the existing vector space methods, but its index requires a large storage space. We propose a new Web text mining method using the discrete cosine transform (DCT) to extract useful information from text documents and to provide improved document ranking, without having to store excessive data. While the new method preserves the performance of the FDS method, it gives a 40% improvement in precision over the established text mining methods when using only 20% of the storage space required by FDS.
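
The storage saving described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how term signals are built or which coefficients are stored, so the binning scheme, the helper names `term_signal` and `dct2`, the sample positions, and the choice to keep the first two coefficients are all assumptions made for illustration only.

```python
import math


def term_signal(positions, doc_length, bins):
    """Count occurrences of one term in each of `bins` equal-width document segments.

    Assumption: this binning is illustrative; the abstract does not specify it.
    """
    counts = [0] * bins
    for p in positions:
        counts[min(p * bins // doc_length, bins - 1)] += 1
    return counts


def dct2(signal):
    """Unnormalised DCT-II of a real-valued sequence."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        for k in range(n)
    ]


# Hypothetical 100-word document in which one term occurs at these word offsets.
positions = [3, 5, 8, 52, 90]
signal = term_signal(positions, doc_length=100, bins=10)
coeffs = dct2(signal)

# Assumption: store only the first 2 of 10 coefficients (20% of the values),
# mirroring the storage ratio quoted in the abstract.
kept = coeffs[:2]
```

Here `kept` holds a fifth of the values in `coeffs`, which is the kind of reduction the abstract reports relative to FDS; how that truncation affects ranking precision is a result of the paper and is not reproduced by this sketch.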

Original language: English
Title of host publication: Principles of Data Mining and Knowledge Discovery - 6th European Conference, PKDD 2002, Proceedings
Editors: Tapio Elomaa, Heikki Mannila, Hannu Toivonen
Publisher: Springer Verlag
Pages: 385-397
Number of pages: 13
ISBN (Print): 3540440372, 9783540440376
Publication status: Published - 2002
Externally published: Yes
Event: 6th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2002 - Helsinki, Finland
Duration: 19 Aug 2002 to 23 Aug 2002

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2431 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 6th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2002
Country/Territory: Finland
City: Helsinki
Period: 19/08/02 to 23/08/02
