Abstract
Latent semantic analysis (LSA) is a generalized vector space method that uses dimension reduction to generate term correlations for use during the information retrieval process. We hypothesized that although dimension reduction establishes correlations between terms, it also degrades the correlation of a term to itself (self-correlation). In this article, we prove that there is a direct relationship between the size of the LSA dimension reduction and the LSA self-correlation. We also show that altering the LSA term self-correlations yields a substantial increase in precision while also reducing the computation required during the information retrieval process.
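The link between the amount of dimension reduction and term self-correlation can be illustrated with a minimal sketch. It assumes the rank-k term correlation matrix is taken as U_k S_k^2 U_k^T from the singular value decomposition of a term-by-document matrix, which may differ from the exact definition and weighting used in the article. The function name `term_self_correlations` and the toy matrix `A` are hypothetical and exist only for illustration.

```python
import numpy as np

def term_self_correlations(A, k):
    """Diagonal of the rank-k term correlation matrix U_k S_k^2 U_k^T.

    Assumption: A is a term-by-document matrix (rows are terms) and the
    self-correlation of a term is its squared length in the k-dimensional
    latent space. The article may define or normalize this differently.
    """
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    scaled = U[:, :k] * s[:k]           # term vectors in the reduced space
    return np.sum(scaled ** 2, axis=1)  # one self-correlation per term

# Made-up term-by-document counts (5 terms, 4 documents), not data from the article.
A = np.array([
    [2.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
    [1.0, 0.0, 0.0, 1.0],
])

# Self-correlation with no reduction: the diagonal of A A^T.
full = np.sum(A ** 2, axis=1)

# Fraction of each term's self-correlation retained at each rank k.
for k in range(1, 5):
    retained = term_self_correlations(A, k) / full
    print(k, np.round(retained, 3))
```

Under this assumption each added dimension contributes a nonnegative amount to every term's self-correlation, so the retained fraction can only shrink as more dimensions are discarded, and it reaches 1 when no dimensions are removed.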
Original language | English |
---|---|
Pages (from-to) | 8:1-8:35 |
Number of pages | 35 |
Journal | ACM Transactions on Information Systems |
Volume | 27 |
Issue number | 2 |
DOIs | |
Publication status | Published - 2009 |
Keywords
- information retrieval
- latent semantic indexing