The sensitivity of latent Dirichlet allocation for information retrieval

Laurence A.F. Park, Kotagiri Ramamohanarao

Research output: Chapter in Book / Conference Paper › Conference Paper › peer-review

13 Citations (Scopus)

Abstract

It has been shown that the use of topic models for information retrieval provides an increase in precision when used in the appropriate form. Latent Dirichlet Allocation (LDA) is a generative topic model that allows us to model documents using a Dirichlet prior. Using this topic model, we are able to obtain a fitted Dirichlet parameter that provides the maximum likelihood for the document set. In this article, we examine the sensitivity of LDA with respect to the Dirichlet parameter when used for information retrieval. We compare the topic model computation times, storage requirements and retrieval precision of fitted LDA to LDA with a uniform Dirichlet prior. The results show that there is no significant benefit of using fitted LDA over LDA with a constant Dirichlet parameter, hence showing that LDA is insensitive with respect to the Dirichlet parameter when used for information retrieval.
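
To make the role of the Dirichlet parameter concrete, the following is a minimal sketch of the standard LDA generative process, using only the Python standard library. It is not the authors' code and does not reproduce their experiments; the function names (`sample_dirichlet`, `generate_document`), the toy topic-word table, and the example parameter values are all hypothetical and chosen only for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalising Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(alpha, topic_word, length, rng):
    """Generate one document under the LDA generative process (illustrative only).

    alpha      : Dirichlet parameter, one positive value per topic
    topic_word : topic_word[k][w] = P(word w | topic k); each row sums to 1
    length     : number of word tokens to generate
    """
    theta = sample_dirichlet(alpha, rng)  # per-document topic mixture
    topics = range(len(alpha))
    vocab = range(len(topic_word[0]))
    words = []
    for _ in range(length):
        k = rng.choices(topics, weights=theta)[0]         # pick a topic for this token
        w = rng.choices(vocab, weights=topic_word[k])[0]  # pick a word from that topic
        words.append(w)
    return theta, words


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy model: 2 topics over a 4-word vocabulary.
    topic_word = [[0.7, 0.2, 0.1, 0.0],
                  [0.0, 0.1, 0.2, 0.7]]
    constant_alpha = [1.0, 1.0]  # constant (uniform) Dirichlet parameter
    other_alpha = [0.3, 1.5]     # stand-in for a fitted parameter; made-up values
    for name, alpha in [("constant", constant_alpha), ("other", other_alpha)]:
        theta, words = generate_document(alpha, topic_word, 10, rng)
        print(name, [round(t, 3) for t in theta], words)
```

In this sketch the only difference between the two runs is the vector `alpha`; the abstract's comparison concerns whether fitting that vector to the document set, rather than fixing it to a constant, changes retrieval precision.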

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2009, Proceedings
Pages: 176-188
Number of pages: 13
Edition: PART 2
Publication status: Published - 2009
Externally published: Yes
Event: European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2009 - Bled, Slovenia
Duration: 7 Sept 2009 to 11 Sept 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 5782 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2009
Country/Territory: Slovenia
City: Bled
Period: 7/09/09 to 11/09/09

Keywords

  • Latent Dirichlet allocation
  • Probabilistic latent semantic analysis
  • Query expansion
  • Thesaurus
