TY - GEN
T1 - Query expansion for the language modelling framework using the naïve Bayes assumption
AU - Park, Laurence A.F.
AU - Ramamohanarao, Kotagiri
PY - 2008
Y1 - 2008
N2 - Language modelling is a new form of information retrieval that is rapidly becoming the preferred choice over probabilistic and vector space models, due to the intuitiveness of the model formulation and its effectiveness. The language model assumes that all terms are independent; therefore, the majority of the documents returned to the user will be those that contain the query terms. By making this assumption, related documents that do not contain the query terms will never be found, unless the related terms are introduced into the query using a query expansion technique. Unfortunately, recent attempts at performing query expansion using a language model have not been in line with the language model, being complex and not intuitive to the user. In this article, we introduce a simple method of query expansion using the naïve Bayes assumption that is in line with the language model, since it is derived from the language model. We show how to derive the query expansion term relationships using probabilistic latent semantic analysis (PLSA). Through experimentation, we show that by using PLSA query expansion within the language model framework, we can provide a significant increase in precision.
AB - Language modelling is a new form of information retrieval that is rapidly becoming the preferred choice over probabilistic and vector space models, due to the intuitiveness of the model formulation and its effectiveness. The language model assumes that all terms are independent; therefore, the majority of the documents returned to the user will be those that contain the query terms. By making this assumption, related documents that do not contain the query terms will never be found, unless the related terms are introduced into the query using a query expansion technique. Unfortunately, recent attempts at performing query expansion using a language model have not been in line with the language model, being complex and not intuitive to the user. In this article, we introduce a simple method of query expansion using the naïve Bayes assumption that is in line with the language model, since it is derived from the language model. We show how to derive the query expansion term relationships using probabilistic latent semantic analysis (PLSA). Through experimentation, we show that by using PLSA query expansion within the language model framework, we can provide a significant increase in precision.
KW - Language model
KW - Naïve Bayes
KW - Query expansion
UR - https://www.scopus.com/pages/publications/44649182157
U2 - 10.1007/978-3-540-68125-0_64
DO - 10.1007/978-3-540-68125-0_64
M3 - Conference Paper
AN - SCOPUS:44649182157
SN - 3540681248
SN - 9783540681243
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 681
EP - 688
BT - Advances in Knowledge Discovery and Data Mining - 12th Pacific-Asia Conference, PAKDD 2008, Proceedings
T2 - 12th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2008
Y2 - 20 May 2008 through 23 May 2008
ER -