Abstract
This paper presents the process of refining documents and their terms in Information Retrieval. It also shows the significance of this process prior to applying any information retrieval application, including probability models fitted to the actual term distribution. This is an important issue in the language model approach, and it also helps demonstrate effectiveness and efficiency by minimizing the time and space required to process the data. It is equally important for probabilistic approaches such as the single Poisson, double Poisson, binomial, and multinomial distributions, which are used to define the weights in the document matching process. The approach is applied to specific data sources rather than Web pages.
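
As a rough illustration of the idea in the abstract, the sketch below refines a few toy documents and then fits a single Poisson model to one term's per-document counts. It is not the paper's procedure: the stop-word list, the `refine` and `term_rate` helpers, and the sample documents are all assumptions made for this example, and the rate is simply the maximum-likelihood estimate (the mean count per document).

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the paper does not specify one.
STOP_WORDS = {"the", "of", "and", "a", "in", "to", "is", "for", "on"}

def refine(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def term_rate(term, refined_docs):
    """Maximum-likelihood Poisson rate: mean count of the term per document."""
    counts = [Counter(doc)[term] for doc in refined_docs]
    return sum(counts) / len(counts)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Toy documents, invented for illustration only.
docs = [
    "The retrieval of information in the library",
    "Information retrieval and the Poisson model",
    "A model for the distribution of terms",
]

refined = [refine(d) for d in docs]
lam = term_rate("retrieval", refined)

# Probability that a document contains the term zero times or exactly once.
p_absent = poisson_pmf(0, lam)
p_once = poisson_pmf(1, lam)
```

Refining first shrinks the vocabulary that any later model has to track, which is where the time and space savings mentioned above would come from; the same refined counts could feed a binomial or multinomial model instead of the Poisson used here.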
Original language | English |
---|---|
Title of host publication | Proceedings of the 27th International Workshop on Statistical Modelling, July 16-20, 2012, Prague |
Publisher | Tribun EU |
Pages | 601-606 |
Number of pages | 6 |
ISBN (Print) | 9788026302513 |
Publication status | Published - 2012 |
Event | International Workshop on Statistical Modelling - Duration: 16 Jul 2012 → … |
Conference
Conference | International Workshop on Statistical Modelling |
---|---|
Period | 16/07/12 → … |
Keywords
- information retrieval
- distribution (probability theory)
- Poisson algebras