Information Retrieval System Based on an Improved Language Model in Relevance Feedback

Posted on: 2012-12-31
Degree: Master
Type: Thesis
Country: China
Candidate: X S Li
GTID: 2178330335960295
Subject: Pattern Recognition and Intelligent Systems
Abstract:
With the development of the Internet, the volume of online information has grown faster than people expected, pushing us into an age of information explosion. However, because users' information needs are often fuzzy, search results frequently fail to satisfy them. To improve retrieval results, researchers need to adjust retrieval models and feedback methods so that the results become more accurate. There are four popular models in information retrieval: the Boolean model, the vector space model, the probabilistic model, and the language model. This thesis summarizes and analyzes these models. It then introduces feedback techniques based on the classic models, along with the methodology for evaluating retrieval system performance.

First, this thesis introduces a feedback method that combines an improved language model with the vector space model to improve the retrieval system. Candidate expansion terms are extracted with the improved language model and then filtered through an expansion-term classification algorithm and an entity extraction algorithm. The remaining expansion terms are added to the original query with corresponding weights. Tested on the TREC 2008 feedback dataset, the method achieves a 35 percent improvement in MAP (mean average precision).

Second, the feature selection problem in expansion-term classification is investigated. The features include the distribution of the expansion term, its co-occurrence with a single query term, its co-occurrence with pairs of query terms, and the weight of the term. The training samples come from the TREC 2009 feedback dataset, and the trained classifier is then used to classify expansion terms on the TREC 2010 dataset. The results are good and improve retrieval performance.

Third, we introduce the retrieval system, based on the improved language model, that we developed with the techniques above. The system combines expansion through clustering of relevant documents with expansion-term classification, which improves retrieval performance. In the TREC 2009 and 2010 feedback tracks, our system was chosen as one of the baselines because of its good retrieval performance.
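The expansion step described in the abstract (extract candidate terms from feedback documents with a language model, then add them to the query with weights) can be illustrated with a minimal sketch. This is not the improved language model from the thesis, whose details the abstract does not give. It assumes a plain maximum-likelihood unigram model and scores each candidate by its contribution to the KL divergence between the feedback model and the collection model. The function names, the toy documents, and the parameters `k` and `alpha` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def term_distribution(docs):
    """Maximum-likelihood unigram model over a list of tokenized documents."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def expansion_terms(query, feedback_docs, collection_docs, k=3):
    """Rank candidate terms by their contribution to KL(feedback || collection).

    Assumes every feedback document is also part of the collection, so each
    feedback term has a non-zero collection probability.
    """
    p_fb = term_distribution(feedback_docs)
    p_col = term_distribution(collection_docs)
    scores = {}
    for tok, p in p_fb.items():
        if tok in query:
            continue  # only terms not already in the query are candidates
        # Terms frequent in the feedback set but rare in the collection score high.
        scores[tok] = p * math.log(p / p_col[tok])
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(tok, s) for tok, s in ranked[:k] if s > 0]


def expand_query(query, scored_terms, alpha=0.6):
    """Give original terms total weight alpha and expansion terms 1 - alpha."""
    weights = {tok: alpha / len(query) for tok in query}
    total = sum(s for _, s in scored_terms)
    for tok, s in scored_terms:
        weights[tok] = (1 - alpha) * s / total
    return weights


# Toy data (hypothetical): the first two documents play the role of
# documents judged relevant during feedback.
collection = [
    ["language", "model", "retrieval"],
    ["language", "model", "smoothing", "retrieval"],
    ["football", "match", "score"],
    ["football", "league", "score"],
    ["weather", "forecast", "rain"],
]
feedback = collection[:2]
query = ["retrieval"]

terms = expansion_terms(query, feedback, collection, k=2)
weights = expand_query(query, terms, alpha=0.6)
```

In this sketch `alpha` controls how much weight stays on the original query. The classification and entity-extraction filters mentioned in the abstract would sit between `expansion_terms` and `expand_query`, and they are omitted here because the abstract does not specify them.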
Keywords: search engine, feedback, language model, clustering and classification