
Research On Pseudo Relevance Feedback Based On Document Similarity

Posted on: 2018-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Liu
Full Text: PDF
GTID: 2348330518987214
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, online information is growing explosively. Using a search engine to run targeted searches based on the user's information needs can greatly improve the efficiency of information access. However, online information is unstructured, widely distributed, openly organized, diverse in form, and rapidly updated, and these factors make information retrieval more difficult. How to further improve the efficiency of information access remains an important issue in the field of information retrieval. Users often express their information needs imprecisely, so the query and the document content do not match well. As a result, traditional information retrieval models often cannot accurately satisfy the user's query. Pseudo-relevance feedback models alleviate this problem to some extent. A pseudo-relevance feedback model takes the documents returned by the initial retrieval and uses them to modify and expand the initial query in order to improve retrieval performance. It is an effective query optimization technique in information retrieval, with important research value and practical significance. The main research work of this thesis is as follows.
First, building on an analysis of pseudo-relevance feedback, we use the top-ranked N documents returned by the initial retrieval to adjust the weights of the user's query terms without adding expansion terms: the similarity of these top documents is re-examined, and the document set is weighted according to its context. The method is also combined with different traditional retrieval models to further improve retrieval performance. We conducted a number of experiments on TREC datasets. The results show that the pseudo-relevance feedback method based on re-weighting informative query terms improves on the traditional retrieval methods.
Second, the document-similarity re-weighting method is combined with query expansion. The initial query terms are re-weighted, and relevant terms are added to form a new query. We use a traditional model to obtain the expansion terms, use the document similarity method to adjust the weights of the initial query terms, and smooth these weights with the expansion terms. The approach takes full account of the importance of the new query and is combined with two common expansion methods. The results show that it improves retrieval performance effectively.
Finally, this thesis designs and implements an information retrieval system based on document similarity. The system consists mainly of two modules: a retrieval module and a user interaction module. The retrieval module handles analysis of the document collection, preprocessing, and document retrieval. The user interaction module implements user login and dataset selection; after a single dataset is selected, a bar chart gives a quick view of the experimental results for that dataset under the various retrieval methods.
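The first idea in the abstract (re-weighting the original query terms from the top-ranked N documents, without adding expansion terms) can be illustrated with a minimal sketch. It is not the thesis's actual formulation: the TF-IDF weighting, the cosine measure, the "centrality" weight given to each feedback document, the interpolation parameter `alpha`, and all function names are assumptions chosen only to make the idea concrete.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def build_index(docs):
    """Return (one TF-IDF dict per document, IDF table)."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    # Smoothed IDF (an assumption; any standard IDF variant would do).
    idf = {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in Counter(toks).items()}
            for toks in tokenized]
    return vecs, idf


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_vec, doc_vecs):
    """Document indices sorted by descending cosine similarity to the query."""
    return sorted(range(len(doc_vecs)),
                  key=lambda i: (-cosine(query_vec, doc_vecs[i]), i))


def reweight_query(query_vec, doc_vecs, ranking, top_n=2, alpha=0.5):
    """Adjust weights of the existing query terms only (no expansion terms)."""
    top = [doc_vecs[i] for i in ranking[:top_n]]
    if not top:
        return dict(query_vec)
    # Hypothetical document-similarity step: a feedback document counts more
    # when it is, on average, more similar to the other feedback documents.
    central = []
    for i, d in enumerate(top):
        sims = [cosine(d, o) for j, o in enumerate(top) if j != i]
        central.append(sum(sims) / len(sims) if sims else 1.0)
    total = sum(central)
    if total == 0:  # feedback documents share no terms: weight them equally
        central, total = [1.0] * len(top), float(len(top))
    new_query = {}
    for term, weight in query_vec.items():
        feedback = sum(c * d.get(term, 0.0) for c, d in zip(central, top)) / total
        new_query[term] = (1 - alpha) * weight + alpha * feedback
    return new_query


docs = [
    "pseudo relevance feedback improves query weighting in retrieval",
    "relevance feedback uses top ranked documents for retrieval",
    "cooking pasta requires boiling water and salt",
    "query expansion adds terms to the initial query",
]
doc_vecs, idf = build_index(docs)
query_vec = {t: tf * idf.get(t, 0.0)
             for t, tf in Counter(tokenize("relevance feedback retrieval")).items()}

first_pass = rank(query_vec, doc_vecs)                  # initial retrieval
reweighted = reweight_query(query_vec, doc_vecs, first_pass)
second_pass = rank(reweighted, doc_vecs)                # retrieval after feedback
```

The re-weighted query keeps exactly the original terms, which is what distinguishes this step from query expansion; an expansion variant would additionally add terms drawn from the feedback documents and interpolate them with these weights.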
Keywords/Search Tags:Information Retrieval, Document Similarity, Pseudo Relevance Feedback, Query Expansion