
Research On Information Retrieval Models Based On Reference Document

Posted on: 2011-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z T Zhang
Full Text: PDF
GTID: 2178330338979973
Subject: Computer Science and Technology
Abstract/Summary:
With the development of information technology, and especially the growth and spread of the Web, information resources have grown explosively. Information retrieval (IR) systems have become increasingly important as an indispensable way of accessing information. However, a search engine returns results tailored to groups of users rather than to each individual, whereas each user cares about the results that are relevant to them personally. Traditional retrieval models therefore cannot meet users' needs. To solve this problem, this thesis proposes the Reference Document Model (RDM). RDM uses a user's reference document collection to provide pseudo feedback for the user's query and for the document collection, and in this way reflects the user's individual preferences.

RDM originates from the Risk Minimization Model. Its model space is flexible: it can be a vector space, a probability distribution, and so on. A reference document in RDM enriches and expands a document; it contains more relevant information and can boost retrieval performance. When the reference documents reflect a user's preferences, RDM becomes a personalized information retrieval model.

In detail, this thesis carries out the following research:

1. Statistical analysis of the Sogou log. The thesis analyzes the characteristics of current search engines in terms of users, queries, and clicked URLs, and demonstrates the need for personalized information retrieval.
2. Definition of the Reference Document Model.
3. Performance of RDM on the vector space model (VSM). As a classical information retrieval model, VSM is very important in the field. The thesis first verifies the performance of RDM on VSM. The results show that performance is best when the query model and the document model are built with the Rocchio algorithm.
4. Performance of RDM on the language model. Besides building language models for the query and the document, the thesis studies smoothing methods for the language model when it is combined with RDM.

At present, the results of this research show that RDM performs very well in traditional text information retrieval. This work lays a good foundation for future study of the Reference Document Model.
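The abstract does not give the formulas used to build the query model with the Rocchio algorithm, so the following is only a minimal sketch of standard positive-only Rocchio feedback on a bag-of-words vector space. It assumes that the user's reference documents play the role of the feedback set. The function names (`tf_vector`, `rocchio_expand`, `cosine`, `rank`), the raw term-frequency weighting, and the weights `alpha` and `beta` are illustrative choices, not taken from the thesis.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector for whitespace-tokenized, lowercased text."""
    return Counter(text.lower().split())


def rocchio_expand(query_vec, reference_vecs, alpha=1.0, beta=0.75):
    """Positive-only Rocchio: alpha * query + beta * centroid(reference docs).

    alpha and beta are assumed values, not parameters reported in the thesis.
    """
    expanded = Counter()
    for term, weight in query_vec.items():
        expanded[term] += alpha * weight
    if reference_vecs:
        n = len(reference_vecs)
        for vec in reference_vecs:
            for term, weight in vec.items():
                expanded[term] += beta * weight / n
    return expanded


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, reference_docs, collection, alpha=1.0, beta=0.75):
    """Rank documents by cosine similarity to the expanded query, best first."""
    expanded = rocchio_expand(
        tf_vector(query),
        [tf_vector(doc) for doc in reference_docs],
        alpha,
        beta,
    )
    scored = [(cosine(expanded, tf_vector(doc)), doc) for doc in collection]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical data: the user's reference documents are about cars.
    references = ["jaguar car engine speed", "car dealer price"]
    docs = ["jaguar cat habitat jungle", "jaguar car price review"]
    for score, doc in rank("jaguar", references, docs):
        print(f"{score:.3f}  {doc}")
```

With the bare query "jaguar", both example documents tie, since each contains the term once among four terms. Once the query is expanded with the car-related reference documents, the car document ranks first, which illustrates how a reference collection could personalize an otherwise ambiguous query.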
Keywords/Search Tags: RDM, Vector Space Model, Language Model, Pseudo Feedback, Personal Information Retrieval