Research Of Personalized Information Retrieval Technology Based On The User Interest Model

Posted on: 2016-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Li
Full Text: PDF
GTID: 2308330464965771
Subject: Computer technology
Abstract/Summary:
The volume of information on the internet is growing exponentially and changing constantly. Traditional search engines rely only on keyword matching, which makes it difficult for users to find, quickly and accurately, the information they are actually interested in. Personalized search based on user interest has therefore become a research hotspot. To address the shortcomings of traditional information retrieval, this thesis studies personalized information retrieval technology based on a user interest model. The main research content is as follows:

First, the theory and technology of search engines, personalized information retrieval, and user interest models are studied. A personalized information retrieval system based on a user interest model is designed and contrasted with traditional information retrieval, and the framework and implementation process of the system are presented.

Second, a user interest model is built and a method for updating it is proposed. The system preprocesses the user's browsing history and favorite web pages, uses the TF-IDF algorithm to compute keywords and their weights, and selects the top N keywords by weight to form the initial interest model. A forgetting factor is then applied to decay the initial interests, while the system computes the user's degree of interest in each page from the user's web browsing behavior. Pages with a higher degree of interest are stored and processed further to extract a new interest vector, which is used to update the initial interest model.

Third, a method for calculating the similarity between the search results and the user interest model is presented. Cosine similarity is computed between each document in the initial result set and the user interest model, and the results are reordered by similarity.

Finally, a prototype personalized information retrieval system was implemented in Java on top of the open-source Lucene framework. The system collects data with web crawlers, builds the index, and executes the query. It then computes the similarity between the search results and the user interest model and reorders the results by similarity so that results related to the user's interests appear first. The experiment demonstrates the feasibility of the method and indicates that it can improve retrieval performance and user satisfaction.
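The interest-model step described in the abstract (TF-IDF keyword weights, top-N selection, decay by a forgetting factor) can be illustrated with a minimal Java sketch. Everything named here is an assumption made for illustration, not the thesis's implementation: the class `InterestModelSketch`, the methods `tfIdf`, `buildModel`, and `forget`, the particular TF-IDF variant (term frequency normalized by page length, `ln(N / df)` for the inverse document frequency), and the exponential form of the forgetting factor with a hypothetical rate `lambda`. The abstract does not give the exact formulas, and the sketch does not use Lucene.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class InterestModelSketch {

    /** TF-IDF weights of one tokenized page: tf = count / length, idf = ln(N / df) (assumed variant). */
    static Map<String, Double> tfIdf(List<String> page, List<List<String>> corpus) {
        Map<String, Double> weights = new HashMap<>();
        for (String term : page) {
            weights.merge(term, 1.0, Double::sum); // raw term counts
        }
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            // Number of pages in the corpus that contain the term.
            long df = corpus.stream().filter(d -> d.contains(e.getKey())).count();
            double tf = e.getValue() / page.size();
            double idf = Math.log((double) corpus.size() / Math.max(1L, df));
            e.setValue(tf * idf);
        }
        return weights;
    }

    /** Sums TF-IDF weights over all pages and keeps the topN heaviest keywords. */
    static Map<String, Double> buildModel(List<List<String>> pages, int topN) {
        Map<String, Double> total = new HashMap<>();
        for (List<String> page : pages) {
            tfIdf(page, pages).forEach((term, w) -> total.merge(term, w, Double::sum));
        }
        return total.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(topN)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    /** Assumed forgetting rule: every weight decays by exp(-lambda * days). */
    static Map<String, Double> forget(Map<String, Double> model, double lambda, double days) {
        double factor = Math.exp(-lambda * days);
        Map<String, Double> decayed = new LinkedHashMap<>();
        model.forEach((term, w) -> decayed.put(term, w * factor));
        return decayed;
    }

    public static void main(String[] args) {
        // Toy, already-tokenized "history and favorite" pages (made-up data).
        List<List<String>> pages = List.of(
                List.of("java", "lucene", "search"),
                List.of("java", "index", "search"),
                List.of("java", "lucene", "lucene"));
        Map<String, Double> model = buildModel(pages, 2);
        System.out.println("initial model: " + model);
        // A half-life of 7 days is an arbitrary choice for the demo.
        System.out.println("after 7 days:  " + forget(model, Math.log(2) / 7.0, 7.0));
    }
}
```

In this toy corpus a term that occurs on every page ("java") receives an IDF of zero and drops out of the model, which is the usual reason for weighting by IDF rather than raw frequency. Updating the model with a new interest vector could then be done by summing the decayed and new weights and keeping the top N again, though the abstract does not specify the merge rule.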
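The re-ranking step (cosine similarity between each result document and the user interest model, then sorting by that score) can be sketched in the same way. The class `CosineRerankSketch`, the `Hit` container, and the representation of documents and the model as sparse term-weight maps are hypothetical choices for this illustration; the prototype described in the abstract is built on Lucene, whose API is not used here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class CosineRerankSketch {

    /** Hypothetical container for one search hit: an id plus its term-weight vector. */
    static final class Hit {
        final String id;
        final Map<String, Double> terms;

        Hit(String id, Map<String, Double> terms) {
            this.id = id;
            this.terms = terms;
        }
    }

    /** Euclidean length of a sparse term-weight vector. */
    static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double w : v.values()) {
            sum += w * w;
        }
        return Math.sqrt(sum);
    }

    /** Cosine similarity of two sparse vectors; 0 if either vector is empty or all zeros. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / (na * nb);
    }

    /** Returns a copy of the hits sorted by descending similarity to the interest model. */
    static List<Hit> rerank(List<Hit> hits, Map<String, Double> model) {
        List<Hit> sorted = new ArrayList<>(hits);
        // List.sort is stable, so hits with equal scores keep their original engine order.
        sorted.sort(Comparator.comparingDouble((Hit h) -> cosine(h.terms, model)).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up interest model and initial result list.
        Map<String, Double> model = Map.of("java", 1.0, "lucene", 1.0);
        List<Hit> hits = List.of(
                new Hit("A", Map.of("cooking", 1.0)),
                new Hit("B", Map.of("java", 1.0, "lucene", 1.0)),
                new Hit("C", Map.of("java", 1.0, "python", 1.0)));
        for (Hit h : rerank(hits, model)) {
            System.out.println(h.id + " " + cosine(h.terms, model));
        }
    }
}
```

Because the sort is stable, results that the interest model cannot distinguish stay in the order the underlying search engine returned them, so the personalization only moves documents when the model has evidence to do so.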
Keywords/Search Tags: user interest model, personalized, search engine