Research on Personalized Search Ranking Based on a User Interest Model

Posted on: 2016-12-20
Degree: Master
Type: Thesis
Country: China
Candidate: K Xu
Full Text: PDF
GTID: 2308330467973357
Subject: Signal and Information Processing
Abstract/Summary:
With the advent of the Information Age, the volume of data on the internet has grown exponentially. On the one hand, web crawlers cannot collect data nearly as fast as information is produced; on the other hand, the number of internet users has grown and their skills and expectations have risen, so search engines must do more to meet their needs. Providing a good user experience and more accurate personalized ranking results has become a research focus and a development trend for search engines. This thesis studies a personalized ranking factor for search engines based on a user interest model.

The thesis describes the architecture and principles of search engines, proposes the concept of a personalized factor, and analyzes how to build and update a user interest model. Finally, it implements a prototype personalized search engine based on user interest. The main work is as follows:

1. Analyze the frameworks for personalized search engines. These include approaches based on improving the search terms, setting web page weights, merging results through a meta search engine, and personalized web crawling. We build the search engine by combining search-term improvement with web page weighting.

2. Build the user interest model. We propose a formula for judging whether a page interests the user, and we decouple the interest model from the user interest model. The interest model is a hierarchical tree built from ODP (the Open Directory Project), while the user interest model is a vector of interest keywords and their weights; in actual use the two are linked by a mapping relation. Most of the work concerns the scheme for building the user interest model: we extract page keywords from pages of interest, derive user interest words from the page keywords by formula, and compute the weight of each interest word from the positions in which it appears. Updates to the user interest model are expressed as changes in word weights; we apply different forgetting factors depending on whether an interest is long-term or short-term and on the level to which the keyword belongs.

3. Introduce the personalized factor into the Lucene scoring formula. We analyze the mechanism of the Lucene scoring algorithm. Because Lucene is open source and highly extensible, we add the weights from the user interest model to the algorithm so that the ranking reflects the user's interests.

4. Complete the prototype personalized search engine and compare personalized results with non-personalized ones. We set up the platform with Nutch and Solr and call the Solr API service from program code. Because Solr does not handle Chinese word segmentation well, we use the third-party IKAnalyzer plug-in. Lastly, we run queries with several keywords and analyze the differences in the results, which demonstrate the feasibility of using the personalized factor in a search engine.
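The abstract does not give the actual formulas, so the following is only a minimal Python sketch of the general idea in items 2 and 3, under stated assumptions: interest words are weighted by the position in which they appear, existing weights decay by a forgetting factor on each update, and a base relevance score is boosted by the overlap between a document and the user interest model. The position weights, the forgetting factors, the multiplicative boost, and the parameter alpha are all hypothetical values chosen for illustration; they are not taken from the thesis or from Lucene.

```python
from collections import defaultdict

# Hypothetical position weights: a term in the title counts more than one in the body.
POSITION_WEIGHTS = {"title": 3.0, "heading": 2.0, "body": 1.0}

# Hypothetical forgetting factors: short-term interests decay faster than long-term ones.
FORGETTING = {"long": 0.95, "short": 0.7}


def page_term_weights(page):
    """Weight each term by the positions in which it appears on a page of interest.

    `page` maps a position name (e.g. "title") to the list of terms found there.
    """
    weights = defaultdict(float)
    for position, terms in page.items():
        for term in terms:
            weights[term] += POSITION_WEIGHTS.get(position, 1.0)
    return dict(weights)


def update_model(model, page, kinds=None):
    """Decay the existing weights, then add the weights from a newly visited page.

    `kinds` maps a term to "long" or "short"; unlisted terms are treated as short-term.
    """
    kinds = kinds or {}
    updated = {t: w * FORGETTING[kinds.get(t, "short")] for t, w in model.items()}
    for term, weight in page_term_weights(page).items():
        updated[term] = updated.get(term, 0.0) + weight
    return updated


def personalized_score(base_score, doc_terms, model, alpha=0.5):
    """Boost a base relevance score by the document's share of the user's interest weight."""
    total = sum(model.values())
    if total == 0:
        return base_score  # no interest data yet: fall back to the unpersonalized score
    interest = sum(model.get(t, 0.0) for t in set(doc_terms)) / total
    return base_score * (1.0 + alpha * interest)
```

In this sketch, two documents with the same base score are ordered by how much of the user's accumulated interest weight their terms cover, which is one simple way a personalized factor could change the ranking.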
Keywords/Search Tags: user interest model, personalized factor, interested page, search engine