
Research And Implementation Of Book Search Engine Based On User Personalization

Posted on: 2019-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: X Li
Full Text: PDF
GTID: 2428330572451569
Subject: Engineering
Abstract/Summary:
Recent years have witnessed a rapid growth of book resources on the Internet, and online readers are offered a much wider and more varied choice. At the same time, finding the book they really need quickly and accurately in this sea of resources has become a major challenge for search users. Vertical search engines for books emerged in response and have since been developed and popularized. However, traditional book search engines have several problems, such as over-commercialization and a limited query range. They also neglect users' personalization demands, so the list of search results is always the same whenever users enter an identical keyword. Therefore, we propose a research plan for a book search engine based on user personalization.

First, taking Douban Book as the only data source, we obtain sufficient book data and user data from this website with the help of its open API and a topic-focused web crawler. On the one hand, the pre-processed data is used to build a rich index library. On the other hand, it is used for research on personalized search algorithms.

A collaborative tagging system allows all users to manage resources through user-defined tags. The flexibility and usability of tags make them an important medium between library resources and users' interests, but they also bring a certain processing cost. To reduce noise and simplify computation, we apply a hierarchical clustering algorithm to the tags, which makes each user's preferences more concentrated. We then build a user interest model and a document topic model based on the result of tag clustering. To address the sparsity of users' tag data, we analyze the ratings given by users who have read the same books, apply an improved user similarity model to obtain the book tags recommended by similar users, and finally add them to the user's interest set. Based on collaborative filtering, the whole process detects users' potential interests and expands the scope of their interests.

Next, we adopt a secondary-sort (re-ranking) mechanism. By integrating the user interest model into Xapian and calculating both the BM25 relevance score and the personalization score of each document, we realize the personalized retrieval algorithm. Comparative experiments were set up to verify the effectiveness of the personalized search algorithm.

Finally, drawing on the mainstream architecture of search engines, we completed the outline design, detailed design, function implementation and system testing. Users of Douban Book can log in to the system, send query requests, search for books they are interested in, and get personalized search results consistent with their interest characteristics.
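The secondary-sort step can be illustrated with a minimal sketch. The abstract does not say how the BM25 score and the personalization score are combined, so everything specific below is an assumption made for illustration: the linear weighting with `alpha`, the normalization of BM25 by the maximum score in the candidate set, the use of cosine similarity between the user interest vector and the document topic vector, and the names `cosine` and `rerank`. The BM25 scores are taken as given inputs, as if already returned by the search backend.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rerank(candidates, user_interest, alpha=0.5):
    """Re-rank first-pass results by a weighted sum of two scores.

    candidates: list of (doc_id, bm25_score, doc_topics), where doc_topics
        maps a tag-cluster name to a weight (hypothetical document topic model).
    user_interest: dict mapping a tag-cluster name to a weight
        (hypothetical user interest model).
    alpha: assumed weight of the BM25 part; 1 - alpha goes to personalization.
    """
    if not candidates:
        return []
    # Scale BM25 into [0, 1] so both scores are on a comparable range.
    max_bm25 = max(score for _, score, _ in candidates)
    scored = []
    for doc_id, bm25, topics in candidates:
        relevance = bm25 / max_bm25 if max_bm25 > 0 else 0.0
        personal = cosine(user_interest, topics)
        scored.append((doc_id, alpha * relevance + (1 - alpha) * personal))
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    results = [
        ("book_a", 10.0, {"science_fiction": 1.0}),
        ("book_b", 8.0, {"history": 1.0}),
    ]
    reader = {"history": 1.0}
    for doc_id, score in rerank(results, reader):
        print(doc_id, round(score, 3))
```

With `alpha=1.0` the ordering reduces to the plain BM25 ranking, which is one simple way to compare personalized and non-personalized results for the same query.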
Keywords/Search Tags:Vertical search engine, Personalized search engine, Topic web crawler, User interest model, Page ranking mechanism