
Research On Personalized Vertical Search Technology Based On Multi Topic Aggregation

Posted on: 2018-11-10
Degree: Master
Type: Thesis
Country: China
Candidate: B Ge
Full Text: PDF
GTID: 2348330518499050
Subject: Circuits and Systems
Abstract/Summary:
The Internet has produced a vast ocean of information. Guided by the "Internet plus" concept, professional fields have formed information networks with distinct domain characteristics and a demand for fine-grained retrieval. This demand has driven the rapid development and popularity of vertical search engines. Vertical search targets a professional field: it improves the accuracy and coverage of information during acquisition and the accuracy of the results returned during search, with the aim of efficient information search within that field. As vertical search engines have developed, however, a series of problems has emerged. A narrowly defined vertical engine covers a single field, emphasizing depth within that field and neglecting horizontal search across fields. It stresses the relevance of information to the current topic but ignores the correlations between topics.

This paper discusses the background and development status of vertical search engines and studies in detail the topic classification and personalized recommendation technologies used in them. It also studies two problems that vertical search engines encounter during topic crawling: how to better judge the topical relevance of a page, and how to improve the topical crawler's ability to cross tunnels (chains of off-topic pages that lead to relevant ones). To address the weakness of the Shark-Search algorithm in judging link value, a PageRank influence weight is added when ordering the crawl priority. The paper further improves Shark-Search with a word-vector similarity algorithm to deal with the information silos that arise in Shark-Search-based topic crawling. Because judging the topical relevance of pages depends on a thesaurus built from a large number of keywords, the paper proposes a keyword expansion strategy based on word frequency and co-occurrence probability, which avoids selecting keywords manually.

To strengthen the topical crawler's ability to cross tunnels while minimizing wasted resources, this paper proposes a nearest-neighbor topic network model, which describes the relationships between topics and the associated influence weights. The model can further guide the crawling process of the topical crawler and increase the information coverage of the vertical search engine.

The paper also studies personalized vertical search for users, namely a search recommendation strategy based on collaborative filtering and click feedback. Because the Hilltop algorithm depends heavily on expert pages, the paper also proposes a page-ranking scheme for the case where the selection of expert pages is limited. Building on topic search and the nearest-neighbor network model, it provides a heuristic search over nearest-neighbor topics with high support. This strategy can improve the diversity and scalability of search to a certain extent and gives the user's search path a heuristic search capability. Finally, based on this research, the paper designs a prototype vertical search engine with multi-topic aggregation centered on health, and verifies the topic crawling strategy based on word vectors and the nearest-neighbor topic network.
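
The abstract does not give the formulas behind the keyword expansion strategy, so the following is only a minimal sketch of one way word frequency and co-occurrence probability could be combined. It assumes that frequency means document frequency and that co-occurrence probability is the share of a term's documents that also contain at least one seed keyword. The function name `expand_keywords` and the thresholds `min_freq` and `min_cooc` are hypothetical and do not come from the thesis.

```python
from collections import Counter


def expand_keywords(docs, seeds, min_freq=2, min_cooc=0.5):
    """Suggest new topic keywords from tokenized documents.

    docs  -- list of token lists (one list per document)
    seeds -- manually chosen seed keywords for the topic
    Both thresholds are illustrative assumptions, not values from the thesis.
    """
    seeds = set(seeds)
    term_df = Counter()  # documents containing each term
    cooc_df = Counter()  # documents containing the term and at least one seed

    for tokens in docs:
        terms = set(tokens)
        has_seed = bool(terms & seeds)
        for term in terms:
            term_df[term] += 1
            if has_seed:
                cooc_df[term] += 1

    scores = {}
    for term, df in term_df.items():
        if term in seeds or df < min_freq:
            continue  # skip existing keywords and rare terms
        p = cooc_df[term] / df  # estimate of P(seed present | term present)
        if p >= min_cooc:
            scores[term] = p

    # Highest co-occurrence probability first, ties broken alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))


docs = [
    ["diabetes", "insulin", "diet"],
    ["diabetes", "insulin", "exercise"],
    ["football", "match", "diet"],
    ["football", "goal", "match"],
]
print(expand_keywords(docs, {"diabetes"}))  # ['insulin', 'diet']
```

In this toy corpus, "insulin" always appears alongside the seed and "diet" does so in half of its documents, so both pass the default thresholds, while "exercise" is dropped for appearing only once.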
Keywords/Search Tags: Vertical search, Topical crawler, Topic aggregation, Nearest neighbor network, Word vector