
Research And Implementation Of Healthy Vertical Search Engine Based On Improved Shark-Search Algorithm

Posted on: 2021-01-30
Degree: Master
Type: Thesis
Country: China
Candidate: H Chen
Full Text: PDF
GTID: 2428330602978108
Subject: Computer technology
Abstract/Summary:
With rapid economic development and rising living standards in recent years, health issues have received increasing attention. When a traditional general-purpose search engine is used to look up health information, the results often contain a large amount of advertising and lack professionalism and authority. To address this problem, this thesis implements a vertical search engine for the health field based on an improved Shark-Search algorithm. The main work is as follows:

(1) Remedying the deficiencies of the Shark-Search algorithm. Shark-Search uses link context when scoring links, so noisy links can distort the judgement of which links are on-theme. To counter this, the link context is replaced by the title of the web page in the calculation, while the other calculation factors remain unchanged. To address the "myopia problem" of Shark-Search, the algorithm is combined with the OPIC algorithm. Experiments show that, compared with the Shark-Search, OPIC, and Shark-PageRank algorithms, the improved Shark-Search algorithm raises precision by 7.8%, 14.1%, and 0.9%, respectively, and raises recall (target recall) by 11.8%, 17.7%, and 2.9%, respectively.

(2) Building a crawler for the health field on top of the improved Shark-Search algorithm, and developing a health vertical search engine from the crawled data. A comparison with Baidu and Bing shows that, for health-related keywords, the vertical search engine achieves higher theme similarity across the top 100 result pages.

The innovations of this thesis are:

1. In the Shark-Search algorithm, the web page title is used instead of the link context to calculate theme similarity, which avoids the influence of noisy links on the judgement of theme links.

2. The Shark-Search algorithm is combined with the OPIC algorithm, which not only mitigates the "myopia problem" of Shark-Search but also reduces, to a certain extent, the "theme drift" problem of OPIC.
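The abstract does not give the scoring formulas, so the following Python sketch is only an illustration of the two ideas under stated assumptions, not the thesis's implementation. It assumes the classic Shark-Search form (a potential score that mixes an inherited score with a neighbourhood score), swaps the link-context term for a page-title term, and blends the result linearly with an OPIC "cash" value normalised to [0, 1]. The function names, the weights beta, gamma, and alpha, the linear blend, and the choice of which page's title is used are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-frequency vectors of two strings."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def link_priority(topic, anchor_text, page_title, inherited_score, opic_cash,
                  beta=0.8, gamma=0.5, alpha=0.7):
    """Hypothetical priority of a candidate link in the crawl frontier.

    inherited_score and opic_cash are assumed to lie in [0, 1].
    """
    # Neighbourhood score: anchor text plus page title
    # (the title stands in for the link-context term of Shark-Search).
    neighbourhood = (beta * cosine_similarity(topic, anchor_text)
                     + (1 - beta) * cosine_similarity(topic, page_title))
    # Shark-Search-style potential score: inherited relevance plus neighbourhood.
    content_score = gamma * inherited_score + (1 - gamma) * neighbourhood
    # Assumed linear blend with OPIC cash (link-structure importance).
    return alpha * content_score + (1 - alpha) * opic_cash
```

In a crawler, a value such as link_priority("healthy diet", "read more", "Healthy diet tips", 0.5, 0.2) would be used to order the frontier queue. The content term keeps the crawl on-theme, while the OPIC term lets a link on a weakly relevant page still be followed when the link structure suggests it is important.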
Keywords/Search Tags:health field, vertical search engine, Shark-Search algorithm, OPIC algorithm, theme similarity