The Research Of Vertical Search Engine Based On Nutch For Food Safety

Posted on: 2015-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: Q F Cao
GTID: 2268330428964461
Subject: Computer application technology
Abstract/Summary:
Information overload on the Internet has made people increasingly dependent on search engines. A vertical search engine, which targets a specific professional topic or discipline, is an extension and subdivision of the general search engine and serves a specific group of users. At present, people usually search for food safety information with Baidu or Google, which cannot meet users' need to locate information quickly and accurately, and little research has addressed search for users in the food safety field. Because general search engines cannot satisfy users searching for food safety information, this thesis designs a vertical search engine for food safety information. The main research and innovations are as follows:

(1) The web page ranking algorithm is the key component of a search engine: it places relevant and authoritative pages at the top of a massive result set and helps users locate information rapidly. The ranking algorithm in the Nutch search engine is a basic, general-purpose composite ranking model and cannot meet the needs of professional users in specific fields. This thesis improves the PageRank algorithm and adds it to Nutch's ranking model to make the ranking more topic-oriented. The algorithm is improved in three respects. First, different weights are assigned to different sites. Second, a time attenuation factor is added to lower the scores of old web pages. Finally, a topic correlation factor and a web page authority factor are added to the Nutch page scoring formula. Experiments suggest that the improved algorithm increases the accuracy of search results and can be applied in practice.

(2) Based on a study of the basic theory and key technologies of topic crawlers, a topic crawler model for food safety is proposed. Authoritative pages are chosen as the initial URLs by combining human expert selection with search engine results. A food safety information lexicon is built by keyword extraction, and the vector space model (VSM) is used to judge the topical relevance of pages.

(3) Based on the research above, this thesis designs and implements a food safety information search engine that includes the topic crawler and the web page scoring mechanism, providing fast and accurate search in the field of food safety information.
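
The abstract does not give the formula used for the VSM relevance check, so the following Python sketch only illustrates the general technique: a page and the topic are each represented as a term vector over the lexicon, and their cosine similarity is taken as the relevance score. The lexicon entries, their weights, the whitespace tokenizer, and the 0.3 threshold are all assumptions made for illustration, not values from the thesis.

```python
import math
from collections import Counter

# Hypothetical topic lexicon (term -> weight). The thesis builds its lexicon
# by keyword extraction; these entries are placeholders, not the real ones.
TOPIC_LEXICON = {
    "food": 1.0,
    "safety": 1.0,
    "additive": 0.8,
    "contamination": 0.8,
    "recall": 0.6,
}


def term_vector(text, lexicon):
    """Count how often each lexicon term occurs in the text."""
    tokens = [tok.strip(".,;:!?()\"'").lower() for tok in text.split()]
    counts = Counter(tok for tok in tokens if tok in lexicon)
    return dict(counts)


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(value * vec_b[term] for term, value in vec_a.items() if term in vec_b)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a page with no lexicon terms is treated as unrelated
    return dot / (norm_a * norm_b)


def topic_relevance(text, lexicon=TOPIC_LEXICON):
    """Relevance of a page text to the topic described by the lexicon."""
    return cosine_similarity(term_vector(text, lexicon), lexicon)


def is_on_topic(text, threshold=0.3):
    """Assumed crawler rule: keep a page if its relevance reaches the threshold."""
    return topic_relevance(text) >= threshold


if __name__ == "__main__":
    page = "Food safety recall: contamination found in a food additive."
    print(round(topic_relevance(page), 3), is_on_topic(page))
```

In a topic crawler, a score of this kind would typically decide whether a fetched page is kept and whether its outgoing links are queued; how the thesis combines it with the site weight and time attenuation factors is not stated in the abstract.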
Keywords/Search Tags: search engine, food safety, similarity, vector space model, PageRank algorithm