
Research And Design Of Vertical Search Engine System

Posted on: 2016-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: X M Song
GTID: 2308330464467970
Subject: Computer software and theory
Abstract/Summary:
As the Internet has reached ordinary households, the amount of information on the network has grown at an unprecedented rate, and general-purpose search engines face growing challenges in collecting information resources and in storing and indexing massive numbers of web pages. At the same time, researchers and practitioners within a particular industry want more specialized, more thorough, and more valuable information, and people of different ages also have specific needs for online information. This market and service demand for specialized search has driven the rapid development of vertical search engines in recent years and made them a popular direction in search engine research. A vertical search engine provides specialized information retrieval services in a particular field, to better meet users’ needs for specialized and refined information queries.

This paper gives an overview of the overall design of a vertical search engine, describes the objectives it is to achieve, and introduces its working principles. Its main modules are web page collection, web page indexing, and result ranking and retrieval. Compared with a general search engine, a vertical search engine collects only pages related to its topic, which requires its crawler to follow a particular collection strategy. There are two traditional kinds of collection strategy: search strategies based on web content analysis and search strategies based on link structure analysis. Each is a single-criterion strategy and has shortcomings.
This paper presents a combined search strategy that draws on both web content analysis and link structure analysis. The content-analysis strategies include the Fish-Search and Shark-Search algorithms; the link-structure strategies include the PageRank and HITS algorithms. The paper introduces these four algorithms one by one, proposes improvements to PageRank on that basis, and then puts forward an improved combined algorithm intended to guide crawlers to collect more, and more relevant, topic pages. To test whether the improved algorithm is effective, the paper designs and implements a web crawler, VSE-Spider, a multi-threaded system that crawls topic-related pages distributed across the Internet. The crawler was run with the traditional search algorithms and with the proposed improved combined algorithm, and the experimental results of each group on VSE-Spider were analyzed. The results show that the proposed combined search strategy performs better at collecting topic-related pages.
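
The abstract does not give the formula of the improved combined algorithm, so the following is only a minimal sketch of the general idea of mixing a content-analysis score with a link-structure score when ranking pages for a crawler. Everything specific here is an assumption made for illustration: the function names, the plain (unimproved) PageRank iteration, the cosine-similarity relevance measure, and the weight `alpha` are hypothetical and are not taken from the thesis or from VSE-Spider.

```python
import math
from collections import Counter


def pagerank(links, damping=0.85, iterations=50):
    """Standard power-iteration PageRank over {page: [outlinked pages]}.

    This is the textbook algorithm, not the improved variant of the thesis.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outlinks is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


def topic_relevance(text, topic_terms):
    """Cosine similarity between the term counts of a text and the topic terms."""
    words = Counter(text.lower().split())
    topic = Counter(t.lower() for t in topic_terms)
    dot = sum(words[t] * topic[t] for t in topic)
    norm = math.sqrt(sum(v * v for v in words.values())) * math.sqrt(
        sum(v * v for v in topic.values())
    )
    return dot / norm if norm else 0.0


def combined_priority(links, texts, topic_terms, alpha=0.5):
    """Hypothetical linear mix: alpha * content score + (1 - alpha) * link score."""
    ranks = pagerank(links)
    top = max(ranks.values())  # normalise PageRank into the range (0, 1]
    return {
        page: alpha * topic_relevance(texts.get(page, ""), topic_terms)
        + (1.0 - alpha) * ranks[page] / top
        for page in ranks
    }
```

A crawler could use such scores to order its queue of pages to visit, fetching higher-priority pages first. With `alpha` close to 1 the ordering follows page content alone, and with `alpha` close to 0 it follows link structure alone, which are the two single-criterion strategies the abstract describes as having shortcomings.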
Keywords/Search Tags: Vertical search engine, Inverted index, Spider, Search algorithm, Correlation prediction