Research On A Specialized Search Engine Based On Web Community Recognition

Posted on: 2007-10-25
Degree: Master
Type: Thesis
Country: China
Candidate: X Guo
Full Text: PDF
GTID: 2178360182493808
Subject: Computer application technology
Abstract/Summary:
The rapid growth of the Internet has caused an explosion of content on the WWW, which makes it difficult to retrieve the information a user is actually looking for. Keyword-based search usually returns a massive number of irrelevant results that bury the useful ones. New search engine technology is urgently needed.

On the Internet, websites specialized in a particular domain form a Web Community with the distinctive characteristics of a scale-free network. In a scale-free network, the distribution of the number of connections per node is stable and approximately follows a power law. By constructing a specialized Web Community, we can improve the efficiency of retrieving the needed information from the Internet.

In this thesis, the author aims to design and implement a specialized search engine based on Web Community recognition. The system selectively crawls web pages from the Internet with a specialized web spider and analyses the domain relevance of the page contents with an efficient approach, so that a relatively complete Web Community is constructed as the covered scope expands.

The thesis starts with an introduction to the background and related technology of Web Community, followed by the details of the system design and implementation. The system design concentrates on two components: a specialized web spider and a semantic-based algorithm for efficient Web Community recognition. Chapter 3 describes the implementation of the specialized web spider and its optimization. Chapter 4 then puts forward a web theme recognition algorithm, WKHR (Weighted Keyword Hierarchical Recognition).

Finally, the thesis concludes with a summary of the present state and future prospects of applying Web Community in search engines.
Keywords/Search Tags: Search Engine, Web Spider, Search Strategy, Completion Port, Theme Recognition, Web Community