Research And Realization Of Search Engine For Website

Posted on: 2008-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Lu
Full Text: PDF
GTID: 2178360242467964
Subject: Computer application technology
Abstract/Summary:
With the rapid growth of Web information, how to retrieve the needed information from the mass of Web content quickly, accurately, and effectively has become a problem that urgently needs to be solved. Search engine technology is one of the effective approaches for helping people search this mass of Web information, and it has become a research hotspot in the field of information retrieval. A search engine is a software system applied to the Web: it discovers and collects information on the Web according to a certain strategy and, after processing and organizing that information, provides a query service over it.

To meet the need for information retrieval on incrementally growing websites, this thesis applies search engine technology to site search. It adopts the architecture of a robot-based search engine, implements a robot-based crawler, builds the inverted index database with an improved Lucene.net package, and provides a friendly user interface through ASP.NET web pages.

In information collection, we use the Bloom filter algorithm to eliminate duplicate URLs efficiently and rely on multi-threading to crawl in parallel. In information indexing, we improve indexing efficiency with a Chinese word segmentation method based on string matching and a double-character hash indexing mechanism. In information retrieval, we propose a new web page ranking algorithm that combines web page content analysis with web page link analysis; it is implemented by improving the base ranking algorithm of Lucene.net.

The website search engine presented in this thesis is a general-purpose site search engine that is easy for users to deploy and customize, and so has both theoretical and practical value.
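The URL de-duplication step mentioned in the abstract can be illustrated with a minimal Bloom filter sketch. This is not the thesis's implementation (which is built around Lucene.net and ASP.NET); the class `BloomFilter`, the helper `filter_new_urls`, the bit-array size, the number of hash functions, and the use of SHA-256 with double hashing are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; sizes and hash scheme are assumptions."""

    def __init__(self, num_bits=1 << 20, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from one SHA-256 digest
        # using double hashing: h1 + i * h2 (mod num_bits).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def filter_new_urls(urls, seen):
    """Return URLs not yet recorded in `seen`, recording them as we go."""
    fresh = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh


if __name__ == "__main__":
    seen = BloomFilter()
    batch = [
        "http://example.com/a",
        "http://example.com/b",
        "http://example.com/a",  # duplicate, should be dropped
    ]
    print(filter_new_urls(batch, seen))
```

A Bloom filter never reports a recorded URL as unseen, but it can occasionally report an unseen URL as already recorded, so a crawler using it trades a small chance of skipping a page for a compact, constant-size memory footprint. If several crawl threads share one filter, as a parallel crawler would, access to `add` and the membership test would need a lock; that is omitted here.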
Keywords/Search Tags:website, Search Engine, robot, Lucene.net