Research And Realization Of Search Engine For Website

Posted on: 2010-11-14
Degree: Master
Type: Thesis
Country: China
Candidate: X C Hu
Full Text: PDF
GTID: 2178360278457593
Subject: Computer application technology
Abstract/Summary:
As the adoption of information technology accelerates, more and more enterprises are establishing their own websites to provide users with information about their products and services. The resulting problem is that the massive and growing volume of website information makes it difficult to find useful information: users cannot efficiently obtain what they need through a web browser alone. A search engine dedicated to the website solves this problem efficiently. In this thesis we therefore research and discuss search engine technology in order to build a search engine for a website.

First, we briefly introduce the current state of search engines to establish why one is needed. We then introduce the principles and classification of search engines, and analyze in detail the implementation technology of their three parts: information collection, information processing, and information retrieval. Finally, building on this work, we use bot and the Lucene toolkit to design and implement a search engine for a website.

In the system implementation, we use multi-threading to crawl pages in parallel and an inverted index to build the web information index database. The final system provides retrieval services for web pages, music, and pictures. In particular, to obtain better results, we introduce a new web page priority ranking algorithm that combines web page content with the depth of the URL link, implemented by improving Lucene's basic ranking algorithm.
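
The abstract does not give the ranking formula, so the following is only a minimal sketch of the general idea: an inverted index maps terms to pages, and a content score is damped by URL depth. The function names, the `depth_weight` parameter, the sample pages, and the formula `tf_sum / (1 + depth_weight * depth)` are all illustrative assumptions. They are not the thesis's algorithm and not Lucene's API.

```python
from collections import defaultdict
from urllib.parse import urlparse


def url_depth(url):
    """Count non-empty path segments; the site root has depth 0."""
    path = urlparse(url).path
    return len([seg for seg in path.split("/") if seg])


def build_inverted_index(pages):
    """Map each term to {url: term frequency} for a dict of url -> text."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term][url] = index[term].get(url, 0) + 1
    return index


def search(index, query, depth_weight=0.5):
    """Rank URLs by summed term frequency, damped by URL depth.

    ASSUMPTION: score = tf_sum / (1 + depth_weight * depth) is a
    hypothetical stand-in for the content-plus-depth ranking the
    abstract describes.
    """
    scores = defaultdict(float)
    for term in query.lower().split():
        for url, tf in index.get(term, {}).items():
            scores[url] += tf
    return sorted(
        scores,
        key=lambda u: scores[u] / (1 + depth_weight * url_depth(u)),
        reverse=True,
    )


# Hypothetical sample pages, for illustration only.
pages = {
    "http://example.com/": "search engine for website",
    "http://example.com/docs/a/b": "search engine search index",
    "http://example.com/music": "music search",
}
index = build_inverted_index(pages)
results = search(index, "search")
```

In this sketch the deeply nested page contains the query term twice, yet the site root ranks above it because the depth penalty outweighs the extra occurrence. A real system would tune that balance, for example through `depth_weight`.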
Keywords/Search Tags:Web Spider, Inverted Index, Multi-thread, Search Engine