
Study And Application Of P2P-Based Distributed Chinese Search Engine

Posted on: 2007-09-29  Degree: Master  Type: Thesis
Country: China  Candidate: B X Ding  Full Text: PDF
GTID: 2178360185960946  Subject: Computer application technology
Abstract/Summary:
P2P and search engine technologies are currently research hotspots for academic institutions and companies alike. The distributed network structure of P2P offers scalability, robustness, load balancing, and other advantages over traditional distributed systems, and its network topology is well suited to distributed information search. With the growth of the Internet, search engines have become an essential information retrieval tool and are gaining ever wider acceptance among users. The domestic Internet in China has developed rapidly and the number of Chinese users has soared; together with progress in research on natural language understanding, this has driven the development of Chinese search engines. At present, most search engines have a centralized structure: they fetch web pages from the Internet, parse and process the page content, extract index information, and store it in an index database hosted centrally at a single website; users submit queries and receive results by accessing that site. The server of a centralized search engine is therefore heavily loaded. If a large number of users request search service at the same time, the server cannot respond in time and network congestion may result. The search capability of a centralized engine is also quite limited; it cannot search information deeply and widely. Researchers have therefore proposed building distributed search engines.

This paper first summarizes P2P technology and its background and introduces P2P network models, P2P search technology, and related aspects. It then briefly introduces search engines, describing their composition and principles, classification, performance metrics, development trends, and distributed search engines, and compares and analyzes the information search models of search engines.
To overcome the drawbacks of centralized search engines, this paper combines search engine and P2P technology and proposes a P2P-based distributed Chinese search engine. The system adopts the NetShot routing algorithm as its routing algorithm. For Chinese word segmentation, the paper proposes a tree-structured word dictionary; this optimized segmentation method substantially improves the efficiency of the traditional matching algorithm and, combined with XML, yields a workable solution for Chinese segmentation. To build the index, the paper adopts an XML-based inverted index algorithm built on a B+ tree, which addresses the difficulty of updating traditional forward and inverted indexes in real time.

The P2P-based distributed Chinese search engine is built on the P2P distributed network structure and exploits its good distributed properties to move search engines from a centralized structure to a distributed one, so that they can search the Internet deeply and widely for information useful to users. The segmentation method based on XML and the tree-structured word dictionary lets the search engine segment Chinese text accurately. The XML-based inverted index solution explores basic methods for mixed Chinese and English search from new point of...
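
The abstract does not give the details of the tree-shaped word dictionary, so the following is only a minimal sketch of one common way such a structure can speed up dictionary matching: a character trie combined with forward maximum matching. The names `TrieNode`, `WordTrie`, `longest_match`, and `segment`, as well as the sample vocabulary, are assumptions made for illustration and are not taken from the thesis; the XML storage of the dictionary is omitted.

```python
class TrieNode:
    """One node of the dictionary tree; each edge is a single character."""

    def __init__(self):
        self.children = {}
        self.is_word = False


class WordTrie:
    """Hypothetical tree-shaped word dictionary (not the thesis's own code)."""

    def __init__(self, words):
        self.root = TrieNode()
        for word in words:
            self.add(word)

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def longest_match(self, text, start):
        """Return the end index of the longest dictionary word starting at
        `start`, or `start` itself if no dictionary word begins there."""
        node = self.root
        end = start
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.is_word:
                end = i + 1
        return end


def segment(text, trie):
    """Forward maximum matching: repeatedly take the longest dictionary word;
    characters not covered by the dictionary become single-character tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        end = trie.longest_match(text, pos)
        if end == pos:
            end = pos + 1
        tokens.append(text[pos:end])
        pos = end
    return tokens


if __name__ == "__main__":
    # Sample vocabulary chosen for illustration only.
    trie = WordTrie(["搜索", "搜索引擎", "引擎", "分布式", "中文"])
    print(segment("分布式中文搜索引擎", trie))
```

Because one walk down the tree checks every candidate length at once, the lookup at each position costs at most the length of the longest dictionary word, rather than one separate dictionary probe per candidate substring as in a naive matching loop.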
Keywords/Search Tags:Peer-to-Peer, Search Engine, P2P-Based Distributed Chinese Search Engine, P2P Route Algorithm, Chinese Segmentation, Inverted Index, Information Search Model