Research On XML-based Search Engine

Posted on: 2007-06-02
Degree: Master
Type: Thesis
Country: China
Candidate: J J Yao
GTID: 2178360185459285
Subject: Computer application technology
Abstract/Summary:
Today the Internet has become the largest and most varied store of information since the dawn of human civilization, and search engines are an essential tool for retrieving information from it. Even so, users find it difficult to obtain information quickly and accurately with a search engine. Most traditional search engines are based on HTML, whose features limit what a search engine can do. Meanwhile, another markup language, the extensible language XML, is gradually maturing, and much of the information on the Web is expected to be described, stored and expressed in XML in the future. XML offers a rich set of tags, so a search engine can locate information through the relationship between tags and content and thereby improve its accuracy. Against this background, we study search engines based on XML.

First, we introduce XML in contrast with HTML, which explains why a search engine can handle XML well. We also introduce several search engine technologies and make some improvements to Chinese word segmentation.

Second, we design the framework of an XML-based search engine, which comprises a robot module, a switch module, a parse-index module and a query module. In this thesis, we describe the structure of every module in detail.

Finally, we describe how the parse-index module is implemented. This module contains a parser and an indexer; we design indexes for both the structure and the content of XML documents and specify the indexing method.
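
The idea of indexing both the structure and the content of an XML document can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `build_index` and `search`, the use of tag paths as index keys, and whitespace tokenization (standing in for Chinese word segmentation) are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(doc_id, xml_text, index=None):
    """Add one XML document to an index keyed by (tag path, word).

    Hypothetical sketch: each key pairs the element's path from the root
    (structure) with a lowercased word from its text (content).
    Only text directly inside an element is indexed; tail text that
    follows a child element is ignored to keep the example short.
    """
    if index is None:
        index = defaultdict(set)
    root = ET.fromstring(xml_text)

    def walk(elem, path):
        current = f"{path}/{elem.tag}"
        # Whitespace splitting is a stand-in for real word segmentation.
        for word in (elem.text or "").split():
            index[(current, word.lower())].add(doc_id)
        for child in elem:
            walk(child, current)

    walk(root, "")
    return index


def search(index, path, word):
    """Return sorted ids of documents where `word` occurs under `path`."""
    return sorted(index.get((path, word.lower()), set()))


doc1 = "<book><title>XML Search</title><author>Yao</author></book>"
doc2 = "<book><title>HTML Basics</title><author>XML Fan</author></book>"

index = build_index("d1", doc1)
build_index("d2", doc2, index)

# The same word is distinguished by the tag it appears under.
print(search(index, "/book/title", "xml"))
print(search(index, "/book/author", "xml"))
```

Because the tag path is part of the key, a query for "xml" in a title does not match a document that only mentions it in the author field, which is the kind of accuracy gain the abstract attributes to XML tags.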
Keywords/Search Tags: Search Engine, XML, HTML, Word Segmentation