The Research And Implement Of Index Technology In Search Engine

Posted on: 2009-09-24
Degree: Master
Type: Thesis
Country: China
Candidate: B G Wu
Full Text: PDF
GTID: 2178360245968622
Subject: Information Science
Abstract/Summary:
This thesis focuses on the inverted index, one of the core technologies of a search engine. The research topics include inverted index organization, index construction, index compression, dynamic index maintenance, and storage technology. Finally, a prototype system is designed and implemented with these technologies for experimental purposes.

Compressing and encoding the inverted index saves storage space, reduces I/O traffic, and increases system throughput. The only drawback is that decompression consumes CPU time whenever the index is queried or updated, so compression ratio must be traded off against decompression speed. The thesis introduces several hybrid coding techniques that balance the two efficiently.

Dynamic index maintenance addresses retrieval environments in which the text collection changes frequently. The thesis compares several dynamic update strategies by their index construction and maintenance costs, with particular attention to on-line index maintenance, in which index updates are interleaved with search queries.

A distributed file system (DFS) for large-scale data storage is designed on top of the open-source Hadoop software, meeting the demands for distributed data storage and fault tolerance.
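The compression trade-off described in the abstract can be illustrated with a small sketch. The abstract does not specify the hybrid coding schemes used in the thesis, so the example below substitutes plain variable-byte coding of document-ID gaps, a standard textbook technique. The function names (`build_inverted_index`, `vbyte_encode`, `vbyte_decode`) and the byte layout are assumptions made for illustration, not the author's implementation.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of IDs of the documents containing it.

    Document IDs are assigned 1, 2, 3, ... in input order (an assumption of
    this sketch), so every posting list comes out strictly increasing.
    """
    index = defaultdict(list)
    for doc_id, text in enumerate(docs, start=1):
        for term in set(text.lower().split()):
            index[term].append(doc_id)
    return dict(index)


def vbyte_encode(doc_ids):
    """Encode a strictly increasing list of positive doc IDs.

    Each ID is stored as the gap to the previous ID. A gap is written in
    7-bit groups, low-order group first; the high bit is set only on the
    last byte of each gap.
    """
    out = bytearray()
    prev = 0
    for doc_id in doc_ids:
        gap = doc_id - prev
        prev = doc_id
        while gap >= 128:
            out.append(gap & 0x7F)   # continuation byte, high bit clear
            gap >>= 7
        out.append(gap | 0x80)       # terminating byte, high bit set
    return bytes(out)


def vbyte_decode(data):
    """Invert vbyte_encode: rebuild the doc-ID list from the byte string."""
    doc_ids = []
    value, shift, prev = 0, 0, 0
    for byte in data:
        if byte & 0x80:              # last byte of the current gap
            value |= (byte & 0x7F) << shift
            prev += value            # undo the gap encoding
            doc_ids.append(prev)
            value, shift = 0, 0
        else:
            value |= byte << shift
            shift += 7
    return doc_ids


if __name__ == "__main__":
    postings = [3, 7, 300, 100000]
    encoded = vbyte_encode(postings)
    print(len(encoded), "bytes instead of", 4 * len(postings))
    print(vbyte_decode(encoded) == postings)
```

Small gaps fit in a single byte, which is where the space and I/O savings come from, while the per-byte loop in `vbyte_decode` is the CPU cost paid at query and update time.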
Keywords/Search Tags: Search Engine, Inverted Index, Index Compression, Index Maintenance