
Improvement Research On Inter-Relevant Successive Trees Model

Posted on: 2010-03-09  Degree: Master  Type: Thesis
Country: China  Candidate: R Yang  Full Text: PDF
GTID: 2178360275991922  Subject: Computer software and theory
Abstract/Summary:
With the rapid development of computer technology, information has become increasingly massive and diverse. Traditional information retrieval technology is good only at managing structured data. Full-text retrieval technology was therefore introduced to manage massive unstructured data such as large bodies of text. Over the past decades, full-text retrieval has evolved from a string-matching program into a tool that can manage many kinds of unstructured data, including text, voice, images, and video. It is widely used in fields such as digital libraries and search engines. The decisive factor in full-text retrieval is the full-text index model it uses, which determines how efficiently massive unstructured data can be managed and how quickly it can be retrieved.

This thesis introduces research and results on a new full-text index model, the Inter-Relevant Successive Trees Model (IRST for short). It also discusses improvements to the Dual-sorted Inter-Relevant Successive Trees Model (DIRST for short), a new branch of IRST. DIRST is a full-text index model that is self-describing, dual-sorted, and compressed. To take advantage of these three characteristics, the following work has been done:

1. To regenerate the original text from the index more efficiently, the data structure of DIRST has been improved, and a new theorem on successive numbers has been introduced and proved.
2. A new verifying-bisearch algorithm has been introduced, which adds a verification step to the existing backwards-bisearch algorithm for better query performance.
3. A new theorem on the distribution of search results has been introduced and proved. In addition, in the new linear-bisearch algorithm, the successive numbers and the lower and upper bounds of every two adjacent bytes have been made linear.
4. DIRST has been compared with several other popular full-text index models on a number of aspects and shown to have better overall performance.
5. A simple Text Index System application based on IRST has been introduced. Experiments with this application show that the improved DIRST performs substantially better than before.
6. Ideas for further improving index creation and the coding of successive numbers in DIRST are discussed.
Keywords/Search Tags:full-text retrieval, full-text index, Inter-Relevant Successive Trees Model