Research On Core Technologies Of Full Text Retrieval In DM DBMS

Posted on: 2008-08-16
Degree: Master
Type: Thesis
Country: China
Candidate: H Huang
GTID: 2178360272969509
Subject: Computer software and theory
Abstract/Summary:
Many database management systems (DBMS) implement a full-text retrieval subsystem, with which a client can create a context index on a text column stored in the DBMS and then retrieve the text. DM4 DBMS also implements this function, but its full-text retrieval system has some serious problems: the inverted index takes too much storage space, and filling the index takes too long. Based on an analysis of these problems, this thesis proposes solutions that include merging and compressing the inverted index in main memory, using a more powerful algorithm for sentence segmentation, and adding noise-word filtering. To control and speed up index filling, a parallel, multi-process indexing method is proposed. To rank retrieval results, a modified vector space model is used. A new inverted index structure is designed to support these changes. Finally, experiments are carried out to evaluate the designs. The results show that byte-level variable-integer compression reduces the space taken by the inverted index without slowing index filling. The multi-process and parallel indexing methods do not noticeably improve indexing speed. The modified vector space model is able to rank the retrieval results. Overall, the design successfully solves the space problem.
Keywords/Search Tags: database, full text retrieval, inverted index, vector space model
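
The abstract names byte-level variable-integer compression of the inverted index but does not give the byte layout. The following is a minimal sketch of one common scheme of that kind, not the thesis's actual format. It assumes a base-128 encoding (7 data bits per byte, with the high bit marking continuation) and also assumes that document IDs in a posting list are sorted and stored as gaps between neighbours. The function names are hypothetical.

```python
from typing import Iterable, List


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer using 7 data bits per byte.

    The high bit of each byte is set when more bytes follow
    (assumed layout; the thesis does not specify one).
    """
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte of this integer
            return bytes(out)


def encode_postings(doc_ids: Iterable[int]) -> bytes:
    """Store a sorted posting list as variable-length gaps."""
    out = bytearray()
    prev = 0
    for doc_id in doc_ids:
        if doc_id < prev:
            raise ValueError("document IDs must be sorted ascending")
        out += encode_varint(doc_id - prev)  # small gaps need fewer bytes
        prev = doc_id
    return bytes(out)


def decode_postings(data: bytes) -> List[int]:
    """Rebuild the document IDs from the compressed gaps."""
    doc_ids: List[int] = []
    value = 0
    shift = 0
    prev = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7               # continuation bit set, keep reading
        else:
            prev += value            # gap complete, add it to the running ID
            doc_ids.append(prev)
            value = 0
            shift = 0
    if shift:
        raise ValueError("truncated input")
    return doc_ids


postings = [3, 130, 131, 70000]
encoded = encode_postings(postings)
restored = decode_postings(encoded)
```

With fixed 4-byte integers these four document IDs would take 16 bytes, while the gap-encoded form above takes 6. Decoding is a single pass over the bytes with shifts and masks, which is consistent with the abstract's observation that this kind of compression does not slow index filling.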