
The Applied Research On Double Ordered Inter-Relevant Successive Tree

Posted on: 2009-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: M Q Yan
GTID: 2178360272459386
Subject: Computer and IT
Abstract/Summary:
With the rapid development of the Internet and the exponential growth of information, people can no longer quickly find what they want in the sea of information and data. The emergence of full-text databases greatly improves the situation.

Today there are many common full-text index models, such as signature files, bitmaps, inverted lists, PAT trees and PAT arrays, but each has limitations. The Inter-Relevant Successive Tree (IRST) full-text index model is a new full-text retrieval model that grew out of research on the characteristics of the Chinese language. Its advantages include fast index creation, fast search, high space efficiency, and the ability to rebuild the original text from the index. From the basic model to the ternary model to the successive ordered model, creation and query efficiency have improved greatly. As a full-text index model, it has been widely used in frequent item-set mining, association rule mining, text filtering and so on. The latest result of IRST research is the Double Ordered IRST full-text index model, DIRST for short.

Based on DIRST, this thesis studies longest common substring (LCS) searching and frequent item-set mining. Its major contributions are:

1) Longest common substring searching on DIRST
Dynamic programming, generalized suffix trees and generalized suffix arrays are the methods usually used to search for LCSs. This thesis uses DIRST instead: the index is built with the fast creation method, and all LCSs are then found using DIRST together with the original text. Experimental results show that this is more efficient than the generalized suffix tree approach.

2) Frequent item-set mining on DIRST
IRST has been used for frequent item-set mining since it was first proposed. Here, building on the LCS searching method above, DIRST is used to find the direct frequent item-sets, and the indirect ones are then derived from the direct ones. Finally, the two are combined to obtain the frequent item-sets whose frequency is greater than the minimum support value. The method for obtaining the indirect item-sets is then improved. This research is an attempt at applying DIRST to frequent item-set mining.
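
For readers unfamiliar with the problem, the following is a minimal sketch of the dynamic programming baseline that the abstract lists among the usual LCS methods. It is not the DIRST method of the thesis, whose details the abstract does not give. The function name `longest_common_substrings` and the choice to return a set of all longest matches are assumptions made for illustration.

```python
def longest_common_substrings(a: str, b: str) -> set[str]:
    """Return every longest common substring of a and b (empty set if none).

    Illustrative dynamic programming baseline, not the DIRST algorithm.
    """
    best_len = 0
    results: set[str] = set()
    # prev[j] holds the length of the longest common suffix of
    # a[:i-1] and b[:j]; only one previous row is needed.
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common suffix ending at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    # Strictly longer match: discard shorter candidates.
                    best_len = curr[j]
                    results = {a[i - best_len:i]}
                elif curr[j] == best_len:
                    # Another match of the current best length.
                    results.add(a[i - best_len:i])
        prev = curr
    return results
```

This baseline takes time proportional to the product of the two string lengths, which is the kind of cost that index-based approaches such as suffix trees, or DIRST as described in the thesis, aim to reduce.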
Keywords/Search Tags: DIRST, Longest Common Substring, Frequent Item-sets Mining