The Applied Research On Double Ordered Inter-Relevant Successive Tree

Posted on:2009-04-21

Degree:Master

Type:Thesis

Country:China

Candidate:M Q Yan

Full Text:PDF

GTID:2178360272459386

Subject:Computer and IT

Abstract/Summary:

With the rapid development of the Internet and the exponential growth of information, people can no longer quickly find what they need in the sea of information and data. Full-text databases greatly improve this situation. Many full-text index models are in common use today, such as signature files, bitmaps, inverted lists, PAT trees and PAT arrays, but each has its limitations. The IRST (Inter-Relevant Successive Trees) full-text index model is a newer full-text retrieval model that grew out of research on the characteristics of the Chinese language. It offers several advantages: fast index creation, fast search, high space efficiency when building the full-text index, and the ability to rebuild the original text from the index. As the model evolved from the basic model to the ternary model and then to the successive ordered model, its creation and query efficiency improved greatly. As an effective full-text index model, it has been widely applied to frequent item-set mining, association rule mining, text filtering and other tasks. The latest result of IRST research is the Double Ordered IRST full-text index model, abbreviated DIRST.

Based on DIRST, this thesis studies longest common substring (LCS) searching and frequent item-set mining. Its major contributions are:

1) Longest common substring searching on DIRST. LCSs are usually found with dynamic programming, a generalized suffix tree or a generalized suffix array. This thesis uses DIRST instead: the index is built with the fast creation method, and all LCSs are then found using DIRST together with the original text. Experimental results show that this approach is more efficient than the one based on a generalized suffix tree.

2) Frequent item-set mining on DIRST. IRST has been used for frequent item-set mining since it was first proposed. Building on the LCS searching method above, this thesis uses DIRST to find the direct frequent item-sets and then derives the indirect ones from them. Finally, the two are combined to obtain the frequent item-sets whose frequency exceeds the minimum support value. The method for obtaining the indirect item-sets is then further improved. This research is an attempt at applying DIRST to frequent item-set mining.
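
For orientation, the dynamic programming approach that the abstract names as one of the usual ways to search for LCSs can be sketched as follows. This is a minimal illustration of that textbook baseline only, not the DIRST-based method of the thesis, whose index structure the abstract does not describe. The function name `longest_common_substrings` and the sample strings are assumptions made for the example.

```python
def longest_common_substrings(a: str, b: str) -> set:
    """Return every longest common substring of a and b.

    Textbook dynamic programming baseline (hypothetical helper,
    not the DIRST method): O(len(a) * len(b)) time, O(len(b)) space.
    """
    best_len = 0
    results = set()
    # prev[j] holds the length of the longest common suffix of
    # a[:i-1] and b[:j] from the previous row of the DP table.
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common suffix ending at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    # Strictly longer match: discard shorter candidates.
                    best_len = curr[j]
                    results = {a[i - best_len:i]}
                elif curr[j] == best_len:
                    # Another match of the same maximal length.
                    results.add(a[i - best_len:i])
        prev = curr
    return results


if __name__ == "__main__":
    print(sorted(longest_common_substrings("abcxyz", "xyzabc")))
```

The sketch returns a set because two strings can share several distinct substrings of the same maximal length, which matches the abstract's goal of finding all LCSs. Its quadratic running time is the reason index-based approaches such as suffix trees, suffix arrays or DIRST are of interest.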

Keywords/Search Tags:

DIRST, Longest Common Substring, Frequent Item-sets Mining

Related items

1	Approximate Longest Common Substring Matching And Optimization Techniques With Edit Distance Constraint
2	Search Of Algorithms For Mining Maximum Frequent Item-sets
3	A Frequent Item Sets Mining Algorithm With Constraint
4	Research On Mining Algorithms Of Maximal Frequent Item Sets
5	Mining Of Maximal Frequent Item Sets Based On AFOPT
6	Based On The Maximum Frequent Set Data Mining Association Rules Algorithm
7	Research Of Closed Frequent Item Sets Mining On Distributed Environment
8	Improvement Of Frequent 1-Item Set Generation Method And Experimental Study
9	A Frequent String Mining Algorithm Based On Optimized LCP Table
10	Research And Improvement The Algorithm Of Mining Frequent Item Sets In Text Association Analysis