
Mining Frequent Free Trees Based on Sequence Patterns

Posted on: 2009-07-14 | Degree: Master | Type: Thesis
Country: China | Candidate: S J Sun | Full Text: PDF
GTID: 2178360272474101 | Subject: Computer application technology
Abstract/Summary:
Data mining extracts latent, previously unknown, and valuable knowledge or patterns from large databases and data warehouses. In the ten years since the concept was put forward, data mining technology has attracted growing attention and has been applied in a wide range of areas. As an important branch of data mining, the mining of frequent itemsets and frequent association rules has likewise drawn wide attention and deserves further research and development.

As the application areas of data mining expand and the variety of data types grows, particularly with the development of network technology, data mining techniques oriented toward traditional structured relational databases can no longer meet the requirements of non-traditional data types such as semi-structured and unstructured data. Yet these non-traditional data types are widely used in bioinformatics, Web mining, and the structural analysis of chemical compounds. This thesis focuses on unstructured data in the form of tree and acyclic graph data types. Its main tasks are as follows.

First, the thesis describes and analyzes the background knowledge of data mining in depth, focusing on one important branch, association rules. It then presents the different types of association rules and gives a comprehensive introduction to frequent itemsets.

Second, the thesis classifies the algorithms for mining tree-structured data into two categories, compares their efficiency, and concludes that depth-first algorithms are more efficient. This analysis underlies the research in the thesis and identifies its entry point. Based on it, the author adopts a depth-first, vertical search approach. The author then analyzes in depth two classic algorithms that represent this approach, TreeMiner and FreeTreeMiner, and summarizes their strengths and weaknesses.

Next, the author implements the complete algorithm for acyclic graphs and free trees, dividing it into four steps:
(1) Finding the center of the free tree. The author proposes a new efficient algorithm called LWA and proves its validity in Section 4.2.
(2) Canonicalizing the rooted unordered tree. The author puts forward a new efficient algorithm called Canonicalization in Section 4.3, analyzes its time complexity, and shows that it is comparable to that of the most efficient algorithms in this area.
(3) Mining the frequent sequences. The author introduces an improved isomorphism idea, which improves efficiency dramatically.
(4) Mining the different sub-trees that share the same prefix sequence. The author adopts the idea put forward in the Chopper algorithm to mine the sub-trees.

Finally, the author compares the SFTM (SequenceFreeTreeMiner) algorithm proposed in this thesis with the Chopper and FreeTreeMiner algorithms through experiments, and shows that SFTM is valid, effective, and efficient.
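
To make steps (1) and (2) above more concrete, the following is a minimal Python sketch of the general idea. It is not the LWA or Canonicalization algorithm from the thesis, whose details the abstract does not give. Instead it uses two textbook techniques as stand-ins: repeated leaf stripping to find the center(s) of a free tree, and a sorted parenthesized string encoding to canonicalize the tree rooted at a center. The function names, the adjacency-dictionary representation, and the label format (short strings without parentheses) are all assumptions made for illustration.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency dict from a list of (u, v) edges."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return dict(adj)


def tree_centers(adj):
    """Return the one or two center vertices of a free tree.

    Textbook leaf stripping (an assumption, not the thesis's LWA):
    remove all current leaves round by round until at most two
    vertices remain.
    """
    remaining = len(adj)
    if remaining <= 2:
        return sorted(adj)
    degree = {v: len(neighbors) for v, neighbors in adj.items()}
    leaves = [v for v, d in degree.items() if d == 1]
    while remaining > 2:
        remaining -= len(leaves)
        next_leaves = []
        for leaf in leaves:
            for nb in adj[leaf]:
                degree[nb] -= 1
                if degree[nb] == 1:
                    next_leaves.append(nb)
        leaves = next_leaves
    return sorted(leaves)


def encode_rooted(adj, labels, root, parent=None):
    """Encode a rooted unordered labeled tree as a canonical string.

    Child encodings are sorted, so the result does not depend on the
    order in which children happen to be stored. Labels are assumed
    not to contain parentheses. Recursive, so very deep trees may hit
    Python's recursion limit.
    """
    child_codes = sorted(
        encode_rooted(adj, labels, child, root)
        for child in adj[root]
        if child != parent
    )
    return "(" + str(labels[root]) + "".join(child_codes) + ")"


def canonical_free_tree(adj, labels):
    """Canonical string of a free tree: root it at a center; if there
    are two centers, take the smaller of the two encodings."""
    return min(encode_rooted(adj, labels, c) for c in tree_centers(adj))


# Two isomorphic labeled free trees with different vertex numbering.
tree_a = build_adjacency([(1, 2), (2, 3), (2, 4)])
labels_a = {1: "a", 2: "b", 3: "c", 4: "a"}
tree_b = build_adjacency([(30, 20), (10, 20), (20, 40)])
labels_b = {10: "c", 20: "b", 30: "a", 40: "a"}

code_a = canonical_free_tree(tree_a, labels_a)
code_b = canonical_free_tree(tree_b, labels_b)
print(code_a, code_b, code_a == code_b)
```

Because isomorphic free trees have corresponding centers, rooting at a center and sorting the child encodings gives the same string for both trees in the example. Such a string is one possible sequence representation on which frequent-sequence mining, as in step (3), could then operate.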
Keywords/Search Tags: Data Mining, Free Tree, Frequent Sub-tree, Frequent Sequence, TDB