A Connection And Combination Based Research For Subtree Mining

Posted on: 2010-09-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y E He
GTID: 2178360275496130
Subject: Computer software and theory
Abstract/Summary:
In recent years, data mining (DM) has grown rapidly, and it has been applied successfully in many fields, including artificial intelligence and machine learning, databases, bioinformatics, pattern recognition, and neural computing.

Subtree mining, a semi-structured data mining problem, is a relatively new branch, yet it already has many applications in domains such as biology, Web data analysis, and the analysis of chemical compounds. This thesis discusses the basic concepts of subtree mining, summarizes its current research status, and then introduces a novel mechanism, based on connection and combination, together with the corresponding algorithms for discovering frequent tree patterns.

First, a novel algorithm named CCTree-Miner (Combination and Connection Tree-Miner) is presented. Instead of using traditional pattern-growth methods, such as those based on prefix-string equivalence classes or rightmost path extension, CCTree-Miner produces candidate super-patterns of arbitrary size in two ways: it connects the leaf of a frequent 2-subtree to the root of a frequent subtree, or it combines the roots of frequent subtrees. The strategy for pruning the search space is then discussed, and the completeness of the mining results is proved.

Second, an algorithm named CTMiner (Closed Tree-Miner) for mining closed subtree patterns is presented. It is a variant of CCTree-Miner to which more powerful pruning techniques are added for higher efficiency. We analyze the experimental results of CTMiner and outline future work.

The algorithms introduced in this thesis have two main advantages: first, the database to be mined is scanned exactly once; second, every candidate tree pattern that is generated is guaranteed to occur at least once in the database.
Keywords/Search Tags:subtree mining, frequent subtree, connect, combine, closed subtree, pruning
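
The two candidate-generation operations named in the abstract, connection and combination, can be illustrated with a minimal Python sketch. The abstract does not give the algorithm's details, so everything below is an assumption made for illustration: the `Tree` type, the `connect` and `combine` functions, the label-matching rule, and the treatment of trees as ordered and labeled are hypothetical and are not taken from the thesis. Support counting, pruning, and the single database scan are not modeled.

```python
from typing import NamedTuple, Optional, Tuple


class Tree(NamedTuple):
    """A rooted, ordered, labeled tree (hypothetical representation)."""
    label: str
    children: Tuple["Tree", ...] = ()


def size(tree: Tree) -> int:
    """Number of nodes in the tree."""
    return 1 + sum(size(child) for child in tree.children)


def connect(edge: Tree, sub: Tree) -> Optional[Tree]:
    """Connection (assumed rule): graft `sub` onto the leaf of a 2-node tree.

    `edge` must have exactly two nodes (a root and one leaf), and the leaf
    label must equal the root label of `sub`. Returns None otherwise.
    """
    if len(edge.children) != 1 or edge.children[0].children:
        return None  # `edge` is not a 2-node tree
    if edge.children[0].label != sub.label:
        return None  # leaf and root labels do not match
    return Tree(edge.label, (sub,))


def combine(left: Tree, right: Tree) -> Optional[Tree]:
    """Combination (assumed rule): merge two trees that share a root label.

    The merged root keeps the children of `left` followed by those of `right`.
    Returns None when the root labels differ.
    """
    if left.label != right.label:
        return None
    return Tree(left.label, left.children + right.children)


# Usage: build 3-node candidates from smaller patterns.
edge_ab = Tree("A", (Tree("B"),))   # 2-node tree A -> B
sub_bc = Tree("B", (Tree("C"),))    # subtree rooted at B
chain = connect(edge_ab, sub_bc)    # A -> B -> C

edge_ac = Tree("A", (Tree("C"),))
fork = combine(edge_ab, edge_ac)    # A with children B and C
```

In this sketch, connection grows a pattern in depth and combination grows it in width, which is one way a candidate can gain more than one node per step. Whether the thesis applies exactly these rules is not stated in the abstract.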