
Research On Embedded Frequent Subtree Mining

Posted on: 2009-09-21
Degree: Master
Type: Thesis
Country: China
Candidate: B Liu
Full Text: PDF
GTID: 2178360245989265
Subject: Computer application technology
Abstract/Summary:
With the development of computer and information technology, large amounts of data accumulate in daily work and in scientific research. Extracting useful information from these data is a major challenge for researchers in information science, and data mining emerged in response to this need. In recent years, data mining and its applications have spread into many disciplines and produced substantial results in fields such as artificial intelligence, data warehousing, pattern recognition, and bioinformatics. Frequent pattern mining is an important problem in data mining. It covers the mining of transactions, sequences, trees, and graphs. Trees are a special kind of structure with their own characteristics and advantages, so this thesis takes frequent subtree mining as its subject. The main contents of the thesis are as follows.

First, we review the concepts and principles of data mining and frequent pattern mining and formulate the concept of a subtree pattern. We also study the canonical form of unordered trees and present a pattern growth framework that combines the pattern growth mining strategy with an unordered subtree mining strategy.

Second, we present an efficient pattern growth algorithm for mining frequent embedded subtrees in a forest of rooted, labeled, unordered trees. The algorithm uses a canonical form to represent each unordered tree uniquely. For every growing point of the pattern being extended, it creates a projection database, which transforms the problem from mining frequent trees into finding frequent nodes in the projection database. Experiments show that the algorithm performs well.

Third, we study the concept and properties of the weighted support degree and propose a second algorithm for mining weighted embedded subtrees. It generates frequent subtree patterns by growing them both upward and downward. Finally, we report experiments that verify the correctness and effectiveness of the algorithm.
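As a rough illustration of two ideas named in the abstract, the Python sketch below computes one possible canonical form for unordered labeled trees and counts the support of an embedded subtree pattern in a small forest. It is not the thesis's algorithm: the tree encoding as (label, children) pairs, the function names canonical, flatten, contains_embedded and support, and the brute-force search are all assumptions made for this example, and no projection database or weighted support is modeled. "Embedded" is taken here to mean that a parent in the pattern maps to an ancestor in the data tree and that sibling pattern nodes map to nodes in disjoint subtrees.

```python
# A tree is a pair (label, [child trees]); this encoding is an assumption of the sketch.

def canonical(tree):
    """Canonical form of an unordered tree: children sorted recursively."""
    label, children = tree
    return (label, tuple(sorted(canonical(c) for c in children)))

def flatten(tree):
    """Preorder labels plus, for node i, the end index of its subtree [i, ends[i])."""
    labels, ends = [], []
    def visit(node):
        i = len(labels)
        labels.append(node[0])
        ends.append(0)
        for child in node[1]:
            visit(child)
        ends[i] = len(labels)
    visit(tree)
    return labels, ends

def contains_embedded(pattern, tree):
    """Brute-force test: is `pattern` an embedded unordered subtree of `tree`?"""
    labels, ends = flatten(tree)

    def match_at(p, i):
        # Pattern node p is mapped onto data node i.
        return p[0] == labels[i] and place(p[1], 0, i, [])

    def place(children, k, i, used):
        # Map children[k:] to proper descendants of i whose subtrees are
        # pairwise disjoint, so sibling images are never ancestor/descendant.
        if k == len(children):
            return True
        for d in range(i + 1, ends[i]):
            disjoint = all(ends[d] <= u or ends[u] <= d for u in used)
            if disjoint and match_at(children[k], d):
                if place(children, k + 1, i, used + [d]):
                    return True
        return False

    return any(match_at(pattern, i) for i in range(len(labels)))

def support(pattern, forest):
    """Number of trees in the forest that contain the pattern."""
    return sum(1 for t in forest if contains_embedded(pattern, t))

# Hypothetical example data.
T1 = ("A", [("B", [("C", []), ("D", [])]), ("E", [])])
T2 = ("A", [("E", []), ("B", [("D", []), ("C", [])])])  # T1 with children reordered
T3 = ("A", [("C", [("E", [])])])                        # E lies below C
P = ("A", [("C", []), ("E", [])])

same_form = canonical(T1) == canonical(T2)
count = support(P, [T1, T2, T3])
```

In this example T1 and T2 differ only in child order, so they share a canonical form. The pattern P occurs in both as an embedded subtree even though C is a grandchild of A rather than a child, while T3 does not contain P because C and E lie on a single path. The exhaustive search is exponential in the worst case and is meant only to make the definitions concrete.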
Keywords/Search Tags: Frequent subtree, Unordered tree, Embedded subtree, Weighted subtree, Weighted support degree