Research on Techniques for Frequent Pattern Mining

Posted on: 2008-04-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Liu
Full Text: PDF
GTID: 1118360272459785
Subject: Computer software and theory
Abstract/Summary:
With the development of computer technology and its applications, we now face huge volumes of data. Making better use of these data and mining the knowledge hidden in them is of great interest, and knowledge discovery and data mining has become an important field of computer science. Frequent pattern mining is an important part of data mining and covers the mining of transactions, sequences, trees, and graphs. It is widely applied in other data mining research, such as association analysis, periodicity analysis, maximal and closed pattern mining, querying, classification, and indexing. Because it lays the groundwork for these other problems and is intrinsically complex, algorithms for frequent pattern mining have become the focus of many researchers. This dissertation focuses on frequent pattern mining and covers three topics: an Apriori algorithm for frequent pattern mining based on inverted lists, a combination tree algorithm for frequent pattern mining, and frequent closed pattern mining.

1) Apriori Algorithm for Frequent Pattern Mining Based on Inverted List
Although the traditional Apriori algorithm performs well when mining short patterns and sparse data sets, it has to scan the data set many times and performs poorly when mining long patterns and dense data sets. To address these shortcomings, we improve the Apriori algorithm and develop the InList algorithm. InList inserts items one by one and saves frequent items into a transaction-frequent-library; when a new item is inserted, it generates new frequent items from the transaction-frequent-library. The algorithm avoids many redundant operations, needs no join or prune steps, and scans the data set only twice. Because of these improvements, InList performs better than the traditional Apriori algorithm.

2) Combination Tree Algorithm for Frequent Pattern Mining

The Combination Tree algorithm is more efficient than the Apriori and FP-growth algorithms. It inserts new items one by one, using the inverted lists of the items to build a frequent tree, and then transfers counts between branches so that all branches become relatively independent. As a result, it scans the data set only twice, shares more common items among transactions, omits locally infrequent items, and avoids many recursive calls. Experiments show that the algorithm is more efficient on both sparse and dense data sets.

3) Frequent Closed Pattern Mining

Frequent closed patterns carry all the information that frequent patterns carry, while the number of frequent closed patterns is much smaller than the number of frequent patterns. Mining frequent closed patterns is therefore a good choice for data sets with a huge number of frequent patterns. Many algorithms for mining frequent closed patterns spend a lot of time verifying whether a frequent itemset is closed. We take advantage of the combination tree: whether a frequent itemset is closed can be judged easily by traversing the combination tree and comparing the counts of the relevant nodes. Because this separate verification step is omitted, our algorithm is more efficient.
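The abstract does not give the details of InList or the combination tree, so the following is only a minimal sketch of the two underlying ideas, not the dissertation's algorithms. It assumes a generic vertical approach: each item keeps an inverted list of the transaction ids that contain it, the support of an itemset is the size of the intersection of those lists, and a frequent itemset is closed when no proper superset has the same support. The function names (`build_inverted_lists`, `mine_frequent`, `closed_only`) and the sample transactions are hypothetical.

```python
from collections import defaultdict


def build_inverted_lists(transactions):
    """Map each item to the set of transaction ids that contain it."""
    inverted = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            inverted[item].add(tid)
    return dict(inverted)


def mine_frequent(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets.

    Generic depth-first intersection of inverted lists (an assumption
    made for illustration), not InList or the Combination Tree algorithm.
    """
    inverted = build_inverted_lists(transactions)
    # Keep frequent single items in a fixed order so that each itemset
    # is generated exactly once.
    frequent_items = sorted(
        ((item, tids) for item, tids in inverted.items()
         if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    result = {}

    def extend(prefix, prefix_tids, candidates):
        for i, (item, tids) in enumerate(candidates):
            # Support of prefix + item = size of the intersected id sets.
            new_tids = prefix_tids & tids if prefix else tids
            if len(new_tids) < min_support:
                continue
            itemset = prefix | {item}
            result[itemset] = len(new_tids)
            extend(itemset, new_tids, candidates[i + 1:])

    extend(frozenset(), set(), frequent_items)
    return result


def closed_only(frequent):
    """Keep itemsets that have no proper superset with equal support."""
    closed = {}
    for itemset, support in frequent.items():
        has_equal_superset = any(
            itemset < other and support == other_support
            for other, other_support in frequent.items()
        )
        if not has_equal_superset:
            closed[itemset] = support
    return closed


# Hypothetical sample data: four transactions, minimum support of 2.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "c"}, {"c"}]
frequent = mine_frequent(transactions, min_support=2)
closed = closed_only(frequent)
```

On this sample, `frequent` holds seven itemsets while `closed` holds only three ({c}, {a, b}, and {a, b, c}), which illustrates why closed patterns are a more compact representation. The pairwise superset check in `closed_only` is exactly the kind of separate verification step that the dissertation reports avoiding by comparing node counts in the combination tree.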
Keywords/Search Tags:data mining, frequent patterns, frequent closed patterns, inverted list, combination tree