Research On Algorithm Of Mining Of Association Rules

Posted on: 2006-09-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Zhang
Full Text: PDF
GTID: 2168360155959991
Subject: Computer software and theory
Abstract/Summary:
Association rule mining, one of the most successful and important techniques in data mining, is an active research area.

In the Apriori algorithm, the candidate 2-itemset collection C2 is generally the largest candidate set, so most of the running time is spent generating the frequent 2-itemsets. This paper proposes an algorithm based on MAT (Matrix) to solve this problem. The main idea is to scan the database once and update the counts in a matrix according to which items and item pairs appear in each transaction. Elements whose counts fall below the minimum support are then deleted, which yields the frequent 1-itemsets and 2-itemsets. The algorithm reduces the number of database scans and improves efficiency.

In the Kth (K>2) iteration of the Apriori algorithm, every K-item subset of each transaction T in the database must be checked for membership in the candidate K-itemsets. To make this step more efficient, this paper presents the PPS (Pruning & Partition Searching) algorithm, which is based on transaction pruning and partition search. Transaction pruning follows this strategy: an item tj is retained in transaction T only if it appears in at least (K-1) of the frequent (K-1)-itemsets; otherwise, it is removed during the iteration that produces the frequent K-itemsets. Partition search follows this strategy: construct a data structure for quick search and location, and divide the sequence of frequent (K-1)-itemsets into a number of contiguous partitions, giving a number of disjoint subsets of the frequent (K-1)-itemsets. The partitioning is based on the first two items of each frequent (K-1)-itemset. The first array stores the ordered candidate itemsets together with their frequency counters. The second array records the location of each partition, so that all itemsets sharing the same first two items are stored in one contiguous interval.

Experiments and tests have been carried out on both the improved algorithm and the Apriori algorithm.
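The matrix-counting idea behind MAT can be illustrated with a minimal Python sketch. It assumes that the item universe is known in advance and that minimum support is given as an absolute count; the function name `frequent_1_and_2_itemsets` and the upper-triangular layout are illustrative choices, not details taken from the thesis.

```python
from itertools import combinations


def frequent_1_and_2_itemsets(transactions, items, min_support):
    """Find frequent 1- and 2-itemsets in a single pass over the transactions.

    Hypothetical sketch: `items` is the (assumed known) item universe and
    `min_support` is an absolute count, not a fraction.
    """
    index = {item: i for i, item in enumerate(items)}
    n = len(items)
    # matrix[i][i] counts item i alone; matrix[i][j] with i < j counts the pair.
    matrix = [[0] * n for _ in range(n)]

    for transaction in transactions:
        # Map items to matrix positions, ignoring items outside the universe.
        ids = sorted({index[item] for item in transaction if item in index})
        for i in ids:
            matrix[i][i] += 1
        for i, j in combinations(ids, 2):
            matrix[i][j] += 1

    # Drop every element whose count is below the minimum support.
    frequent_1 = [items[i] for i in range(n) if matrix[i][i] >= min_support]
    frequent_2 = [
        (items[i], items[j])
        for i in range(n)
        for j in range(i + 1, n)
        if matrix[i][j] >= min_support
    ]
    return frequent_1, frequent_2
```

Because both the single-item and the pair counts are gathered in the same loop over the transactions, the separate scan that Apriori needs for counting C2 is avoided, at the cost of a matrix whose size grows quadratically with the number of items.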
The experiments show that the running time of the MAT-PPS algorithm is far less than that of the Apriori algorithm, and that it also uses less memory. Efficiency is thus greatly improved.

For mining multi-level association rules, this paper introduces FP-CH, an improved algorithm based on FP-Tree. In same-level mining, the FP-CH algorithm constructs...
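The two PPS strategies described in the abstract, transaction pruning and partition search, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the thesis implementation: itemsets are represented as sorted tuples, the partition locations are kept in a dictionary instead of a second array, and the names `prune_transaction`, `build_partition_index`, and `count_candidate` are hypothetical.

```python
from bisect import bisect_left
from collections import Counter


def prune_transaction(transaction, frequent_prev, k):
    """Keep only items that occur in at least k-1 frequent (k-1)-itemsets."""
    occurrences = Counter(item for itemset in frequent_prev for item in itemset)
    return sorted(item for item in transaction if occurrences[item] >= k - 1)


def build_partition_index(itemsets):
    """Map each (first, second) item prefix to its half-open interval.

    `itemsets` must be a sorted list of tuples with at least two items each,
    so that itemsets sharing a prefix are stored contiguously.
    """
    partitions = {}
    for pos, itemset in enumerate(itemsets):
        key = itemset[:2]
        start, _ = partitions.get(key, (pos, pos))
        partitions[key] = (start, pos + 1)
    return partitions


def count_candidate(subset, itemsets, counts, partitions):
    """Increment the counter of `subset` if it is stored; search one partition only."""
    interval = partitions.get(subset[:2])
    if interval is None:
        return False  # no stored itemset starts with these two items
    start, end = interval
    # Binary search restricted to the partition [start, end).
    pos = bisect_left(itemsets, subset, start, end)
    if pos < end and itemsets[pos] == subset:
        counts[pos] += 1
        return True
    return False
```

Pruning relies on the fact that every item of a frequent K-itemset occurs in K-1 of its (K-1)-item subsets, all of which must be frequent. Restricting the binary search to a single partition means that a lookup touches only the itemsets that share the first two items of the subset being tested.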
Keywords/Search Tags: Association rule, Mining algorithm, Transaction pruning, Partition searching, FP-tree