Research on Concise Representation of Frequent Itemsets Based on Fuzzy Equivalence

Posted on: 2016-09-08
Degree: Master
Type: Thesis
Country: China
Candidate: J W Xu
Full Text: PDF
GTID: 2308330473457047
Subject: Computer technology
Abstract/Summary:
Traditional association analysis generally produces a huge number of frequent itemsets from which association rules are extracted. Many concise representation models have therefore been proposed to reduce the number of frequent itemsets. Most of these models do reduce that number, but they do not treat the error rate of the recovered frequencies as an important measure, which restricts the practical application of association analysis.

To address these problems, this thesis analyzes the existing concise representation models in depth. On this basis, it designs a new concise representation model and algorithm that reduce both the number of frequent itemsets and the frequency error rate, and that are less affected by deviations in the data set.

The main work of this thesis is as follows:

(1) Considering the huge number of frequent itemsets and the high error rate of existing concise representation algorithms, a concise representation of frequent itemsets based on fuzzy equivalence is proposed. Properties and theorems related to this model are then analyzed, and an algorithm based on a depth-first search strategy, called FECR, is proposed. Experimental results show that the model significantly reduces the number of frequent itemsets while keeping the error rate low. Compared with Index-Meta, FECR has a lower error rate when both produce the same number of concise itemsets.

(2) Frequency errors may arise from the uncertainty of fuzzy equivalence, the frequency of the analogy closed itemset, and the cluster threshold value, so FECR alone does not achieve the lowest error rate on the estimated frequencies of the recovered frequent itemsets. To reduce the error rate, three optimization problems are investigated: how frequent itemsets are combined, how the frequency of a fuzzy equivalence class is estimated, and how the cluster threshold is set. Based on this analysis, optimization methods are proposed.
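
The abstract does not define fuzzy equivalence or the steps of FECR, so the following is not the thesis's algorithm. It is only a minimal Python sketch of the general trade-off the abstract describes: keeping fewer itemsets at the cost of some error in the recovered frequencies. All names (`frequent_itemsets`, `summarize`, `estimate`, `mean_relative_error`), the relative `tolerance` parameter, and the greedy grouping rule are assumptions made for illustration. An itemset is dropped when an already kept superset has a support count within the tolerance, and its frequency is later estimated from the kept supersets.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of frequent itemsets with their support counts."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
                found = True
        if not found:
            break  # no frequent itemset of this size, so none of a larger size
    return result


def summarize(frequent, tolerance):
    """Hypothetical greedy summary (not FECR): drop an itemset when a kept
    strict superset has a support within `tolerance` (relative) of its own."""
    representatives = {}
    # Visit larger itemsets first so supersets are decided before subsets.
    for itemset, support in sorted(frequent.items(), key=lambda kv: -len(kv[0])):
        covered = any(
            itemset < rep and (support - rep_support) <= tolerance * support
            for rep, rep_support in representatives.items()
        )
        if not covered:
            representatives[itemset] = support
    return representatives


def estimate(itemset, representatives):
    """Recover a support estimate: the largest support among kept supersets."""
    return max(s for rep, s in representatives.items() if itemset <= rep)


def mean_relative_error(frequent, representatives):
    """Average relative error of the recovered supports over all frequent itemsets."""
    errors = [
        abs(estimate(itemset, representatives) - support) / support
        for itemset, support in frequent.items()
    ]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    transactions = [{"a", "b", "c"}] * 3 + [{"a", "b"}, {"a"}]
    frequent = frequent_itemsets(transactions, min_support=2)
    for tolerance in (0.0, 0.25):
        reps = summarize(frequent, tolerance)
        print(tolerance, len(reps), mean_relative_error(frequent, reps))
```

With `tolerance=0.0` this rule keeps exactly the closed frequent itemsets and recovers every support without error. A larger tolerance keeps fewer itemsets, and under this rule the relative error of each recovered support stays at or below the tolerance.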
Keywords/Search Tags:Frequent Itemset, Concise Representation, Fuzzy Equivalence Class, Analogy Closed Itemset, Optimization Method