Research On Association Rules Mining Algorithm

Posted on: 2020-08-25
Degree: Master
Type: Thesis
Country: China
Candidate: C M Xiang
Full Text: PDF
GTID: 2438330620455593
Subject: Communication and Information System
Abstract/Summary:
Association rule mining is one of the most important research directions in data mining. Its applications span applied economics, education, management, information science, and medicine, which gives it high research value and practical significance. To quickly discover potential and interesting rules between different items in a transaction database, this thesis studies the following four aspects of association rule mining algorithms:

(1) Serial frequent itemset mining algorithm. Building on an in-depth study of the Apriori and Eclat algorithms, this thesis proposes an improved Eclat algorithm, IEclat, which uses an optimized candidate set. A comparison of the simulation results of Apriori, Eclat, and IEclat demonstrates the accuracy and effectiveness of IEclat.

(2) Parallel frequent itemset mining algorithm. Building on a study of the Hadoop platform and its main components, this thesis proposes a parallel frequent itemset mining algorithm, MR_IEclat, which uses IEclat as its core algorithm and the MapReduce framework as its computing model. A comparison of the simulation results of MR_Eclat and MR_IEclat demonstrates the accuracy, validity, and scalability of MR_IEclat.

(3) Interest measures. After analyzing the problems of the traditional "support-confidence" model, this thesis introduces interest measures and proposes a new one, "correlation interest". Its properties and advantages are summarized by comparison with four typical interest measures: all-confidence, confidence, lift, and odds ratio.

(4) Association rule mining model. This thesis analyzes the interestingness of association rules by computing the values of the four typical interest measures and of correlation interest, and concludes that a single interest measure is not enough to obtain the more interesting association rules. Combining the traditional model with multiple interest measures yields the proposed "support-confidence-multi-interest" model. A comparison of the simulation results of the traditional model and the new model shows that the new model removes redundant association rules and that the rules it mines are more interesting than those mined with the traditional model.
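For readers unfamiliar with the building blocks named in the abstract, the following is a minimal sketch of the textbook Eclat idea (vertical transaction-id sets intersected depth-first) together with the standard definitions of confidence, lift, all-confidence, and odds ratio. It is not the thesis's IEclat or MR_IEclat algorithm, and it does not implement "correlation interest", whose definition the abstract does not give. The function names `eclat` and `rule_measures` and the sample transactions are hypothetical, chosen only for illustration.

```python
def eclat(transactions, min_support):
    """Textbook Eclat sketch: return {frozenset(itemset): support count}."""
    # Vertical layout: map each item to the set of transaction ids containing it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # Each candidate is (item, tidset of prefix plus item), already frequent.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Intersect tidsets with the remaining items to grow the itemset.
            longer = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    longer.append((other, shared))
            if longer:
                extend(itemset, longer)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent


def rule_measures(frequent, n, antecedent, consequent):
    """Standard measures for the rule antecedent -> consequent.

    Assumes the antecedent, the consequent, and their union are all keys of
    `frequent` (true whenever the union itself is frequent).
    """
    a, b = frozenset(antecedent), frozenset(consequent)
    c_a, c_b, c_ab = frequent[a], frequent[b], frequent[a | b]
    # 2x2 contingency counts for the odds ratio.
    n10, n01 = c_a - c_ab, c_b - c_ab
    n00 = n - c_a - c_b + c_ab
    denom = n10 * n01
    return {
        "support": c_ab / n,
        "confidence": c_ab / c_a,
        "lift": (c_ab * n) / (c_a * c_b),
        "all_confidence": c_ab / max(c_a, c_b),
        "odds_ratio": (c_ab * n00) / denom if denom else float("inf"),
    }


# Hypothetical sample data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
frequent = eclat(transactions, min_support=3)
measures = rule_measures(frequent, len(transactions), {"bread"}, {"milk"})
```

On this sample the rule bread -> milk has confidence 0.75 but lift below 1, which illustrates the kind of weakness in the "support-confidence" model that motivates additional interest measures.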
Keywords/Search Tags: Association Rules, Frequent Itemset Mining Algorithm, Interest Measure, Hadoop Platform