
Research And Application Of Apriori Algorithm Based On The Compressed Matrix

Posted on: 2018-06-02
Degree: Master
Type: Thesis
Country: China
Candidate: Q Z Zhang
GTID: 2348330542491449
Subject: Systems Science
Abstract/Summary:
In recent years, with the rapid development of information technology, a large amount of data has been produced, and its accumulation is growing exponentially. A great deal of knowledge and information is hidden in this data. In this context, data mining technology emerged and has been widely applied in all walks of life, providing strong support for decision makers. Association analysis is one of the most important research areas of data mining and also one of the most widely applied, so it has high research significance. It is therefore worth studying how to improve the operational efficiency and effectiveness of association rule mining algorithms. The Apriori algorithm is one of the most classic and important association rule mining algorithms. Its biggest problem, however, is low efficiency, especially when dealing with large data sets. For this reason, this paper proposes a weighted association rule Apriori algorithm based on k-means clustering and a compressed matrix, the K-means Clustering & Compressed Matrix (KCCM) algorithm, to improve operational efficiency.

First, the K-means algorithm is applied to divide the large data set into several blocks. A distance between itemsets is defined as the reciprocal of the frequency of two items; under this definition, items that have an associated relationship are easily assigned to the same class. Next, the data is converted to 0/1 form and mapped into a matrix. The KCCM algorithm is then applied to the data set that has been divided into blocks. Through matrix compression, frequent itemsets and strong association rules are obtained more efficiently.

Unlike the traditional "Support-Confidence" measurement framework, this paper introduces the concept of "Lift" and adopts a new "Support-Confidence + Lift" measurement framework to improve the effectiveness of the algorithm. To prevent weights from being greater than or equal to 1, the weights are normalized. Finally, the mathematical software MATLAB is used to simulate the KCCM algorithm, and the results verify its effectiveness and efficiency.
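The Boolean-matrix mapping and the "Support-Confidence + Lift" measures described above can be illustrated with a minimal Python sketch. It is not the thesis's KCCM implementation (which was simulated in MATLAB): the toy transactions, the function names, and the `compress` step are all assumptions made for illustration. In particular, `compress` shows only one plausible form of matrix compression, dropping infrequent item columns and transactions too short to hold a k-itemset; the abstract does not specify the actual compression rules, the clustering step, or the weighting scheme, so those are omitted.

```python
# Toy transactions; the item names are illustrative, not from the thesis.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer"},
    {"milk", "diaper", "beer"},
    {"bread", "milk", "diaper"},
    {"bread", "milk", "beer"},
]


def to_boolean_matrix(transactions):
    """Map transactions to a 0/1 matrix: one row per transaction, one column per item."""
    items = sorted(set().union(*transactions))
    matrix = [[1 if item in t else 0 for item in items] for t in transactions]
    return items, matrix


def support(itemset, items, matrix, n_total):
    """Fraction of all n_total transactions that contain every item in itemset."""
    cols = [items.index(i) for i in itemset]
    hits = sum(1 for row in matrix if all(row[c] for c in cols))
    return hits / n_total


def confidence(antecedent, consequent, items, matrix, n_total):
    """support(A and B) / support(A)."""
    both = support(antecedent | consequent, items, matrix, n_total)
    return both / support(antecedent, items, matrix, n_total)


def lift(antecedent, consequent, items, matrix, n_total):
    """confidence(A -> B) / support(B); a value below 1 suggests negative correlation."""
    conf = confidence(antecedent, consequent, items, matrix, n_total)
    return conf / support(consequent, items, matrix, n_total)


def compress(matrix, items, min_count, k):
    """Assumed compression step (not confirmed by the source): drop item columns
    with fewer than min_count ones, then drop rows with fewer than k ones,
    since such rows cannot contain any k-itemset."""
    keep = [c for c in range(len(items))
            if sum(row[c] for row in matrix) >= min_count]
    reduced = [[row[c] for c in keep] for row in matrix]
    reduced = [row for row in reduced if sum(row) >= k]
    return reduced, [items[c] for c in keep]


items, matrix = to_boolean_matrix(transactions)
n = len(transactions)

s_both = support({"bread", "milk"}, items, matrix, n)
conf = confidence({"bread"}, {"milk"}, items, matrix, n)
lft = lift({"bread"}, {"milk"}, items, matrix, n)
print(s_both, conf, lft)

# Compress for 2-itemset mining with a minimum count of 4 transactions.
reduced, kept_items = compress(matrix, items, min_count=4, k=2)
print(kept_items, len(reduced))
```

In this toy data the rule bread -> milk has fairly high confidence, yet its lift is below 1, which is the kind of misleading rule that adding Lift to the Support-Confidence framework is meant to filter out. Support is always divided by the original transaction count, so it is unchanged when computed on the compressed matrix.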
Keywords/Search Tags:Data Mining, K-means Algorithm, Association Analysis, Apriori Algorithm, Boolean Matrix