Algorithm For Association Rule Mining Based On Matrix And Graph

Posted on: 2010-08-23
Degree: Master
Type: Thesis
Country: China
Candidate: X X Wang
Full Text: PDF
GTID: 2178360278972410
Subject: Computer software and theory
Abstract/Summary:
Knowledge discovery in databases (KDD) uses computers to automatically extract useful knowledge from massive amounts of information. It is an effective new way of exploiting information and has become one of the hottest research areas in the database field. Data mining is the focus of KDD research. Society has entered a networked information age in which many kinds of data are generated in large quantities, and a great deal of important information lies hidden behind this data. How to find regularities and useful information in such data is therefore attracting more and more attention. The urgent need for a new information analysis technology that can meet these demands led to data mining. Data mining is the process of extracting implicit, previously unknown, but potentially useful information and knowledge from large volumes of incomplete, noisy, ambiguous, and random data from practical applications.

At present, the main data mining techniques include association rules, clustering, rough sets, neural networks, and genetic algorithms. Association rules reflect the interdependence and correlation between one item or event and others, and they are very widely applied. Association rule mining finds links between different commodities in a transaction database and uses them to identify customer purchasing patterns, such as the effect that buying one commodity has on buying others. Such rules can be applied to shelf layout design, inventory arrangement, and classifying customers according to their purchasing patterns.

The most classic association rule mining algorithm is the Apriori algorithm, proposed by R. Agrawal et al. in 1994 as an improvement on the AIS algorithm. Apriori uses a level-wise iterative search. Its core idea, based on frequent itemset theory, is to recursively mine from the database those association rules whose support and confidence are not less than the given minimum support and minimum confidence thresholds. The algorithm is usually divided into two steps: 1) generate the frequent itemsets based on support; 2) generate strong association rules based on confidence. First, the set of frequent 1-itemsets is found and denoted L1; L1 is used to find the set of frequent 2-itemsets L2, L2 is used to find L3, and so on, until no frequent k-itemsets Lk can be found. Finding each Lk requires one scan of the database.

However, the Apriori algorithm has inherent disadvantages: (1) the number of candidate k-itemsets generated by self-joining the frequent (k-1)-itemsets is enormous; (2) verifying the candidate k-itemsets requires scanning the entire database, which is very time-consuming. To address these problems, this thesis analyzes application examples of the Apriori algorithm and, on that basis, improves it. The basic idea of the improved algorithm is to first express the itemsets in matrix form and encode the matrix, and then to use the encoding to generate a graph of the itemsets, which makes pruning more efficient. An effective algorithm for finding frequent itemsets is the key to the problem. In this thesis, the database is represented as a matrix, and a graph is formed from the relationships between the itemsets in that matrix. This effectively reduces the number of candidate itemsets and the number of database scans, improving the efficiency of the Apriori algorithm.
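The two Apriori steps summarized above can be sketched in Python as follows. This is a minimal illustration of the generic level-wise procedure, not the thesis's improved algorithm: the abstract does not specify its matrix encoding or graph construction, so the boolean transaction-item matrix below is only an assumed, simple way to count support without re-reading the raw records. The function names `apriori` and `strong_rules` are hypothetical, and `min_support` is assumed to be an absolute transaction count.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    min_support is an absolute count (an assumption of this sketch).
    """
    items = sorted({item for t in transactions for item in t})
    col = {item: idx for idx, item in enumerate(items)}
    # Boolean matrix: one row per transaction, one column per item.
    matrix = [[item in t for item in items] for t in transactions]

    def support(itemset):
        # Count the rows in which every column of the itemset is True.
        cols = [col[item] for item in itemset]
        return sum(all(row[c] for c in cols) for row in matrix)

    frequent = {}
    level = [frozenset([item]) for item in items]  # candidate 1-itemsets
    k = 1
    while level:
        # Keep only the candidates that reach the minimum support: this is L_k.
        current = {}
        for candidate in level:
            count = support(candidate)
            if count >= min_support:
                current[candidate] = count
        frequent.update(current)

        # Self-join L_k to build candidates of size k + 1, then prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in current for s in combinations(union, k - 1)
            ):
                candidates.add(union)
        level = list(candidates)
    return frequent


def strong_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for each strong rule."""
    rules = []
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in combinations(itemset, size):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so its support count is already in the dictionary.
                confidence = count / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules
```

Calling `apriori(transactions, min_support)` on a list of item sets gives the frequent itemsets with their counts, and passing that result to `strong_rules(frequent, min_confidence)` gives the rules that meet the confidence threshold. The join-and-prune step is where the candidate explosion described in disadvantage (1) arises.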
Keywords/Search Tags: Knowledge discovery in databases, data mining, Apriori algorithm, matrix representation, graph