
Research And Application Of Association Rules Algorithm Based On Undirected Graph

Posted on: 2009-12-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Wang
GTID: 2178360272463295
Subject: Computer application technology
Abstract/Summary:
In recent years, data mining techniques have attracted wide attention from the information industry, a natural consequence of the tension between rapidly growing data volumes and an increasing shortage of usable information. In-depth research on data mining techniques is therefore an objective requirement of global informatization.

Data mining, the key step of Knowledge Discovery in Databases (KDD), is the process of discovering implicit, nontrivial, previously unknown, and potentially useful information from databases. Association rule mining is an important field of data mining and has important applications in databases. Successful applications of association rule mining have been demonstrated in marketing, business, medical analysis, product control, engineering design, and scientific exploration. The main purpose of mining association rules is to find the connections or correlations hidden among the enormous number of items in a database; these are also called knowledge patterns.

This thesis proposes a new algorithm based on an undirected graph for finding the maximal frequent itemsets in a transaction database, and studies a clustering method based on the generated association rules. The innovations of this thesis are as follows:

(1) A new association rules algorithm based on an undirected graph is proposed for discovering maximal frequent itemsets, so as to find locally strongly correlated frequent itemsets in a transaction database. First, the horizontal transaction database is transformed into a vertical one and saved to an adjacency matrix in which the weight of each edge denotes the support of the corresponding 2-itemset. Second, the whole undirected complete graph is divided into several complete subgraphs based on the edge weights. Finally, top-down and bottom-up strategies are used to find frequent itemsets, and the efficiency of the two strategies is compared under different minimum support (minsupport) thresholds.
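
As a rough illustration of the first step in (1), the following Python sketch converts a small horizontal transaction list into vertical form (item to transaction-id set) and fills a symmetric adjacency matrix with 2-itemset support counts. The function names, the toy transactions, and the use of absolute support counts are assumptions made for this example; the abstract does not specify the thesis's data structures or how it partitions the graph into complete subgraphs.

```python
from itertools import combinations


def to_vertical(transactions):
    """Map each item to the set of transaction ids (tidset) that contain it."""
    vertical = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def pair_support_matrix(vertical):
    """Build a symmetric matrix whose entry [i][j] is the support count
    of the 2-itemset {items[i], items[j]} (tidset intersection size)."""
    items = sorted(vertical)
    n = len(items)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        support = len(vertical[items[i]] & vertical[items[j]])
        matrix[i][j] = matrix[j][i] = support
    return items, matrix


def frequent_edges(items, matrix, min_support):
    """Keep only the edges whose 2-itemset support reaches min_support."""
    return {
        (items[i], items[j]): matrix[i][j]
        for i, j in combinations(range(len(items)), 2)
        if matrix[i][j] >= min_support
    }


# Hypothetical toy data, not taken from the thesis.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
]

vertical = to_vertical(transactions)
items, matrix = pair_support_matrix(vertical)
edges = frequent_edges(items, matrix, min_support=3)
print(edges)  # {('bread', 'milk'): 3}
```

Note that a complete subgraph built from frequent edges only yields a candidate itemset: every pair being frequent does not guarantee that the whole set is frequent, so the candidate's support still has to be checked, for example by intersecting the tidsets of all its items.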
Experiments show that the new association rules algorithm is most efficient when the minsupport is low.

(2) To identify valuable information among the large number of discovered association rules, the rules must be pruned, grouped, or both. An improved distance metric between rules is presented, and the rules are clustered based on this new distance. First, the distance between items is computed. Second, the distance between rules is computed. Last, the density-based clustering algorithm DBSCAN is used to cluster the rules. The results are discussed, and outliers are discovered accurately.

Experiments on UCI databases show that the new association rules algorithm and the improved distance method are efficient and practical.
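
The three-stage pipeline in (2) can be sketched in Python as follows. The abstract does not give the thesis's improved distance metric, so this example substitutes two stand-ins that are purely assumptions: a Jaccard distance over tidsets for items, and a symmetric average nearest-item distance for rules. The DBSCAN routine is a minimal textbook version over a precomputed distance matrix, and the tidsets, rules, `eps`, and `min_pts` values are hypothetical.

```python
def item_distance(a, b, tidsets):
    """Assumed item metric: Jaccard distance between the items' tidsets."""
    union = tidsets[a] | tidsets[b]
    if not union:
        return 1.0
    return 1.0 - len(tidsets[a] & tidsets[b]) / len(union)


def rule_distance(r1, r2, tidsets):
    """Assumed rule metric: symmetric mean of nearest-item distances.
    It ignores which side of the rule an item appears on."""
    x, y = r1[0] | r1[1], r2[0] | r2[1]

    def directed(src, dst):
        return sum(min(item_distance(a, b, tidsets) for b in dst) for a in src) / len(src)

    return (directed(x, y) + directed(y, x)) / 2


def dbscan(dist, eps, min_pts):
    """Minimal DBSCAN on a precomputed distance matrix; label -1 marks noise."""
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for p in range(n):
        if labels[p] is not None:
            continue
        neighbors = [q for q in range(n) if dist[p][q] <= eps]
        if len(neighbors) < min_pts:
            labels[p] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[p] = cluster
        queue = [q for q in neighbors if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == -1:
                labels[q] = cluster  # noise reached from a core point becomes border
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbors = [r for r in range(n) if dist[q][r] <= eps]
            if len(q_neighbors) >= min_pts:  # q is a core point, so keep expanding
                queue.extend(q_neighbors)
    return labels


# Hypothetical tidsets and rules (antecedent, consequent), not from the thesis.
tidsets = {
    "bread": {0, 1, 3, 4}, "milk": {0, 1, 2, 4}, "butter": {1, 2, 3},
    "pen": {5, 6, 7}, "ink": {5, 6, 7}, "paper": {7},
    "salt": {8}, "sugar": {8},
}
rules = [
    (frozenset({"bread"}), frozenset({"milk"})),
    (frozenset({"milk"}), frozenset({"bread"})),
    (frozenset({"butter"}), frozenset({"milk"})),
    (frozenset({"pen"}), frozenset({"ink"})),
    (frozenset({"ink"}), frozenset({"pen"})),
    (frozenset({"paper"}), frozenset({"pen"})),
    (frozenset({"salt"}), frozenset({"sugar"})),
]

dist = [[rule_distance(a, b, tidsets) for b in rules] for a in rules]
labels = dbscan(dist, eps=0.3, min_pts=2)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

In this toy run the last rule shares no transactions with any other rule, so it has no neighbor within `eps` and DBSCAN reports it as noise, which is how a density-based method surfaces outlier rules without a preset number of clusters.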
Keywords/Search Tags: Data Mining, Association Rules, Frequent Itemset, Undirected Graph, Clustering