
Study On Association Rule Mining Algorithm Based On The Pattern Of Polar Clique

Posted on: 2015-11-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Chen
Full Text: PDF
GTID: 2298330431483941
Subject: Computer application technology
Abstract/Summary:
Association rule mining is an important research topic in data mining. It uses support and confidence thresholds to discard infrequent itemsets and derive the target association rules. On datasets with a severely skewed distribution, traditional frequent itemset mining algorithms cannot be applied efficiently to some important tasks, and the support threshold is hard to determine. If it is too high, rules with high confidence are missed; if it is too low, many redundant rules with low confidence are generated. Improving mining efficiency and obtaining more accurate association rules is therefore an important research problem in data mining.
A maximum clique is the largest fully connected subgraph of an undirected graph G. The approach builds maximum cliques from itemsets of strongly correlated items, which may yield maximal frequent itemsets, and then solves for the maximal frequent itemsets of each maximum clique. In this way all reliable association rules can be generated quickly and time efficiency improves.
This thesis studies in depth the operating principles and mechanisms of classic algorithms such as Apriori and FP-growth, together with maximal clique algorithms and related theory, and summarizes their advantages and disadvantages. To address existing problems in association rule mining, the main research work is as follows:
1. Because it is difficult to set an appropriate support threshold when mining datasets whose item supports are unevenly distributed, MCWCAR (Maximum Clique Weighted Credible Association Rule), a weighted credible association rule algorithm based on maximum cliques, is proposed.
After defining the basic concepts of weighted credible association rules and weighted credible 2-itemsets, the algorithm uses a 2-item adjacency matrix to generate the weighted credible 2-itemsets and the corresponding sparse graph, and then finds all connected components of that graph. For each connected component, the first k-1 vertices form a (k-1)-maximal clique, and the k-th vertex is added to it to obtain a weighted credible k-itemset; this continues until the last vertex has been added to the maximal clique, which completes the mining of weighted credible association rules from maximal cliques. The algorithm solves the problem of choosing an appropriate support threshold for unevenly distributed datasets, and it avoids repeated database scans and frequent pattern-tree construction, which reduces the cost of computing itemset supports. Experimental results show that the proposed algorithm outperforms traditional association rule mining algorithms in running time and accuracy.
2. To address the low efficiency and incompleteness of long-pattern mining in current data mining, the Clique Search with Dynamic Update of Graph based Top-N Maximum Pattern Mining Algorithm (CSDGMPA) is proposed. Two pruning rules are put forward, and the Top-N maximum cliques are identified accurately in two stages: pruning invalid cliques and extending cliques. A depth-first branch-and-bound search is then used to find the maximum patterns with the Top-N lengths. The proposed algorithm finds target patterns that appear as clique structures in a graph built from k-item patterns. As the graph is dynamically sparsified, cliques are found more efficiently, the search process is optimized, and pruning accuracy improves.
Finally, simulation experiments show that CSDGMPA has advantages in time and cost over the traditional algorithms MAXIA and LCM.
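The clique-based idea in point 1 can be illustrated with a minimal Python sketch. It is not the MCWCAR algorithm: the abstract does not give the definition of "weighted credible", so the pairwise score below (pair support divided by the larger of the two item supports, often called all-confidence) is an assumed stand-in, and the names `pair_graph`, `maximal_cliques`, and `min_cred` are hypothetical. The sketch links item pairs whose score passes the threshold and then lists the maximal cliques of the resulting graph with the standard Bron-Kerbosch procedure.

```python
from itertools import combinations


def pair_graph(transactions, min_cred):
    """Link two items when their pairwise score is at least min_cred.

    Assumption: the score is supp(a, b) / max(supp(a), supp(b)),
    a stand-in for the thesis's weighted credibility measure.
    """
    count = {}
    pair_count = {}
    for t in transactions:
        items = sorted(set(t))
        for a in items:
            count[a] = count.get(a, 0) + 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1
    graph = {a: set() for a in count}
    for (a, b), n in pair_count.items():
        if n / max(count[a], count[b]) >= min_cred:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting)."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return sorted(found)


transactions = [
    ["a", "b", "c"], ["a", "b", "c"], ["a", "b"],
    ["d", "e"], ["d", "e"], ["c", "d"],
]
print(maximal_cliques(pair_graph(transactions, 0.6)))
# → [['a', 'b', 'c'], ['d', 'e']]
```

Each clique is a candidate itemset whose items are pairwise strongly related, found from a single pass over the transactions and without a global support threshold. Items with no qualifying partner appear as single-vertex cliques.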
Keywords/Search Tags:Weighted Credible Association Rule, Maximum Clique, pattern graph, Top-N maximum pattern