Research And Optimization Of Association Rules Based On Can Tree

Posted on: 2022-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhao
GTID: 2518306614967369
Subject: Agriculture Economy
Abstract/Summary:
With the development of computer and network technology, society has entered the era of big data. Finding potentially valuable patterns in massive data sets has important research significance and application value, and can provide guidance for the development of many industries.

In the study of association rules, the Can tree has been found to be suitable for incremental pattern mining. However, because the Can tree contains all data items, the tree is very large and mining takes a long time, so the Can tree is rarely used in practice. In addition, under a framework with a single support and confidence threshold, a fixed screening threshold gives unsatisfactory mining results. A threshold set too low causes a combinatorial explosion of frequent itemsets, burying the results in useless information; a threshold set too high yields too little information.

To address these problems, this thesis takes the sales data of a supermarket as its research object. Since the most important step of association rule mining in practice is finding frequent itemsets, the thesis focuses on mining frequent itemsets efficiently with an association rule algorithm based on the Can tree. The main research content is divided into the following three parts:

1. To improve the Can tree, a frequent pattern mining algorithm based on Can tree traversal optimization is proposed. Each path is processed through a shared stack: during the upward traversal along the projection, the sub-conditional pattern bases of all items on the path are obtained and saved separately. This improves both the tree structure and the utilization of the stack. Although the number of push and pop operations is not reduced, the termination conditions are made explicit and the time spent searching for the remaining untraversed subtrees is reduced. Each path needs to be traversed only once, which greatly reduces the running time.

2. Multiple minimum supports are introduced. A minimum support is calculated separately for each item, which balances the differences in item frequencies and helps ensure the validity of the mining results. The design of the multiple minimum supports preserves anti-monotonicity and incurs no additional overhead.

3. To mine newly added items quickly, the Can tree is designed to be incremental. Construction time and mining time are compared to verify the algorithm.
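As a rough illustration of the ideas in the abstract, the following minimal Python sketch builds a Can tree, inserts a transaction incrementally, and collects the conditional pattern base of one item. It is not the thesis's optimized traversal: the shared-stack, one-pass-per-path scheme is not reproduced, and the pattern base is gathered with a plain upward walk per node. The names `Node`, `CanTree`, `insert`, `conditional_pattern_base`, and `support` are hypothetical, and lexicographic order is assumed as the canonical item order.

```python
from collections import defaultdict


class Node:
    """One tree node: an item, its count, and links to parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


class CanTree:
    """Prefix tree whose items follow a fixed canonical order (assumed lexicographic)."""

    def __init__(self):
        self.root = Node(None, None)
        self.header = defaultdict(list)  # item -> all nodes carrying that item

    def insert(self, transaction):
        # The canonical order does not depend on item frequency, so a new
        # transaction is added without reordering or rebuilding the tree.
        node = self.root
        for item in sorted(set(transaction)):
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                self.header[item].append(child)
            child.count += 1
            node = child

    def conditional_pattern_base(self, item):
        # For every node holding `item`, walk up to the root and record the
        # prefix path together with that node's count.
        base = []
        for node in self.header[item]:
            prefix = []
            parent = node.parent
            while parent.item is not None:
                prefix.append(parent.item)
                parent = parent.parent
            if prefix:
                base.append((prefix[::-1], node.count))
        return base

    def support(self, item):
        # Absolute support of a single item: sum of counts over its nodes.
        return sum(n.count for n in self.header[item])


tree = CanTree()
for t in [["a", "b", "c"], ["a", "b"], ["b", "c"]]:
    tree.insert(t)

# Incremental update: a later transaction, including the new item "d",
# is simply inserted into the existing tree.
tree.insert(["a", "c", "d"])

cpb_c = tree.conditional_pattern_base("c")
```

Because the item order is fixed in advance, the tree keeps every item, including infrequent ones; this is the source of the large tree size that the abstract describes, and support thresholds are applied only at mining time.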
Keywords/Search Tags:data mining, association rules, Can tree, conditional pattern base, multiple minimum support