Research On Mining Maximal Frequent Itemsets

Posted on:2017-03-07

Degree:Master

Type:Thesis

Country:China

Candidate:X J Zhang

Full Text:PDF

GTID:2308330503979767

Subject:Computer Science and Technology

Abstract/Summary:

PDF Full Text Request

With the advent of big data, data mining is more and more important for it can turn data into valuable information. Discovering association rules is the most important task of data mining. And mining frequent itemsets is the main step to find association rules. However, not only the large number of frequent itemsets, there exists the problem of information redundancy. Therefore, the maximum frequent itemsets come into being. The maximal frequent itemsets could compress the frequent itemsets and ease the pressure on data storage. In addition, maximal frequent itemsets play an important role in practical application, such as Web mining and DNA analysis. How to improve the efficiency of mining maximal frequent itemsets a significant research direction.This paper starts from the background and significance of mining maximal frequent itemsets. Through the domestic and foreign research, it is obvious found that algorithms of mining maximal frequent itemsets are on the basis of Apriori and FP-Tree. But Apriori-based algorithms generate many useless candidate itemsets during the process of mining frequent itemsets. In order to compute the support of itemsets, it is a waste of time to scan the database multiple times. And with the increase of width and depth, building FP-Tree will consume a great deal of time and space. Based these issues, we study the relevant problems in the process of mining maximal frequent itemsets. Review the search strategy and pruning strategy which will be used in the course of mining maximal frequent itemsets. Study four classic algorithms for mining maximal frequent itemsets to provide theoretical basis.GBMFI(Go-Back Maximal Frequent Itemsets)——an algorithm to mine maximal frequent itemsets is proposed. It uses the enumeration tree as search space to overcome bottom-up and top-down search strategy by depth-first search strategy. The database is presented in vertical format to reduce the computational complexity. And GBMFI uses infrequent subsets, frequent supersets and the support of parent and children to prune the tree. Compared with DepthProject, the effectiveness of the GBMFI is demonstrated by experiments on 4 data sets. And we make a research on the relationship between the minimum support and the number of maximal frequent itemsets. The results show that the algorithm works well. In addition, a parallel algorithm AMI for mining maximal frequent itemsets is proposed. Despite the poor results, AMI indicates a direction for further research.

Keywords/Search Tags:

Data Mining, Maximal Frequent Itemsets, Backtrack Search, Pruning

PDF Full Text Request

Related items

1	Research On Mining Algorithm Of Maximal Frequent Itemsets
2	Study On Frequent Pattern Mining Algorithms And Pruning Strategies
3	Research On Algorithms For Mining Maximal Frequent Itemsets
4	An Algorithm Of Maximal Frequent Itemsets Mining Based On Dynamic Reordering
5	Research On Algorithm For Mining Maximal Frequent Itemsets Over Data Streams
6	The Research And Implementation Of Mining Frequent Itemsets Algorithm Over Streaming Data
7	Research On Algorithm For Mining Frequent Itemsets Of Uncertain Data
8	Research On Key Algorithms For Mining Frequent Patterns In Data Streams And Their Application In Simulation System
9	Research On Key Algorithms For Mining Frequent Patterns In Data Streams And Their Application
10	FP-Tree Based Mining Frequent Itemsets Over Data Streams