Research On Mining Frequent Itemsets Algorithm Based On Bittable

Posted on:2013-02-05

Degree:Master

Type:Thesis

Country:China

Candidate:S S Feng

Full Text:PDF

GTID:2218330362962841

Subject:Computer application technology

Abstract/Summary:

PDF Full Text Request

Frequent Itemsets Mining (FIM) is one of the most crucial and essential problems inthe field of data mining, there have already been many researches for it at present.However, three main bottlenecks exist when frequent itemsets are mined from large densedatabase. First, the speed of processing data is not high. Second, large number of space isoccupied in the mining process. Third, the frequent itemsets mined is not overall and someusers interesting itemsets can not be mined efficiently. To solve the above problems, threealgorithms based on BitTable are proposed to mine users interesting frequent itemsets,respectively. The research results can be widely used in the market analysis, customer'spurchasing behavior forecast and stock analysis and so on.Firstly, in order to improve the speed of processing data, this paper proposes a newalgorithm based on BitTable, BTFIM, to mine frequent itemsets. The transaction databaseis compressed into BitTable, and the support of every single item is counted. Then thecandidate itemset tree, CITree, is built level by level. The linked-frequent-itemset isdetermined, meanwhile, the frequency of the non-linked-frequent-itemset is tested todelete the infrequent itemsets, while each level is constructed. Finally, the CITree istraversed in breath-first search stragegy to record all frequent itemsets.Secondly, to reduce the memory usage in the mining process, a novel algorithm basedon Index Array, Index-FCI, is proposed in this paper for mining frequent closed itemsets.The transaction database is scanned once and compressed into BitTable. Then the IndexArray corresponding to the database is constructed through the BitTable. Afterwards, thegreat frequent itemset, GFI, and frequent itemsets are found from the Index Array. Finally,all of the frequent closed itemsets can be found by using the hash checking strategy.Lastly, to mine user interested itemsets, an algorithm based on BitTable and IndexArray, Index-BTFITI, is proposed to mine frequent inter-transaction itemsets. First, thedatabase is compressed into BitTable. Second, the representative item and thecorresponding subsume index in Index Array can be calculated using the horizontalBitTable. Final, the element of Index Array is classified to find all frequent inter-transaction itemsets.Our algorithms are implemented with C++language. Experimental results show thatthe algorithms proposed in this paper are more effective in resolving their respectiveproblems than the current ones, and the desired goals are realized.

Keywords/Search Tags:

frequent itemset, frequent closed itemset, frequent inter-transaction itemset, BitTable, Index Array

PDF Full Text Request

Related items

1	Research And Realization Of Parallel Algorithm For Mining Frequent Closed Itemsets
2	The Research And Application Of Association Rules Mining Algorithms Based On Directed Itemset Graph
3	An Algorithm For Mining Frequent Itemsets From Data Streams
4	The Research On IDS Based On Mining Max Frequent Itemset Using Big Step Pruning Strategy
5	Research On Weighted Frequent Itemset Mining In Uncertain Databases
6	Study On Associative Classification Based On Closed Frequent Itemsets
7	Research And Application Of Frequent Itemset Mining Algorithm
8	Research On Parallel Frequent Itemset Mining Algorithm Based On MapReduce
9	Research On Frequent Itemsets Mining Algorithm In Data Stream
10	Research On Mining Frequent Itemsets Over Data Stream