Study Of Fast Algorithms For Frequent Itemset Mining From Uncertain Data

Posted on:2018-12-06

Degree:Master

Type:Thesis

Country:China

Candidate:Z Wen

Full Text:PDF

GTID:2348330533966150

Subject:Mathematics

Abstract/Summary:

Over the past decades, numerous classical algorithms have been proposed for mining frequent itemsets from precise data. In recent years, owing to the wide application of uncertain data, data mining techniques for uncertain databases have attracted much attention. Traditional algorithms for mining frequent itemsets from uncertain data are obtained by adapting algorithms designed for precise data. One of them, the tubeS-growth algorithm, sometimes achieves very good compression. However, when mining massive uncertain data it has the following problems: ① when the existence probabilities of items are spread over a broad range, the algorithm produces many false frequent itemsets; ② when the algorithm is used to mine a sparse uncertain dataset with a very large number of items, or a dense uncertain dataset whose transactions are long on average, its running time is long.

To solve the two problems above, this paper proposes a new mining algorithm based on the idea of divide and conquer, named PtubeS-growth. The algorithm takes advantage of database partitioning when the database does not fit in main memory or is massive. First, the database is divided into several sub-databases. Second, the algorithm mines the locally potential frequent itemsets of each partition from a tree structure constructed for that partition, and merges all local results into the globally potential frequent itemsets. Finally, it scans the database to check out all false frequent itemsets, thereby guaranteeing the accuracy of the mining results.

To guarantee the soundness of the improved algorithm, this paper states and proves, in the course of the algorithm design, related theorems that address the following questions: ① how to set the minsup of each partition reasonably after the database is partitioned; ② how to merge the locally potential itemsets of all partitions into globally potential itemsets. To ensure the high efficiency of the improved algorithm, this paper uses optimization methods such as pruning and reducing the amount of computation to address problems, such as long running time, that are caused by mining after database partitioning and by merging locally potential frequent itemsets. Experiments show that the proposed PtubeS-growth algorithm is highly efficient on both sparse and dense uncertain data, and that it solves the problems the tubeS-growth algorithm exhibits when mining such data.
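
The partition, local mining, merge and verification steps described above can be illustrated with a minimal Python sketch. It is not the PtubeS-growth implementation: the tree-based local miner is replaced by a brute-force enumeration, and the per-partition threshold uses the same minsup ratio as the whole database, which is an assumption made here and not the theorem proved in the thesis. The assumption is safe for expected support, because expected support is additive over transactions, so a globally frequent itemset must reach the ratio in at least one partition. All function names, the `max_len` cap and the toy database are hypothetical.

```python
from itertools import combinations
from math import prod


def expected_support(itemset, transactions):
    """Sum, over the transactions containing every item of the itemset,
    of the product of those items' existence probabilities."""
    return sum(
        prod(t[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )


def local_candidates(partition, minsup_ratio, max_len):
    """Brute-force stand-in for the tree-based local mining step."""
    threshold = minsup_ratio * len(partition)  # assumed local minsup
    items = sorted({i for t in partition for i in t})
    found = set()
    for k in range(1, max_len + 1):
        for itemset in combinations(items, k):
            if expected_support(itemset, partition) >= threshold:
                found.add(itemset)
    return found


def partitioned_mining(db, minsup_ratio, n_parts, max_len=3):
    # Step 1: split the database into contiguous sub-databases.
    size = -(-len(db) // n_parts)  # ceiling division
    parts = [db[i:i + size] for i in range(0, len(db), size)]
    # Step 2: mine each partition and merge the local results by union.
    candidates = set().union(
        *(local_candidates(p, minsup_ratio, max_len) for p in parts)
    )
    # Step 3: scan the whole database to drop false frequent itemsets.
    threshold = minsup_ratio * len(db)
    result = {}
    for c in candidates:
        support = expected_support(c, db)
        if support >= threshold:
            result[c] = support
    return result


# Toy uncertain database: each transaction maps item -> existence probability.
db = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.7, "c": 0.4},
    {"a": 0.6, "b": 0.9, "c": 0.2},
    {"b": 0.5, "c": 0.9},
]
frequent = partitioned_mining(db, minsup_ratio=0.5, n_parts=2)
```

In this toy run, item `c` is locally frequent in the second partition but falls below the global threshold, so the final scan removes it; only `a` and `b` remain in `frequent`.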

Keywords/Search Tags:

Uncertain data, tubeS-growth algorithm, Frequent itemset, Tree-based structure, Expected support

Related items

1	Research On The Algorithm Of Mining Frequent Itemsets From Uncertain Data Based On The Tree
2	Mining Frequent Subgraph Based On Pre-clipping In Uncertain Graph Databases
3	Research On Uncertain Frequent Graph Data Mining
4	New algorithms for frequent sequential pattern and itemset data mining in certain and uncertain databases
5	Research On Weighted Frequent Itemset Mining In Uncertain Databases
6	An Efficient Algorithm Of Mining Frequent Subgraph Patterns In Uncertain Graph Database
7	Mining Algorithm Of Frequent Items Based On Item Adjacency List And Transaction Tree
8	Research And Application Of Frequent Itemset Mining Algorithm
9	Research On Mining Frequent Itemsets Algorithm Based On Bittable
10	Research On Algorithm For Mining Frequent Itemsets Of Uncertain Data