
Frequent Patterns Mining For Uncertain Data Using Correlation Metric

Posted on: 2019-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: P Gao
Full Text: PDF
GTID: 2428330545488449
Subject: Software engineering
Abstract/Summary:
Frequent itemset mining, the most basic part of the data mining process, has long been an active research area. With advances in technology and the continuing expansion of application fields, the data produced in many practical applications may be wrong or incomplete. Examples include product browsing records of online malls, sensor networks, privacy protection, doctors' diagnostic data in hospitals, and satellite image data, so uncertain data is ubiquitous in real life. Many mining methods have been proposed for certain data, but the computational and semantic differences between certain and uncertain data make them unsuitable for mining uncertain data. How to find valuable knowledge in uncertain databases has therefore attracted wide attention, and the growing demand for efficient algorithms that mine frequent itemsets from uncertain databases has made this an active research field. The main work is as follows.

First, this thesis studies the classic association rule and frequent itemset mining algorithms and summarizes frequent pattern mining algorithms for uncertain data, in order to explore the general ideas and methods of pattern mining over uncertain data. It briefly introduces the sources of uncertain data and the general model for processing it, and summarizes the two current models of frequent pattern mining over uncertain data, analyzing the advantages and disadvantages of the expected-support model and the probabilistic-support model.

Second, most frequent itemset mining algorithms for uncertain databases rely on a support threshold to prune the combinatorial search space, so the frequent itemsets they obtain are often only weakly correlated; moreover, the mining effect of the weighted correlation model is not significant. This thesis proposes a new approach, uncertain frequent pattern mining based on a correlation metric (UFPM-CM). UFPM-CM presents a new tree structure and a new metric to improve mining performance. In addition, it proposes a new uncertainty confidence metric to explore correlations in the database. In experiments on two data sets, Mushroom and Kosarak, UFPM-CM generates fewer but highly valuable patterns and outperforms existing work.
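
The two support models named in the abstract can be illustrated with a minimal sketch. It uses the textbook definitions of expected support and of the probability that an itemset is frequent, and assumes that item probabilities are independent within and across transactions. The toy database `DB` and the function names are hypothetical, and the sketch does not reproduce the UFPM-CM tree structure or its correlation and confidence metrics, which the abstract does not define.

```python
from math import prod

# Hypothetical toy uncertain database: each transaction maps an item to its
# existential probability (assumed independent across items and transactions).
DB = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 1.0},
    {"a": 1.0, "b": 0.5, "c": 0.4},
]


def itemset_prob(transaction, itemset):
    """Probability that every item of `itemset` occurs in one transaction."""
    return prod(transaction.get(item, 0.0) for item in itemset)


def expected_support(db, itemset):
    """Expected-support model: sum of per-transaction occurrence probabilities."""
    return sum(itemset_prob(t, itemset) for t in db)


def frequent_probability(db, itemset, min_count):
    """Probabilistic-support model: P(support >= min_count).

    Computed by dynamic programming over transactions.
    """
    # dist[k] = probability that exactly k of the transactions processed
    # so far contain the itemset
    dist = [1.0]
    for t in db:
        p = itemset_prob(t, itemset)
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - p)      # itemset absent from this transaction
            new[k + 1] += q * p        # itemset present in this transaction
        dist = new
    return sum(dist[min_count:])
```

For the itemset {a, b} in this toy database, `expected_support(DB, {"a", "b"})` is approximately 1.22, while `frequent_probability(DB, {"a", "b"}, 2)` is approximately 0.36. The first model summarizes support as a single expectation; the second keeps the whole support distribution, which is more informative but costlier to compute.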
Keywords/Search Tags: Data mining, Frequent patterns, Weighted pattern, Correlated pattern, Uncertain data