
Research On Mining Algorithms Of Uncertain Data

Posted on: 2016-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Li
Full Text: PDF
GTID: 2308330476953456
Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid development of information technology, vast amounts of data and records are produced in areas such as finance, logistics and astronomy. In most cases, the data contains errors or is only partially complete. Because of these uncertainties, traditional data mining methods are no longer suitable for mining uncertain data. This thesis studies frequent pattern mining and max-pattern mining on uncertain data and proposes a new algorithm for each. The research enriches the available means of data processing and improves mining efficiency.

Because frequent pattern mining is a core problem in data mining, this thesis presents ProEclat, a frequent pattern mining algorithm for uncertain data that is based on a vertical structure. ProEclat uses the vertical format of the dataset and thereby avoids multiple scans. In addition, to improve computational efficiency, a two-stage model is applied to decide quickly whether a candidate itemset is frequent. Experimental results show that ProEclat is scalable and performs better than similar algorithms.

Max-pattern mining, an important branch of frequent itemset mining, is also studied. We propose U-GenMax, a depth-first algorithm that mines the maximal itemsets of uncertain data efficiently. U-GenMax uses pruning optimizations such as a multi-step fallback mechanism, an item ordering policy and local projection. Experiments show that U-GenMax is efficient and is especially suitable for all kinds of sparse datasets and for dense datasets with high support.
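As a rough illustration of the ideas named above (a vertical data format, depth-first search, and maximal itemsets), the following minimal Python sketch mines a toy uncertain dataset. It is not ProEclat or U-GenMax: the abstract does not give their definitions, so the sketch assumes the common expected-support model with independent item probabilities, and it omits the two-stage frequency test and the pruning techniques mentioned in the abstract. All names (`to_vertical`, `mine_frequent`, `maximal_only`, `min_esup`) and the sample data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# Assumption: a transaction maps each item to its existence probability.
Transaction = Dict[str, float]
# Vertical format: transaction id -> probability that the itemset occurs there.
TidList = Dict[int, float]


def to_vertical(transactions: List[Transaction]) -> Dict[str, TidList]:
    """Build one tid-list per item in a single pass over the dataset."""
    vertical: Dict[str, TidList] = {}
    for tid, txn in enumerate(transactions):
        for item, prob in txn.items():
            vertical.setdefault(item, {})[tid] = prob
    return vertical


def mine_frequent(
    transactions: List[Transaction], min_esup: float
) -> Dict[FrozenSet[str], float]:
    """Depth-first, Eclat-style search using expected support (an assumption)."""
    result: Dict[FrozenSet[str], float] = {}

    def extend(
        prefix: FrozenSet[str],
        prefix_tids: Optional[TidList],
        candidates: List[Tuple[str, TidList]],
    ) -> None:
        for i, (item, tids) in enumerate(candidates):
            if prefix_tids is None:
                joined = tids
            else:
                # Intersect tid-lists; multiply probabilities (independence assumed).
                joined = {
                    t: p * tids[t] for t, p in prefix_tids.items() if t in tids
                }
            esup = sum(joined.values())
            # Expected support never grows when items are added, so an
            # infrequent itemset cannot have a frequent superset.
            if esup >= min_esup:
                itemset = prefix | {item}
                result[itemset] = esup
                extend(itemset, joined, candidates[i + 1:])

    extend(frozenset(), None, sorted(to_vertical(transactions).items()))
    return result


def maximal_only(frequent: Dict[FrozenSet[str], float]) -> List[FrozenSet[str]]:
    """Keep only frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy data, for illustration only.
toy = [
    {"a": 0.9, "b": 0.8, "c": 0.5},
    {"a": 0.6, "b": 0.7},
    {"a": 0.5, "c": 0.4},
]
frequent = mine_frequent(toy, min_esup=1.0)
maximal = maximal_only(frequent)
```

The dataset is read once to build the tid-lists, and every later support computation works on intersections of those lists, which is the sense in which a vertical format avoids repeated scans. The maximal filter here is a naive post-processing step; a dedicated max-pattern algorithm would instead prune during the search.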
Keywords/Search Tags: data mining, uncertain data, max-pattern, frequent pattern