
Research On Mining Optimization Based On Extraction Rules In Uncertain Data

Posted on: 2017-05-28
Degree: Master
Type: Thesis
Country: China
Candidate: J Liang
Full Text: PDF
GTID: 2278330488950188
Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid development of the information age and of computer technology, both the volume and the diversity of stored data have grown continuously. To gain useful knowledge from these data, data mining techniques are used to uncover the links among them, and association rule mining is one of the important research directions in data mining. In particular, with the sharp increase in Internet applications, uncertain data, whose items follow the probability distributions of discrete random variables, continue to emerge and have become a major part of databases, while the scale of data storage has grown to an unprecedented degree. As a result, the original association rule mining methods can no longer adapt to the current data model.
Most traditional association rule algorithms need to scan the whole transaction database repeatedly and spend a lot of time generating candidate itemsets. Existing algorithms also mostly focus on extracting conjunctive association rules of the form X → Y1Y2...Yk-1. For uncertain data, information about low-probability events is easily lost in many realistic cases, so users cannot obtain more comprehensive knowledge of the connections between things. Conjunctive rule mining also leaves room for optimization in time and space performance and in efficiency.
To address the shortcomings of the traditional algorithms and to exploit the random-variable characteristics of uncertain data, this paper presents an algorithm for mining disjunctive rules from uncertain databases. The algorithm scans the original database only once, uses the concept of fuzzy sets to select the frequent 2-itemsets, extracts all disjunctive normal forms from the frequent 2-itemsets, compares them against the minimum support and minimum confidence, and finally extracts all useful disjunctive rules.
Because they can be evaluated on the same metrics, running time and confidence, this paper compares the proposed algorithm with two conjunctive rule mining algorithms for uncertain databases. The proposed DRUD algorithm is compared with the U-Apriori and UFP-growth algorithms on the CHESS, MUSHROOM, and T20I6D300K datasets, with the experiments carried out in Java. The experiments mainly compare two indicators: the running time of the algorithms and the confidence of the rules they generate. The simulation results show that the rules generated by DRUD have higher confidence and that the running time of DRUD is also improved.
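To make the support and confidence measures mentioned in the abstract more concrete, here is a minimal Python sketch of how a disjunctive rule could be scored on uncertain data. It is not the DRUD algorithm itself: the fuzzy-set selection of frequent 2-itemsets is not reproduced, and the sketch assumes the common expected-support model in which each item in a transaction carries an independent existential probability. The sample transactions and all function names are hypothetical.

```python
from math import prod

# Hypothetical uncertain database: each transaction maps an item to its
# existential probability (illustrative values, not data from the thesis).
TRANSACTIONS = [
    {"a": 0.9, "b": 0.5, "c": 0.2},
    {"a": 0.8, "c": 0.5},
    {"b": 0.4, "c": 1.0},
]


def conj_prob(transaction, items):
    """Probability that ALL items occur (assumes independent items)."""
    return prod(transaction.get(i, 0.0) for i in items)


def disj_prob(transaction, items):
    """Probability that AT LEAST ONE item occurs (assumes independence)."""
    return 1.0 - prod(1.0 - transaction.get(i, 0.0) for i in items)


def expected_support(transactions, items):
    """Expected support of a conjunctive itemset, summed in a single pass."""
    return sum(conj_prob(t, items) for t in transactions)


def conjunctive_confidence(transactions, antecedent, consequents):
    """Confidence of antecedent -> (c1 AND c2 AND ...)."""
    base = expected_support(transactions, antecedent)
    joint = expected_support(transactions, list(antecedent) + list(consequents))
    return joint / base if base > 0 else 0.0


def disjunctive_confidence(transactions, antecedent, consequents):
    """Confidence of antecedent -> (c1 OR c2 OR ...).

    Assumes the antecedent and consequent items are disjoint.
    """
    base = expected_support(transactions, antecedent)
    joint = sum(
        conj_prob(t, antecedent) * disj_prob(t, consequents)
        for t in transactions
    )
    return joint / base if base > 0 else 0.0


if __name__ == "__main__":
    print(conjunctive_confidence(TRANSACTIONS, ["a"], ["b", "c"]))
    print(disjunctive_confidence(TRANSACTIONS, ["a"], ["b", "c"]))
```

Under these assumptions the disjunctive confidence of a rule is never lower than the conjunctive confidence for the same antecedent and consequents, because the event "at least one consequent occurs" contains the event "all consequents occur". This is consistent with, but does not by itself establish, the confidence improvement reported for DRUD.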
Keywords/Search Tags:Uncertain data, Disjunctive rule, Support, Data mining