Research And Application Of Uncertain Data Mining Technology

Posted on: 2017-01-12  Degree: Master  Type: Thesis
Country: China  Candidate: Y Zhang  Full Text: PDF
GTID: 2348330503995756  Subject: Computer Science and Technology
Abstract/Summary:
The study of uncertain data has attracted considerable attention with the rapid development of techniques for gathering and processing such data. Measurement errors, limited equipment accuracy, and data security measures introduce inherent uncertainty into data; such data is known as uncertain data. Most data mining techniques focus on precise data and ignore the probabilities attached to uncertain data when applied to it, which leads to poor results. How to quickly find valuable information in uncertain datasets has therefore become a popular research topic. Based on an analysis of the problems and requirements of existing systems, this thesis designs a data mining system for uncertain data and focuses on two techniques: outlier detection and frequent itemset mining. Considering the characteristics of uncertain data in two-dimensional space, the thesis proposes VGUOD, a top-k distance-based outlier detection algorithm, and WCUFP-Mine, a tree-based approach for weighted frequent itemset mining from uncertain data. In existing outlier detection algorithms, parameters are difficult to set and scalability is poor on large datasets. To address these shortcomings, the thesis redefines the concept of an outlier in uncertain data by adopting the top-k idea and uses dynamic programming to calculate outlier probabilities. Furthermore, an efficient virtual grid-based optimization is proposed to greatly improve the algorithm's efficiency.
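As a rough illustration of how dynamic programming can yield an outlier probability, the Python sketch below assumes a common existence-probability model: each object exists independently with a given probability, and an object counts as an outlier when at most `m` of its neighbors within distance `d` exist. This model, the function names, and the parameters `d`, `m`, and `k` are assumptions made for illustration; they are not taken from the thesis, and the sketch does not include the virtual grid optimization.

```python
from math import dist

def prob_at_most(probs, m):
    """Probability that at most m of the independent events occur."""
    # dp[j] = probability that exactly j events have occurred so far (j <= m)
    dp = [1.0] + [0.0] * m
    for p in probs:
        # Update from high j to low j so dp[j - 1] still holds the old value.
        for j in range(m, 0, -1):
            dp[j] = dp[j] * (1 - p) + dp[j - 1] * p
        dp[0] *= (1 - p)
    return sum(dp)

def top_k_outliers(objects, d, m, k):
    """Rank objects by assumed outlier probability and return the top k.

    objects: list of (point, existence_probability) pairs (hypothetical format).
    Returns a list of (outlier_probability, index) pairs.
    """
    scores = []
    for i, (pt, _) in enumerate(objects):
        # Existence probabilities of the other objects within distance d.
        neigh = [p for j, (q, p) in enumerate(objects)
                 if j != i and dist(pt, q) <= d]
        scores.append((prob_at_most(neigh, m), i))
    scores.sort(key=lambda s: (-s[0], s[1]))
    return scores[:k]
```

Ranking by probability and keeping the `k` highest-scoring objects avoids choosing a probability threshold, which is one way to read the top-k idea mentioned above.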
To make frequent itemset mining from uncertain data more practical, the thesis presents a weighted frequent itemset mining algorithm based on CUFP-Mine. It assigns a weight to each item in the uncertain transaction dataset and improves the way the CUFP-Mine algorithm constructs its tree, which effectively reduces the size of the tree and improves mining efficiency. Finally, the data mining system for uncertain data is applied as a module in a VTS fault management subsystem, where it implements fault alarm and fault analysis. Results from practical operation show that the methods proposed in this thesis improve the performance and reliability of the system.
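To make the notion of weighted frequent itemsets over uncertain transactions concrete, the Python sketch below computes a weighted expected support by brute force. It assumes the usual expected-support model (each item in a transaction carries an independent existence probability) and takes the itemset weight to be the average of its item weights. Both choices, and all names used, are illustrative assumptions rather than the thesis's definitions, and the sketch does not build the tree structure that CUFP-Mine or WCUFP-Mine uses.

```python
from math import prod

def expected_support(itemset, transactions):
    """Sum, over transactions containing every item, of the product of
    the items' existence probabilities."""
    return sum(
        prod(t[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )

def weighted_expected_support(itemset, transactions, weights):
    """Expected support scaled by the average item weight (an assumption)."""
    avg_weight = sum(weights[i] for i in itemset) / len(itemset)
    return avg_weight * expected_support(itemset, transactions)

# Hypothetical data: each transaction maps an item to its existence probability.
transactions = [{"a": 0.5, "b": 0.5}, {"a": 1.0}]
weights = {"a": 1.0, "b": 0.5}
support_ab = weighted_expected_support(("a", "b"), transactions, weights)
```

An itemset would then be reported as frequent when this value reaches a user-chosen minimum threshold.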
Keywords/Search Tags: Uncertain Data, Data Mining System, Outlier Detection, Dynamic Programming Theory, Virtual Grid, Frequent Itemset Mining