Outlier Detection Based On Distance And Information Entropy Uncertainty

Posted on: 2012-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: J W Yang
Full Text: PDF
GTID: 2218330338455718
Subject: Computer system architecture
Abstract/Summary:
In recent years, technologies for collecting and processing data have produced enormous amounts of inconsistent or missing data, which are often modeled as uncertain data. The emergence of uncertain data poses new challenges for traditional data mining techniques. Outlier detection, an important data mining task, has attracted increasing attention. Traditional outlier detection algorithms, however, usually assume that the data are certain or ignore their natural structure, so their results can deviate considerably from reality. Detecting outliers over uncertain data is therefore important, and it is the topic of this thesis.

First, we introduce the concepts of and reasons for outlier detection, along with several outlier detection methods. We then introduce the management of uncertain data and some commonly used mathematical theories for handling it, and briefly describe continuous numeric uncertain data.

Second, we extend traditional outlier detection to uncertain data and define the related concepts. We also design a distance-based outlier detection algorithm over uncertain data.

Third, we propose an entropy-based pruning algorithm to address the high time complexity of this approach. An example demonstrates that the pruning algorithm is reasonable, and we then analyze its time complexity.

Fourth, through experiments on simulated data, we examine the effects of the parameters on the pruning algorithm. We then compare it with the original algorithm on real data to verify its efficiency and effectiveness.
Keywords/Search Tags: uncertain data, outlier detection, pruning