
Research on Outlier Detection Based on Outlier Degree

Posted on: 2012-12-06
Degree: Master
Type: Thesis
Country: China
Candidate: S J Liu
Full Text: PDF
GTID: 2178330335989533
Subject: Computer application technology
Abstract/Summary:
In recent years, outlier detection has become a hot topic in data mining. Its purpose is to extend human perception and to uncover knowledge and important patterns that are not obvious. Such knowledge and patterns may be especially valuable, so research on outlier detection is significant.

First, according to how object attribute values are described, we divide an object's attributes into numeric and non-numeric attributes and propose a method for processing the non-numeric attributes. To reduce computation, we propose a new method for calculating the approximate distance between objects. From an analysis of this approximate distance, we derive the approximate connectivity, which serves as the basis of a pruning strategy that narrows the candidate set.

The choice of detection algorithm depends on the clustering results. When the clustering results are ideal, we use the algorithm based on the approximate outlier factor. This algorithm uses the rough outlier set produced by the clustering algorithm to divide the data set into a suspicious outlier set and a clustering set. To simplify the computation, we introduce a cluster attribute value object that stands in for each cluster when calculating the distance between an object and the clusters.

When the clustering results are poor, we use the algorithm based on the reference outlier degree. This algorithm is based on Chebyshev's inequality and divides the data set to obtain a suspicious outlier set. From a given reference point, we obtain the reference distance, which serves as the standard for judging whether an object is an outlier.

Finally, the algorithm based on the approximate outlier degree and the algorithm based on the reference outlier degree are evaluated in simulation experiments. The experimental results show that both algorithms are effective and achieve high accuracy.
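The abstract does not give the exact formulation of the reference outlier degree, so the following Python sketch is only one plausible reading of the idea, not the thesis's algorithm. It assumes Euclidean distance to a caller-supplied reference point and flags objects whose reference distance deviates from the mean by more than k standard deviations. By Chebyshev's inequality, at most 1/k² of the objects can be flagged this way, whatever the distribution of distances. The function names, the threshold parameter `k`, and the sample data are all hypothetical.

```python
import math
from statistics import mean, pstdev


def reference_distances(points, reference):
    """Euclidean distance from each point to the given reference point."""
    return [math.dist(p, reference) for p in points]


def suspicious_outliers(points, reference, k=2.0):
    """Indices of points whose reference distance lies more than k
    population standard deviations from the mean reference distance.

    Assumption (not from the thesis): this threshold rule is how the
    Chebyshev bound is applied; at most 1/k**2 of the points are flagged.
    """
    dists = reference_distances(points, reference)
    mu = mean(dists)
    sigma = pstdev(dists)
    if sigma == 0:
        # All points are equally far from the reference: nothing stands out.
        return []
    return [i for i, d in enumerate(dists) if abs(d - mu) > k * sigma]


# Hypothetical data: a 3x3 grid of inliers plus one distant object.
grid = [(float(x), float(y)) for x in range(3) for y in range(3)]
points = grid + [(20.0, 20.0)]
flagged = suspicious_outliers(points, reference=(1.0, 1.0), k=2.0)
print(flagged)
```

A larger `k` yields a smaller suspicious set and a smaller `k` a larger one. In this sketch the flagged objects would still need a second pass, such as an outlier degree score, before being reported as outliers.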
Keywords/Search Tags:data mining, outlier detection, pruning strategy, outlier degree