
The Outlier Detection Algorithm Based On K-kernel Space And K-medoids Clustering

Posted on: 2020-09-03
Degree: Master
Type: Thesis
Country: China
Candidate: C Liu
GTID: 2428330599960347
Subject: Software engineering

Abstract/Summary:
Outlier detection identifies the small amount of data that carries valuable information within a large data set. Because it has a wide range of real-world applications, outlier detection has become a hot topic in data mining. Its main task is to detect abnormal data and extract the valuable information those data contain. Density-based and cluster-based outlier detection are currently the most actively studied approaches. This thesis analyzes the related outlier detection methods in depth and, to improve detection efficiency, proposes strategies that address the poor mining performance of density-based and cluster-based methods. The main content is divided into the following parts.

Firstly, a fast local outlier detection algorithm based on k kernel space is proposed. It addresses two problems of density-based outlier detection: detection efficiency is low when the density distribution is uneven, and the running time increases significantly once inverse k-neighborhoods are introduced. The algorithm uses the k kernel space to divide the data set into near-k-neighbor points and far-k-neighbor points. This reduces the number of points whose inverse k-neighborhoods must be computed, and thus reduces the running time. Reachable distance and reachable density are introduced to reduce statistical fluctuations in distance.

Secondly, existing algorithms have low detection efficiency on data sets that contain multiple clusters with different densities lying far apart from one another. To address this, an outlier detection algorithm based on K-medoids clustering is proposed. It relies on two properties of outliers: an outlier lies farther from the denser points, and its density is lower than the density of its neighbors. The initial center point of each cluster is selected by the MaxMin algorithm so that all outliers in each cluster can be found more accurately.

Finally, the two proposed algorithms are tested on real and synthetic data sets. The experimental results are compared with those of existing algorithms, and the effectiveness of both proposed algorithms is verified.
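
The abstract names the MaxMin algorithm for choosing the initial K-medoids centers but gives no details. The following is a minimal sketch only, assuming MaxMin refers to the common farthest-first rule: each new center is the point whose distance to its nearest already-chosen center is largest. The function name `maxmin_centers`, the `first` parameter, and the use of Euclidean distance are illustrative assumptions, not taken from the thesis.

```python
import math


def maxmin_centers(points, k, first=0):
    """Return indices of k initial centers chosen by max-min selection.

    Assumptions (not from the thesis): Euclidean distance, and the point
    at index `first` is used as the first center.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    centers = [first]
    # min_dist[i] = distance from point i to its nearest chosen center
    min_dist = [math.dist(p, points[first]) for p in points]

    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        if min_dist[nxt] == 0:
            # Every remaining point coincides with a chosen center
            break
        centers.append(nxt)
        # Refresh nearest-center distances against the new center
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return centers


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 8)]
    print(maxmin_centers(pts, 3))  # indices of the chosen initial centers
```

Spreading the initial centers this way tends to place one seed in each well-separated cluster, which matches the motivation stated above for data sets whose clusters lie far apart. The returned indices would then seed an ordinary K-medoids iteration.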
Keywords/Search Tags: data mining, outlier detection, density, k kernel space, clustering, MaxMin algorithm