Outlier Detection Algorithm Based On Kernel Function And Density Peak

Posted on:2024-03-12

Degree:Master

Type:Thesis

Country:China

Candidate:M X Wei

Full Text:PDF

GTID:2568307151967679

Subject:Computer technology

Abstract/Summary:

Outlier detection is a long-studied task, and modern production and daily life constantly generate large amounts of data. Not all of this data is valuable, so useful information must be extracted from it. This has led to the increasingly widespread application of outlier detection, and many different types of outlier detection techniques have emerged. An outlier is generally defined as follows: if the attribute values of a data object differ significantly from those of the other data objects in a given data set, the object is called an "outlier". This thesis focuses on improving the detection accuracy of density-based and cluster-based outlier detection algorithms. The main research contents are as follows.

First, the thesis proposes an outlier detection algorithm based on kernel local density estimation. It addresses two issues: the low detection accuracy for local outliers in data sets with uneven density distribution and irregular shape, and the sensitivity of many distance-based and density-based outlier detection algorithms to the setting of the parameter k. The algorithm uses the natural neighbor search algorithm to adjust k automatically, estimates the local density of each data object by incorporating the object's neighborhood information into Gaussian kernel density estimation, and then defines the k-object average distance to characterize the distribution around a data object. By combining the local density and the k-object average distance, it defines a local deviation factor that detects local outliers more accurately.

Second, for data sets with large density differences and large distances between subclusters and clusters, the thesis proposes an outlier detection algorithm based on K-Medoids clustering and density peaks. The algorithm uses the silhouette coefficient method to determine the optimal number of clusters for each data set, the max-min distance algorithm to choose the initial cluster centers, and the K-Medoids clustering algorithm to divide the data set into multiple clusters. It then redefines the cut-off distance following the idea of density peak clustering, combines the two indicators, and proposes the concept of cluster object deviation degree, so that outliers within each cluster are found more accurately.

Finally, the thesis compares the proposed algorithms experimentally with widely used and effective outlier detection algorithms of recent years on artificial and real data sets to validate their efficacy.
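The abstract does not give the thesis's formulas, so the following is only a minimal sketch of the general idea of scoring outliers with a Gaussian kernel and density-peak quantities. It uses the standard density peak clustering indicators (local density rho and distance delta to the nearest denser point), not the thesis's redefined cut-off distance, local deviation factor, or cluster object deviation degree. The function names, the cut-off d_c, the ratio delta / rho used as a score, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def density_peak_quantities(X, d_c):
    """Standard density-peak indicators with a Gaussian kernel (illustrative)."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Gaussian-kernel local density; a point does not count itself.
    kernel = np.exp(-(dist / d_c) ** 2)
    np.fill_diagonal(kernel, 0.0)
    rho = kernel.sum(axis=1)
    # delta: distance to the nearest point of strictly higher density.
    # The densest point gets its largest distance to any other point.
    delta = np.empty(len(X))
    for i in range(len(X)):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta

def outlier_scores(X, d_c, eps=1e-12):
    """Hypothetical score: low density and large delta give a high score."""
    rho, delta = density_peak_quantities(X, d_c)
    return delta / (rho + eps)

# Toy data (assumed): a 5 x 5 grid of points plus one distant point at index 25.
grid = np.array([[0.1 * i, 0.1 * j] for i in range(5) for j in range(5)])
X = np.vstack([grid, [[5.0, 5.0]]])
scores = outlier_scores(X, d_c=0.2)
print(int(np.argmax(scores)))  # index of the highest-scoring point
```

A point far from every cluster has low density and a large distance to any denser point, so it ranks highest under this score. A cluster center also has a large delta, but its high density keeps its score low.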

Keywords/Search Tags:

data mining, outlier detection, kernel density estimation, K-Medoids clustering, density peak clustering

Related items

1	Outlier Detection Algorithm Based On Entropy Weight Distance And Density Peak Clustering
2	The Outlier Detection Algorithm Based On K-kernel Space And K-medoids Clustering
3	The Outlier Detection Algorithm Based On Adaptive Clustering And Gaussian Kernel Density
4	Research And Application Of Density Peak Clustering Algorithm Based On Spark Framework
5	Research On The Grid Density Peak Clustering Algorithm
6	Research On Density Peak Clustering And Its Application In Community Detection
7	Research On Hierarchical Clustering Algorithm Based On Density Peaks
8	Study On Density Kernel Clustering And Outlier Detection Algorithm Based On Skewness
9	Study On Clustering For Large Data Sets And Its Applications
10	Research Of Clustering Algorithm Based On Density Peak