
The Outlier Mining Algorithm Based on Gaussian Kernel Function and Local Density

Posted on: 2019-02-08 | Degree: Master | Type: Thesis
Country: China | Candidate: C Y Miao | Full Text: PDF
GTID: 2428330566989138 | Subject: Computer technology
Abstract/Summary:
Outlier mining is an important branch of data mining and has received extensive research attention in recent years. In everyday life, a small number of anomalies are easily overlooked, yet they are often more valuable than ordinary data. Outlier mining helps people accurately and quickly extract significantly anomalous information from complex data. Researchers at home and abroad have proposed many methods for mining outliers. This dissertation focuses on the poor performance of local outlier mining algorithms and studies them in depth. The work covers the following three aspects. Firstly, the research background and significance of outlier mining are presented, the state of research at home and abroad is reviewed, and the process and performance of classical local outlier mining algorithms are analyzed in detail. Secondly, to address the poor detection performance of density-based outlier mining, three types of neighbors (k-nearest neighbors, reverse nearest neighbors, and shared nearest neighbors) are used to estimate the neighborhood density of each data object through kernel density estimation with a Gaussian kernel, and a local outlier mining algorithm based on the Gaussian kernel function is proposed. The accuracy and efficiency of this algorithm are analyzed. Thirdly, to address the high time complexity and low accuracy of the INFLO algorithm, a core impact point set is introduced so that reverse nearest neighbors are not computed for data points that do not need them. The core neighbors are then used to estimate the density of the remaining data points, and an outlier mining algorithm based on local density is proposed. The accuracy and efficiency of this algorithm are also analyzed. Finally, the two proposed algorithms are implemented and used to mine outliers on UCI real-world data sets and synthetic data sets, and are compared experimentally with the LOF algorithm and the INFLO algorithm respectively. The experiments verify the effectiveness of both algorithms.
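The abstract does not give the thesis's formulas, so the following Python sketch is only an illustration of the general idea behind the first algorithm: build a neighborhood from k-nearest, reverse nearest, and shared nearest neighbors, estimate density over it with a Gaussian kernel, and score each point against its neighbors. The function names (`neighbor_sets`, `gaussian_kde_scores`), the union of the three neighbor sets, the fixed bandwidth `h`, and the ratio-style score are all assumptions made for this example, not the author's definitions.

```python
import numpy as np

def neighbor_sets(X, k):
    """Return pairwise distances plus kNN, reverse-NN and shared-NN sets per point."""
    n = len(X)
    # Pairwise Euclidean distances; a point is never its own neighbor.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    knn = [set(np.argsort(d[i])[:k].tolist()) for i in range(n)]
    # q is a reverse nearest neighbor of i when i appears in q's kNN set.
    rnn = [{q for q in range(n) if i in knn[q]} for i in range(n)]
    # q is a shared nearest neighbor of i when their kNN sets overlap.
    snn = [{q for q in range(n) if q != i and knn[i] & knn[q]} for i in range(n)]
    return d, knn, rnn, snn

def gaussian_kde_scores(X, k=3, h=1.0):
    """Illustrative outlier score: mean neighbor density divided by own density.

    Assumption: the neighborhood is the union of the three neighbor sets and
    the bandwidth h is fixed; the thesis may combine them differently.
    """
    X = np.asarray(X, dtype=float)
    n, dim = X.shape
    d, knn, rnn, snn = neighbor_sets(X, k)
    norm = (2.0 * np.pi) ** (dim / 2.0) * h ** dim  # Gaussian kernel normalizer
    hoods, density = [], np.empty(n)
    for i in range(n):
        hood = sorted(knn[i] | rnn[i] | snn[i])
        hoods.append(hood)
        # Gaussian kernel density estimate of point i over its neighborhood.
        density[i] = np.mean(np.exp(-d[i, hood] ** 2 / (2.0 * h ** 2)) / norm)
    tiny = np.finfo(float).tiny  # guards against division by zero
    return np.array([density[hoods[i]].mean() / (density[i] + tiny)
                     for i in range(n)])

# Small synthetic example: five clustered points and one distant point.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [8, 8]], dtype=float)
scores = gaussian_kde_scores(X, k=2, h=1.0)
print(np.argmax(scores))  # index of the point with the highest score
```

A score well above 1 means a point is much sparser than its neighbors, which is the usual reading of density-ratio scores such as LOF and INFLO. The brute-force distance matrix keeps the sketch short but costs O(n^2) time and memory.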
Keywords/Search Tags: data mining, local outliers, kernel density, Gaussian kernel function, shared nearest neighbors, core impact point set