Outlier Detection Algorithm Based On Deviation Fluctuation Difference And Mutual Neighbor Weighting Factor

Posted on: 2024-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: S Li
Full Text: PDF
GTID: 2568307151967499
Subject: Computer technology
Abstract/Summary:
Outlier detection is an important part of data mining and has been applied in fields such as fraud detection, image processing, and medical diagnosis. Scholars in China and abroad have studied it extensively. This thesis focuses on problems in algorithms based on nearest-neighbor relationships and on local outlier factors. Its content is as follows.

First, a new outlier detection approach based on deviation fluctuation difference is proposed. It addresses the unstable detection accuracy of proximity-based outlier detection algorithms, whose results vary with the value of the neighborhood parameter. Starting from variations in direction and angle between objects, the method sets aside distance and density and instead uses vectors to define a skewness that characterizes how far a data object departs from its vicinity. A volatility factor is then introduced, which measures the degree of volatility of a data point from the deviation fluctuation difference at different positions. The combination of skewness and volatility factor is used as a volatility outlier factor to detect outliers. This effectively reduces the algorithm's sensitivity to the nearest-neighbor parameter while keeping detection accuracy high. The correctness and time complexity of the proposed algorithm are also analyzed.

Second, a new outlier detection method based on a mutual neighbor weighting factor is proposed. It addresses the problem that a single nearest-neighbor parameter k, as used in outlier detection algorithms with local outlier factors, cannot depict the information of data objects comprehensively, which lowers accuracy. The method sets a different number of mutual neighbors for each data object and assigns weights to define a weighting factor. It then uses the definition of the local outlier factor to characterize the outlier score of each data object. The rate of change of the weighting factor across points is compared with a threshold, and points whose value exceeds the threshold are regarded as outliers, which effectively improves detection accuracy. The correctness and time complexity of this algorithm are also analyzed.

Finally, the thesis selects comparable outlier detection algorithms, compares them experimentally on artificial and real data sets using three evaluation indicators, and verifies the efficacy of the proposed algorithms.
Keywords/Search Tags:outliers, skewness, volatility factor, mutual neighbor, weighting factor
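
The mutual neighbor notion mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not give the weighting factor, the per-object neighbor count, or the threshold rule, so the code below only assumes the common definition that two points are mutual neighbors when each lies in the other's k-nearest-neighbor set. The function names and the score `1 - count / k` are hypothetical, chosen only for illustration.

```python
import math


def knn_indices(points, k):
    """Return, for each point, the set of indices of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        # Sort the other points by (distance, index); ties are broken by index.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def mutual_neighbor_counts(points, k):
    """Count, for each point, how many of its k nearest neighbors also list it."""
    knn = knn_indices(points, k)
    return [sum(1 for j in knn[i] if i in knn[j]) for i in range(len(points))]


def isolation_scores(points, k):
    """Hypothetical score (an assumption, not from the thesis): 1 - count / k.

    A score of 1.0 means the point has no mutual neighbors at all.
    """
    return [1.0 - c / k for c in mutual_neighbor_counts(points, k)]


# Five clustered points plus one distant point at index 5.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
scores = isolation_scores(points, k=3)
```

In this toy data set the distant point selects three cluster points as its nearest neighbors, but none of them select it back, so it receives the maximum score. A fixed k is used here for simplicity, whereas the abstract describes setting a different number of mutual neighbors for each data object.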