
Analysis And Research Of Outlier Detecting Algorithm Based On Ensemble Methods

Posted on: 2021-06-06
Degree: Master
Type: Thesis
Country: China
Candidate: X D Kou
GTID: 2518306470462794
Subject: Control Science and Engineering
Abstract/Summary:
Outlier detection is the process of finding data points whose behavior differs from what is expected, using a variety of detection methods. Outlier detection technology has been successfully applied in financial risk control, medical diagnosis, forest disaster monitoring, early warning of network attacks, and other fields. Through the continued efforts of researchers, many outlier detection algorithms have been developed, of which the most classical are based on distance, density, and clustering. These methods have been used effectively for several decades, but most of them rely on a single characteristic of outliers, such as density or distance, and their detection performance is limited. How to improve the detection of outliers has therefore become a problem that many scholars are working to solve. To address it, the specific work of this thesis is as follows:

(1) An outlier detection method named Cumulative Agreement Rates Ensemble (CARE) is proposed. To improve the accuracy of the overall ensemble detection model, the algorithm combines several different weak classifiers into one model and balances the bias-variance trade-off. Outlier detection is treated as a binary classification task with unobserved labels, so its error can be decomposed into bias and variance. Existing outlier ensemble methods use only a parallel framework: they combine the results of independent base detectors to reduce variance, but the bias remains high. CARE combines a parallel ensemble with a sequential ensemble, so it can reduce variance as well as bias, and it detects outliers better than algorithms that use a single ensemble framework. Compared with the current best outlier ensemble methods, CARE has a higher accuracy rate while remaining close to them in other respects.

(2) A new ensemble method for unsupervised outlier detection, DCSO, is proposed. In traditional static ensemble detection methods the base detectors are all fixed, and the detection accuracy of the model is improved only by changing the weight of each base detector. For outliers that lie far from the center of the data, however, detection is poor. Therefore, by evaluating the capability of each base detector in a defined local neighborhood, DCSO dynamically identifies the best-performing base detectors for each test instance, combines the detectors it has identified, and outputs the detection results of the entire model. Traditional static ensemble detection methods ignore the ground truth of the neighborhood of outliers, although the ground truth of the local neighborhood of the data has a great impact on the ensemble detection of outliers. DCSO therefore ranks the capability of each base detector by its similarity to the ground truth of the local neighborhood, to reduce the impact of the ground truth on the model.
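The dynamic selection idea in (2) can be illustrated with a small sketch. It is not the thesis's implementation, and several details are assumptions made only for illustration: the base detectors are simple k-nearest-neighbor distance scorers, the stand-in for the local "ground truth" is the average of the standardized training scores (no labels exist in the unsupervised setting), and similarity is measured with the Pearson correlation. The function names `knn_distance_scores`, `pearson`, and `dynamic_select` are hypothetical.

```python
import numpy as np


def knn_distance_scores(train, query, k):
    """Base detector (assumed): distance from each query point to the
    training point at sorted position k. For a training point scored
    against its own set, position 0 is the zero distance to itself."""
    d = np.linalg.norm(query[:, None, :] - train[None, :, :], axis=2)
    d.sort(axis=1)
    return d[:, k]


def pearson(a, b):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a @ a) * (b @ b))
    return float(a @ b / denom) if denom > 0 else 0.0


def dynamic_select(train, test, ks=(1, 3, 5, 10), n_neighbors=10):
    """For each test point, pick the base detector whose training scores
    agree best with the pseudo ground truth in the local neighborhood."""
    # Score training and test data with every base detector.
    train_scores = np.column_stack(
        [knn_distance_scores(train, train, k) for k in ks])
    test_scores = np.column_stack(
        [knn_distance_scores(train, test, k) for k in ks])

    # Standardize with training statistics so detectors are comparable.
    mu = train_scores.mean(axis=0)
    sd = train_scores.std(axis=0) + 1e-12
    train_z = (train_scores - mu) / sd
    test_z = (test_scores - mu) / sd

    # Assumed pseudo ground truth: mean standardized score per training point.
    pseudo_truth = train_z.mean(axis=1)

    # Local neighborhood: the n_neighbors training points closest to each test point.
    d = np.linalg.norm(test[:, None, :] - train[None, :, :], axis=2)
    neighbors = np.argsort(d, axis=1)[:, :n_neighbors]

    chosen = np.empty(len(test), dtype=int)
    scores = np.empty(len(test))
    for i, idx in enumerate(neighbors):
        sims = [pearson(train_z[idx, j], pseudo_truth[idx])
                for j in range(len(ks))]
        chosen[i] = int(np.argmax(sims))   # most locally competent detector
        scores[i] = test_z[i, chosen[i]]   # its score becomes the output
    return scores, chosen
```

This sketch keeps only the single most similar detector per test instance; averaging the top few detectors instead would be an equally plausible reading of "combines the detectors it has identified". Calling `dynamic_select(train, test)` with two NumPy arrays of shape `(n_samples, n_features)` returns one outlier score per test point and the index of the detector chosen for it.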
Keywords/Search Tags:outlier, data mining, outlier detection, ensemble method