Research On Unsupervised Anomaly Detection Algorithm And Application

Posted on: 2019-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: X Liu
Full Text: PDF
GTID: 2348330563453951
Subject: Computer software and theory
Abstract/Summary:
Hawkins defines an anomaly as data that differs from the other observations and is produced by a different mechanism. Anomaly detection is therefore the discovery of patterns in data that deviate from normal behavior. In network environments, complex machine systems, weather information systems and similar settings, the state of the system can usually be classed as either normal or abnormal, and the abnormal patterns often carry important information. Detecting them early can sometimes avert catastrophic risks and improve work efficiency. Anomaly detection is now widely used in many fields, such as network intrusion detection, credit card fraud detection, fault detection and repair in complex systems, and meteorological anomaly detection. Identifying, understanding and predicting anomalies in data has become one of the key pillars of modern data mining.

In the context of big data, more attention is being paid to how valuable information can be obtained quickly from the data as a whole. An unsupervised anomaly detection algorithm therefore needs to emphasize both detection efficiency and adaptability to varied data conditions. This thesis studies the advantages and disadvantages of previous unsupervised anomaly detection algorithms and proposes an unsupervised, decision-tree-based algorithm for anomaly detection. The main research content consists of the following three parts:

1. Comparison of anomaly detection algorithms at home and abroad. This part is the logical starting point of the study. Background research and analysis of existing anomaly detection algorithms identifies their advantages and disadvantages, which serve as the basis for deriving a new algorithm.

2. Proposal and experimental evaluation of an unsupervised anomaly detection algorithm. Building on the survey in the first part, we propose a new unsupervised decision tree algorithm that combines statistical knowledge with decision tree construction methods. At each branch node of the tree, the method uses the distribution of the data along a feature to find the optimal split point that divides the data set into two parts. The experimental results show that this method achieves better detection ability than the existing general methods and adapts to more kinds of data.

3. Proposal and experimental evaluation of a decision tree acceleration algorithm. To keep the decision tree algorithm efficient on large samples, a gradient-based algorithm is proposed for finding the optimal split point. The method uses gradient information about the separability measure to guide the computation, skipping unimportant candidate split points and thereby avoiding a large amount of computation. Experiments show that this reduces the amount of computation without reducing the accuracy of the algorithm.
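The split-point search described in part 2 can be illustrated with a minimal Python sketch. The abstract does not state the separability measure the thesis uses, so the score below (squared gap between the means of the two parts, divided by the sum of their variances) is an assumption chosen only for illustration, and the names `separability` and `best_split` are hypothetical. The sketch scores every candidate split exhaustively; the gradient-guided skipping described in part 3 is not reproduced here.

```python
from statistics import mean, pvariance


def separability(left, right):
    """Hypothetical separability score (an assumption, not the thesis's measure):
    squared gap between the two means over the sum of the two variances."""
    gap = (mean(left) - mean(right)) ** 2
    spread = pvariance(left) + pvariance(right)
    return gap / (spread + 1e-12)  # small constant avoids division by zero


def best_split(values):
    """Try every midpoint between distinct neighbouring values of one feature
    and return (threshold, score) for the split with the highest score."""
    xs = sorted(values)
    best_threshold, best_score = None, float("-inf")
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue  # identical neighbours give no new candidate threshold
        score = separability(xs[:i], xs[i:])
        if score > best_score:
            best_threshold = (xs[i - 1] + xs[i]) / 2
            best_score = score
    return best_threshold, best_score


if __name__ == "__main__":
    feature = [1.0, 1.1, 0.9, 1.2, 1.05, 9.0]  # one value far from the rest
    threshold, score = best_split(feature)
    print(threshold, score)
```

On this toy feature the chosen threshold falls between 1.2 and 9.0, isolating the single distant value in its own part. Under this assumed score, a tree node would route points on either side of the threshold to different children.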
Keywords/Search Tags:anomaly detection, statistical learning, decision tree algorithm, decision tree acceleration algorithm