Research And Implementation Of Semi-supervised Anomaly Detection System

Posted on: 2022-06-22
Degree: Master
Type: Thesis
Country: China
Candidate: C Xiong
GTID: 2518306506496294
Subject: Computer technology
Abstract/Summary:
In recent years, with the arrival of the Internet era, data volumes have grown explosively and the big data era has begun. Because traditional frameworks cannot handle computation over massive data, new computing frameworks have emerged. They provide strong support for parallel data processing, lay a foundation for all areas of data science, and accelerate development in many fields. However, as the amount of data grows, outlier data and malicious attacks are also becoming more numerous. In many cases outlier data can cause great harm to a system, so anomaly detection is becoming increasingly important and approaches to it are attracting wide attention. Outlier data can be regarded as unexpected data that needs to be identified. Because the objects examined in anomaly detection are high-dimensional sample features, machine learning algorithms are a natural first choice.

In practical application scenarios, the data to be examined can be divided into labeled and unlabeled data. Obtaining a large amount of labeled data is very difficult and consumes considerable resources, whereas obtaining a small amount of labeled data is much easier. Traditional machine learning algorithms, however, make poor use of the large volume of unlabeled data and rely only on the small number of labeled samples, so they do not work well in practice. According to whether the training data is labeled, machine learning algorithms can be divided into unsupervised, supervised, and semi-supervised learning algorithms. Supervised algorithms require labeled training data, which is difficult to obtain. Unsupervised algorithms need no labeled data, but their overall performance is not as good as that of supervised ones. Semi-supervised algorithms sit between the two and combine the advantages of both kinds of learning.

Hence, this paper adopts Deep SAD, a semi-supervised model proposed in recent years, and combines it with the classification function of the linear discriminant analysis (LDA) model, which is used to preprocess some of the unlabeled data for anomaly detection. The Deep SAD model is built on Deep SVDD and can make use of both labeled outlier data and normal data.

The main work of this paper is as follows:

1. Combining the linear discriminant analysis model with Deep SAD. As a first step, the classification function of the linear discriminant analysis model is used to generate approximate labels for unlabeled data. The objective function of the Deep SAD model is then applied to improve the anomaly detection results. In addition, the important parameters of the Deep SAD model are tuned to find the best settings, and the Deep SAD model combined with linear discriminant analysis is compared with other known models to evaluate its overall effectiveness.

2. Developing an anomaly detection system based on the Deep SAD model combined with linear discriminant analysis. The system is built on the Spring Boot framework. It provides an interface for manual data labeling: when the system generates a large amount of unlabeled data, analysts can use the interface to label some of it by hand, feed the unlabeled and manually labeled data together into the anomaly detection module, and archive the detection results for later anomaly tracking.
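The two-step idea described in the abstract (approximate labels from LDA, followed by the Deep SAD objective) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names (`fit_lda_2d`, `pseudo_label`, `deep_sad_loss`) are hypothetical, features are restricted to two dimensions, LDA assumes equal class priors, and the neural network of Deep SAD is replaced by an identity embedding so that only the loss term is shown. The loss follows the form published for Deep SAD: squared distance to a center for unlabeled samples, and that distance raised to the power of the label (+1 normal, -1 anomalous) and weighted by eta for labeled samples. The weight-decay regularizer is omitted.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]

def mean_vector(rows: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def fit_lda_2d(normal: Sequence[Vector],
               anomalous: Sequence[Vector]) -> Tuple[Tuple[float, float], float]:
    """Two-class Fisher LDA on 2-D features (assumption: equal priors).

    Returns the projection vector w = S^-1 (mu_a - mu_n) and a threshold
    placed at the projected midpoint of the two class means.
    """
    mu_n, mu_a = mean_vector(normal), mean_vector(anomalous)
    s = [[0.0, 0.0], [0.0, 0.0]]  # pooled within-class scatter matrix
    for rows, mu in ((normal, mu_n), (anomalous, mu_a)):
        for r in rows:
            d = (r[0] - mu[0], r[1] - mu[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if abs(det) < 1e-12:
        raise ValueError("within-class scatter matrix is singular")
    diff = (mu_a[0] - mu_n[0], mu_a[1] - mu_n[1])
    # Closed-form 2x2 inverse applied to diff.
    w = ((s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det)
    mid = ((mu_n[0] + mu_a[0]) / 2, (mu_n[1] + mu_a[1]) / 2)
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold

def pseudo_label(x: Vector, w: Vector, threshold: float) -> int:
    """Approximate label: -1 (anomalous) or +1 (normal)."""
    return -1 if w[0] * x[0] + w[1] * x[1] > threshold else 1

def deep_sad_loss(embeddings: Sequence[Vector], labels: Sequence[int],
                  center: Vector, eta: float = 1.0, eps: float = 1e-6) -> float:
    """Deep SAD loss over given embeddings (regularizer omitted).

    labels: 0 = unlabeled, +1 = labeled normal, -1 = labeled anomalous.
    """
    total = 0.0
    for z, y in zip(embeddings, labels):
        dist = sum((a - b) ** 2 for a, b in zip(z, center))
        if y == 0:
            total += dist                    # pull unlabeled points to the center
        else:
            total += eta * (dist + eps) ** y # y = -1 inverts the distance
    return total / len(embeddings)

# Toy data (made up for illustration only).
normal = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
anomalous = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)]
w, threshold = fit_lda_2d(normal, anomalous)

unlabeled = [(0.2, 0.3), (5.5, 5.2), (0.8, 0.4)]
# Pseudo-label only some of the unlabeled points; the last one stays unlabeled.
pseudo = [pseudo_label(x, w, threshold) for x in unlabeled[:2]]
labels = pseudo + [0]

center = mean_vector(normal)  # assumption: center taken as the mean of normal samples
loss = deep_sad_loss(unlabeled, labels, center, eta=1.0)
```

In this sketch a point labeled anomalous contributes a small loss when it lies far from the center and a large loss when it lies close to it, which is the behavior that lets labeled outliers shape the decision region. How the thesis selects which unlabeled samples to pseudo-label, and how it weights them, is not stated in the abstract.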
Keywords/Search Tags: classification, semi-supervised, anomaly detection, deep learning, support vector data description