
Research On Anomaly Detection Based On Web Session Sequence

Posted on: 2020-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: J W Su
Full Text: PDF
GTID: 2428330620956747
Subject: Software engineering
Abstract/Summary:
With the development of Web technology, a large amount of data is generated on the Internet all the time. Extracting valuable information and rules from this data is an important task, but abnormal data degrades the quality of data mining. Abnormal data often also indicates a possible security threat, so it is worth investigating in its own right. To improve both the security of application systems and the quality of data mining, extracting abnormal information from large volumes of data is an urgent problem. In an offline environment, existing anomaly detection techniques struggle to work effectively on unlabeled real-world data. In an online environment, Web data is constantly updated, and existing techniques may fail to detect and locate anomalous data in time. Building a complete and efficient anomaly detection model is therefore of great significance for improving data mining quality and system security.

To address these problems, this thesis uses a variety of machine learning methods to build a complete anomaly detection scheme. The main contributions are as follows:

(1) A novel anomaly detection algorithm based on session feature similarity, SFAD, is proposed. The method constructs a similarity matrix for the Web sequence data and separates suspicious users using the fuzzy clustering λ-cut method. Abnormal users are then detected and located among the suspicious users by applying multiple sliding windows.

(2) An anomaly detection algorithm based on Min-Hash and LSTM (MH-LSTM) is proposed. The method uses Min-Hash to extract features from Web sequence data and feeds the extracted features into a purpose-designed LSTM network for training. The online data to be examined is segmented with a sliding window, and the trained LSTM network is applied to the data in each window to detect and locate abnormal data.

(3) A complete and practical anomaly detection model is proposed. In an online environment, changes in data distribution easily reduce the accuracy of the online model. This thesis combines the offline model with the online model and updates the model incrementally. The offline model detects changes in the data distribution and guides updates of the online model so that such changes can be handled.

Experiments show that the detection rate of the proposed offline anomaly detection model is about 98%, and the detection rate of the online anomaly detection model is about 97%. The complete anomaly detection model can effectively handle changes in data distribution while keeping the anomaly detection rate at a certain level.
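The Min-Hash feature extraction and sliding-window segmentation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not specify how sessions are tokenized, how many hash functions are used, or how windows are sized, so the 2-gram shingling, the 64 seeded BLAKE2b hashes, and all function names below are assumptions chosen for illustration. The LSTM stage is omitted entirely.

```python
import hashlib
from typing import Iterator, List, Sequence, Set, Tuple

Shingle = Tuple[str, ...]


def shingles(session: Sequence[str], k: int = 2) -> Set[Shingle]:
    """Return the set of k-grams of consecutive requests in a session.

    Assumption: a session is a list of page identifiers, and k-grams are
    the unit of comparison. Sessions shorter than k yield one short shingle.
    """
    if not session:
        return set()
    if len(session) < k:
        return {tuple(session)}
    return {tuple(session[i:i + k]) for i in range(len(session) - k + 1)}


def _seeded_hash(seed: int, item: Shingle) -> int:
    """Deterministic 64-bit hash of a shingle; the seed selects the hash function."""
    data = f"{seed}|" + "\x1f".join(item)
    digest = hashlib.blake2b(data.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items: Set[Shingle], num_hashes: int = 64) -> List[int]:
    """Min-Hash signature: for each seeded hash, keep the minimum over the set."""
    if not items:
        raise ValueError("cannot compute a signature for an empty set")
    return [min(_seeded_hash(seed, it) for it in items) for seed in range(num_hashes)]


def estimated_jaccard(sig_a: Sequence[int], sig_b: Sequence[int]) -> float:
    """Fraction of matching signature slots, an estimate of Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def sliding_windows(seq: Sequence[str], size: int, step: int = 1) -> Iterator[Tuple[int, Sequence[str]]]:
    """Yield (start_index, window) pairs; a sequence shorter than size yields itself once."""
    last_start = max(len(seq) - size, 0)
    for start in range(0, last_start + 1, step):
        yield start, seq[start:start + size]


if __name__ == "__main__":
    # Hypothetical sessions: lists of requested paths.
    normal = ["/home", "/list", "/item", "/cart", "/pay"]
    probe = ["/home", "/list", "/item", "/admin", "/admin", "/admin"]
    reference = minhash_signature(shingles(normal))
    for start, window in sliding_windows(probe, size=3):
        sig = minhash_signature(shingles(window))
        print(start, list(window), estimated_jaccard(reference, sig))
```

In this sketch each window of an incoming session is reduced to a fixed-length signature, which is the kind of fixed-size feature vector a sequence model could consume; the window start index is what allows an anomaly to be located within the session rather than only detected.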
Keywords/Search Tags: anomaly detection, abnormal location, Web session sequence, data similarity