Design And Implementation Of Anomaly Traffic Detection System Based On Machine Learning

Posted on: 2018-10-01
Degree: Master
Type: Thesis
Country: China
Candidate: S S He
Full Text: PDF
GTID: 2348330518996862
Subject: Information security
Abstract/Summary:
With the continuous development of Internet technology, people's lives and work depend more and more on a variety of Internet applications. However, as network attack techniques become increasingly sophisticated, diverse, and automated, many network applications have suffered attacks and security threats, exposing numerous security vulnerabilities. As the first step of attack defense, abnormal traffic detection provides an effective basis for intercepting attacks. Accurate detection of abnormal traffic is therefore necessary to ensure the availability and security of network applications.

Based on the characteristics of network traffic, an anomaly traffic detection model based on machine learning is proposed. The model mainly consists of four parts: 1) analyzing abnormal traffic and creating a malicious keyword library and a multidimensional feature library by means of data mining; 2) testing the effectiveness of the multidimensional feature library and optimizing it; 3) selecting an appropriate machine learning algorithm to learn from and validate on the training set, then evaluating the performance of the classification results; 4) deploying the system in practice on a Hadoop and Spark cloud platform, which improves the efficiency of abnormal traffic detection through parallel detection.

In the analysis of abnormal traffic characteristics, we treat anomaly detection as a pattern recognition problem and combine rules with statistics to analyze features. Moreover, we extract the commonalities of abnormal traffic and the differences between abnormal and normal traffic and summarize them as a feature set, which is verified and evaluated by machine learning algorithms.

In the research on feature optimization, this paper proposes three algorithms: a feature selection algorithm based on Sigmoid, a feature ranking algorithm based on information gain, and a feature optimization algorithm based on time feedback. Together, these three algorithms extract the best feature subset from the multidimensional feature set through filtering, ranking, and performance optimization.

In the choice of machine learning algorithm, decision tree, random forest, and GBDT classifiers are used to validate the accuracy and efficiency of the method. The experimental results demonstrate that GBDT has good detection performance.

Finally, considering practical application in a big data environment, we designed and implemented a distributed detection system based on the Hadoop and Spark distributed platforms with cloud storage. The full parallelization of data preprocessing, feature analysis, and machine learning greatly improves the detection efficiency of the system.
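
The abstract does not give the details of the feature ranking algorithm based on information gain. As a rough illustration of the general idea only, the following minimal sketch ranks discrete features by the standard information gain H(Y) - H(Y|X). The feature names (has_malicious_keyword, long_url), the toy samples, and the function names are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one discrete feature: H(Y) - H(Y | X)."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Return (feature, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain([row[name] for row in rows], labels)
             for name in rows[0]}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy samples: two binary features per request.
rows = [
    {"has_malicious_keyword": 1, "long_url": 1},
    {"has_malicious_keyword": 1, "long_url": 0},
    {"has_malicious_keyword": 0, "long_url": 1},
    {"has_malicious_keyword": 0, "long_url": 0},
]
labels = ["abnormal", "abnormal", "normal", "normal"]

ranking = rank_features(rows, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy data the hypothetical keyword feature separates the two classes perfectly and therefore ranks first, while the URL-length feature carries no information about the label. Continuous features would need to be discretized before this calculation applies.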
Keywords/Search Tags: network traffic analysis, anomaly detection, feature extraction, machine learning, distributed deployment