Research on Internet Traffic Classification and Anomaly Detection Methods

Posted on: 2021-01-12
Degree: Master
Type: Thesis
Country: China
Candidate: R Wang
Full Text: PDF
GTID: 2518306308475234
Subject: Information security
Abstract/Summary:
With the continuous expansion of network scale and the ongoing development of network applications, network activity keeps increasing. Network traffic classification serves important functions such as quality-of-service control, network bandwidth resource management, and intrusion detection. At present, the main approaches to network traffic classification are based on traditional machine learning and deep learning. However, existing methods have shortcomings: effective features are difficult to extract manually, and the accuracy of traffic recognition needs to be improved. This thesis addresses several problems in network traffic classification. The main work is as follows:

1. To further improve classification ability, this thesis proposes a method that combines domain knowledge of network traffic with deep learning. A three-layer neural network model is designed, consisting of a linear layer, an auto-cross layer, and a convolutional layer. The linear layer and the auto-cross layer handle the domain features, and the convolutional layer handles the higher-order features. Multi-class AUC is used as the evaluation metric. Experiments are performed on two data sets, and the validity of the new model is verified by comparison.

2. Network traffic data is often imbalanced. If the imbalance is not resolved, classification will be biased and predictions will be skewed toward the majority classes. This thesis uses generative adversarial networks (GANs) to address the imbalance at the data level. A GAN generates minority-class samples, and the K-nearest neighbor algorithm is then used to calculate the distance between the generated data and the original minority-class samples. The smaller the distance, the more similar the generated data is to the original samples. The n closest generated samples are selected and mixed with the original samples, and a classification model is then built. Experiments show that this method improves F1 score and AUC compared with SMOTE, RUS, and other methods.

3. Network traffic classification is a supervised learning task: it can only learn from traffic data with known labels, and it cannot accurately classify unknown traffic because unknown traffic belongs to an entirely new class. This thesis proposes using word2vec pre-training to learn word embeddings of the payload information in the raw traffic, which represent each traffic record and avoid complicated manual feature engineering. Experiments show that when unknown traffic is introduced, the information entropy of the predicted probability matrix is larger than when no unknown traffic is present. By setting a threshold on the information entropy, unknown traffic can be effectively identified and classification accuracy is improved.
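
The entropy-threshold idea in item 3 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the example probabilities, and the threshold value are all assumptions made for illustration, and the abstract does not say how the threshold was chosen. The sketch computes the Shannon entropy of each row of a predicted probability matrix and flags rows whose entropy exceeds the threshold as possible unknown traffic.

```python
import math


def row_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    # Zero-probability classes contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def flag_unknown(prob_matrix, threshold):
    """Return one boolean per row: True if the row's entropy exceeds threshold."""
    return [row_entropy(row) > threshold for row in prob_matrix]


# Hypothetical classifier outputs for three flows over four known classes.
predictions = [
    [0.97, 0.01, 0.01, 0.01],  # confident prediction, low entropy
    [0.25, 0.25, 0.25, 0.25],  # uniform prediction, maximum entropy ln(4)
    [0.70, 0.10, 0.10, 0.10],  # moderately confident prediction
]

# Assumed threshold; in practice it would be tuned on validation data.
THRESHOLD = 1.0

flags = flag_unknown(predictions, THRESHOLD)
```

With these assumed inputs, only the uniform row is flagged, since its entropy ln(4) is the largest possible for four classes. A lower threshold would flag more flows as unknown at the cost of rejecting more known traffic.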
Keywords/Search Tags: Network traffic classification, Domain feature, Generative adversarial network, Unbalanced data, Word2vec, Unknown network traffic