| As the Internet continues to expand, networks are growing larger and their topologies more complex. Network traffic is growing explosively, and the proportion of encrypted and malicious traffic within it keeps rising. It is therefore necessary to analyze network traffic in order to distinguish normal, encrypted, and malicious traffic. Traffic detection techniques continue to evolve alongside Internet technology, but traditional methods based on ports and payloads can no longer meet the demands of traffic detection in the era of big data. The development of machine learning and deep learning has opened new opportunities. Effective network traffic detection and analysis is therefore of great significance for network security protection and research. In this paper, we use machine learning and deep learning to study two aspects of network traffic detection: encrypted traffic detection and malicious traffic detection. The main contributions are as follows. (1) Existing encrypted traffic detection is limited to specific encryption protocols such as SSH and SSL/TLS, so it is not universal and cannot adapt to newly developed encryption protocols. To address this problem, this paper proposes a VPN encrypted traffic detection method based on machine learning. First, the data are preprocessed with the CICFlowMeter tool, with the flow timeout set to 15 s, 30 s, 60 s, and 120 s. Transport-layer statistics are extracted from the pcap files in the public VPN-NOVPN dataset, yielding more than 80 statistical features of network traffic. From these, the 23 time-related features are selected for the experiments. Two experimental scenarios are then set up for encrypted traffic detection. Scenario 1 implements the detection of 
encrypted traffic and non-encrypted traffic, while Scenario 2 implements the detection of the application itself and the specific tasks associated with it. Finally, experiments are carried out with the Decision Tree, Random Forest, and Gradient Boosted Decision Tree (XGBoost) algorithms, and the results are compared with those of the C4.5 and KNN algorithms. The experimental results show that the proposed method detects encrypted traffic effectively and that shorter flow timeout values improve model accuracy. With a flow timeout of 15 s, the XGBoost model performs best across the classification experiments, with accuracies of 0.920 (binary classification), 0.970 (NOVPN seven-class), 0.910 (VPN seven-class), and 0.870 (14-class). (2) Traditional machine learning-based malicious traffic detection methods cannot extract features automatically. To address this problem, this paper proposes a malicious traffic detection method based on an improved convolutional neural network (CNN). First, based on traffic granularity and protocol layer, malicious traffic is represented in two forms, "Session + All Layers" and "Session + L7". Second, the USTC-TK2016 tool is used to generate a total of 469,014 samples of normal and malicious traffic from the pcap files of a public dataset through data segmentation, traffic cleaning, image generation, and IDX format conversion. Features are then extracted and the samples classified through multiple convolution and pooling steps, achieving both binary and multiclass classification of malicious traffic. Finally, the proposed method is validated experimentally. The results show that, compared with other models such as the CNN_SInd RNN model, the Siamese Neural Network model, and the CNN-SVM model, the CNN-based malicious traffic detection method achieves the highest overall accuracy in multiclass classification. |
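
The image generation and IDX format conversion steps in contribution (2) can be illustrated with a minimal sketch. It is not the USTC-TK2016 implementation. It assumes that each session's raw bytes are truncated or zero-padded to 784 bytes and viewed as a 28 x 28 grayscale image, a size the abstract does not state, and that the output follows the MNIST-style IDX layout. The names `session_to_image`, `to_rows`, `encode_idx3`, and `encode_idx1` are hypothetical.

```python
import struct
from typing import List

IMAGE_SIDE = 28                          # assumed image width and height
IMAGE_BYTES = IMAGE_SIDE * IMAGE_SIDE    # 784 bytes per sample (assumption)


def session_to_image(payload: bytes) -> bytes:
    """Truncate or zero-pad a session's raw bytes to exactly IMAGE_BYTES."""
    trimmed = payload[:IMAGE_BYTES]
    return trimmed + b"\x00" * (IMAGE_BYTES - len(trimmed))


def to_rows(image: bytes) -> List[List[int]]:
    """View the flat byte string as a 28 x 28 grid of gray values (0-255)."""
    return [
        list(image[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE])
        for r in range(IMAGE_SIDE)
    ]


def encode_idx3(images: List[bytes]) -> bytes:
    """Pack fixed-size images into the IDX unsigned-byte, 3-dimension layout."""
    # Magic number: two zero bytes, type code 0x08 (unsigned byte), 3 dimensions,
    # followed by the big-endian sizes: image count, rows, columns.
    header = struct.pack(
        ">BBBBIII", 0, 0, 0x08, 3, len(images), IMAGE_SIDE, IMAGE_SIDE
    )
    return header + b"".join(images)


def encode_idx1(labels: List[int]) -> bytes:
    """Pack class labels (0-255) into the IDX unsigned-byte, 1-dimension layout."""
    header = struct.pack(">BBBBI", 0, 0, 0x08, 1, len(labels))
    return header + bytes(labels)


if __name__ == "__main__":
    # Two toy "sessions": one shorter and one longer than 784 bytes.
    sessions = [b"\x16\x03\x01\x02\x00", bytes(range(256)) * 4]
    images = [session_to_image(s) for s in sessions]
    print(len(encode_idx3(images)), len(encode_idx1([0, 1])))
```

Fixing the sample length gives every session the same input shape, which a CNN with fixed-size convolution and pooling layers requires. Bytes beyond the cutoff are discarded in this sketch.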