
Data-Driven Based Encrypted Traffic Protocol And Program Identification

Posted on: 2023-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhang
Full Text: PDF
GTID: 2558307100470224
Subject: Computer Science and Technology
Abstract/Summary:
With the development of network technology and people's growing awareness of network privacy protection, the proportion of encrypted traffic in current network traffic is increasing. More and more applications use encryption protocols to avoid transmitting network content in plaintext and to hide network addresses. While network encryption technology effectively protects users' network security and privacy, it also brings many challenges to network regulation. For example, some malicious actors use encryption protocols to encrypt their data transmission and engage in illegal network behavior. Research on the identification of encrypted network traffic is therefore of great significance for the effective monitoring and management of networks. Although intelligent methods represented by deep learning have been widely used in encrypted traffic identification, existing research still faces problems such as uneven traffic data distribution in distributed environments and low application identification accuracy caused by the layer-by-layer encryption of The Onion Router (Tor) protocol. The main work carried out in this thesis to address these issues is as follows.

(1) To address the drop in accuracy of encrypted traffic identification models caused by uneven data distribution in distributed environments, this thesis proposes a data-balancing federated learning framework based on autoencoders. In this framework, each client participating in federated model training first trains an autoencoder model; the trained autoencoder then performs data augmentation on the classes for which the client has only a small amount of local data, in order to balance the data set; finally, the client uses the balanced data set to train the federated model. The experimental results show that the proposed federated learning framework improves the identification accuracy of encrypted traffic protocols by 9% compared with existing federated learning training schemes.

(2) To address the low application identification accuracy caused by the Tor protocol's use of multiple layers of encryption, this thesis proposes a scheme for classifying Tor applications based on time-cumulative features. In this scheme, the original traffic is first processed to generate a time sequence that describes the fluctuation of the traffic data, and this sequence is used as the raw representation of the encrypted traffic in order to better describe its temporal characteristics. Then, a classification model based on a gated recurrent unit (GRU) neural network is designed; the processed time sequences are used as the model's input to extract the latent temporal features of different Tor applications, which enables program-level classification of Tor traffic. The experimental results show that the proposed scheme achieves 95% identification accuracy in the Tor program identification task.
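As an illustration of the time-sequence preprocessing described in (2), the following is a minimal sketch only. The abstract does not specify how the time-cumulative features are built, so everything concrete here is an assumption: the function name `time_cumulative_sequence`, the representation of a packet as a (timestamp, signed size) pair with outgoing packets positive and incoming packets negative, the fixed bin width, and the fixed number of bins. The sketch bins packet sizes by arrival time and takes a running sum, which gives a fixed-length sequence of the kind a GRU classifier could consume. The GRU model itself is not shown.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple

# Assumed packet representation (not from the thesis):
# (timestamp in seconds, signed size in bytes; outgoing > 0, incoming < 0).
Packet = Tuple[float, int]


def time_cumulative_sequence(
    packets: Sequence[Packet],
    bin_width: float = 0.1,
    num_bins: int = 50,
) -> List[int]:
    """Bin signed packet sizes by arrival time, then take a running sum.

    Returns a list of length num_bins. Packets that arrive after the
    window of num_bins * bin_width seconds are ignored.
    """
    if not packets:
        return [0] * num_bins

    start = min(ts for ts, _ in packets)  # time origin of the flow
    per_bin = [0] * num_bins
    for ts, size in packets:
        idx = int((ts - start) / bin_width)
        if idx < num_bins:
            per_bin[idx] += size

    # The running sum turns per-bin volumes into a cumulative curve over time.
    return list(accumulate(per_bin))


if __name__ == "__main__":
    flow = [(0.0, 100), (0.05, -40), (0.25, 200), (0.31, -60)]
    print(time_cumulative_sequence(flow, bin_width=0.1, num_bins=5))
```

A fixed number of bins keeps every flow the same length, which simplifies batching for a recurrent model; whether the thesis uses fixed-length or variable-length sequences is not stated in the abstract.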
Keywords/Search Tags: Encrypted traffic identification, Tor, Deep learning, Federated learning