
Research On Lightweight Traffic Classification Based On Knowledge Distillation

Posted on: 2022-11-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Wu
Full Text: PDF
GTID: 2518306758491564
Subject: Electronics and Communications Engineering
Abstract/Summary:
Network traffic classification plays an important role in traffic management, troubleshooting, network security, and other fields. Traditional network traffic classification methods have low accuracy and poor timeliness, and require extremely high operation and maintenance costs. With the success of deep learning in computer vision and natural language processing in recent years, researchers have tried to use deep learning to build end-to-end network traffic classification models. This approach offers high accuracy and good generalization and needs no hand-designed features, which has greatly promoted the development of network traffic classification. However, current deep learning models carry a large amount of parameter redundancy, consume substantial computing and storage resources, and cannot meet the growing needs of network traffic classification. Knowledge distillation is a powerful tool for solving this problem: the intra-class and inter-class information carried by soft labels can better guide the training process of the model and thereby achieve model compression. Against this background, this thesis proposes three lightweight network traffic classification models, based on heterogeneous offline distillation, gradual self-distillation, and contrastive-learning self-distillation, respectively. To better learn the temporal information of network traffic, LSTM is used as the basic network structure. The main contributions of this thesis are as follows:

The lightweight network traffic classification model based on heterogeneous offline distillation. After a teacher model with numerous parameters is trained, the logits it generates are used as soft labels carrying rich intra-class and inter-class information, and a self-adaptive temperature softens these soft labels to guide the training of the student model, completing knowledge transfer and model compression. Focal loss is used to overcome sample imbalance and the varying difficulty of samples. The experimental results show that the accuracy of the student model decreases by only 0.45% while its recognition speed increases by 72%, effectively improving the inference speed of the model while maintaining accuracy.
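A minimal sketch of the student objective described above, assuming PyTorch; the function names and the values of T, alpha, and gamma are illustrative rather than taken from the thesis, and the thesis's self-adaptive temperature is simplified here to a fixed T.

    import torch
    import torch.nn.functional as F

    def focal_loss(logits, targets, gamma=2.0):
        # Focal loss: down-weights easy samples so training focuses on
        # hard and minority-class traffic flows.
        log_p = F.log_softmax(logits, dim=-1)
        ce = F.nll_loss(log_p, targets, reduction="none")
        p_t = torch.exp(-ce)  # model's probability for the true class
        return ((1.0 - p_t) ** gamma * ce).mean()

    def distill_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.7):
        # Soft-label term: KL divergence between temperature-softened
        # teacher and student distributions; T*T restores gradient scale.
        soft_teacher = F.softmax(teacher_logits / T, dim=-1)
        log_soft_student = F.log_softmax(student_logits / T, dim=-1)
        kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * T * T
        # Hard-label term uses focal loss to handle class imbalance.
        return alpha * kd + (1.0 - alpha) * focal_loss(student_logits, targets)

In offline distillation, teacher_logits would be precomputed from the frozen teacher, and only the student (here, a smaller LSTM) is updated.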
The lightweight network traffic classification model based on gradual self-distillation uses the model with the best classification performance in the early stage of training as a teacher to guide training in the later stage. Early in training, the teacher model's accuracy is low and its knowledge is unreliable; as accuracy improves, the teacher becomes increasingly reliable and should be assigned a higher weight. This thesis designs an adaptive weight function that greatly improves the student model's confidence in the learned knowledge and accelerates convergence. The experimental results show that the model trained in this way has stronger learning ability and better classification performance than the model trained directly without distillation: recall increases by 2.54% and F1 by 2.16%, and the F1 scores of the P2P and VPN-Streaming classes reach 100%.

The lightweight network traffic classification model based on contrastive-learning self-distillation augments the data by randomly erasing parts of each sample, and the difference between the original sample and the new sample serves as a soft label in training (see the sketch at the end of this abstract). This prevents the model from relying too much on local features and forces it to mine more feature information, improving generalization. The experimental results show that in this way majority and minority classes contribute equally to the model, and the model learns richer and more comprehensive feature information, effectively addressing sample imbalance and hard-to-learn features. The gain in precision is the most pronounced, at 3.06%, while F1 increases by 2.30%.

Experimental comparison and analysis show that the lightweight network traffic classification model based on heterogeneous offline distillation achieves the best classification performance, with an accuracy of 99.52%. The teacher model, having more parameters, learns the sample feature information better, and the heterogeneous offline distillation algorithm proposed in this thesis effectively transfers that knowledge to the student model.
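As one plausible reading of the contrastive-learning self-distillation described above (the abstract does not specify how the "difference" between the original and erased samples is computed), the sketch below treats the model's prediction on the original sample as the soft target for a randomly erased view of the same sample. Again PyTorch; random_erase, consistency_loss, erase_frac, and T are illustrative names and values.

    import torch
    import torch.nn.functional as F

    def random_erase(x, erase_frac=0.1):
        # Zero out a random contiguous span of each flow's feature
        # sequence (x has shape batch x seq_len x features).
        b, t, _ = x.shape
        span = max(1, int(t * erase_frac))
        out = x.clone()
        for i in range(b):
            start = torch.randint(0, t - span + 1, (1,)).item()
            out[i, start:start + span] = 0.0
        return out

    def consistency_loss(model, x, T=2.0):
        # The prediction on the original sample acts as a soft label
        # for the erased view, discouraging reliance on local features.
        with torch.no_grad():
            soft = F.softmax(model(x) / T, dim=-1)
        log_student = F.log_softmax(model(random_erase(x)) / T, dim=-1)
        return F.kl_div(log_student, soft, reduction="batchmean") * T * T

This auxiliary term would be added to the ordinary classification loss, so the network distills knowledge from itself across augmented views of the same flow.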
Keywords/Search Tags: Network Traffic Classification, Deep Learning, Heterogeneous Offline Knowledge Distillation, Gradual Self-Distillation, Contrastive-Learning Self-Distillation