| Accurate classification of network traffic is a crucial aspect of network management and security. While traffic encryption protects user privacy, it also makes traffic classification and management harder. Machine learning, and deep learning in particular, has proven effective for classifying large and complex encrypted traffic because it can automatically learn nonlinear features. However, deep learning-based classification models are highly sensitive to the accuracy of the labels in a dataset. Inaccurate labels in a dataset are known as noisy labels. Encrypted traffic datasets are no exception: noisy labels can arise from data collection errors, data labeling errors, malicious attacks, and other factors. Although a deep neural network can in theory be trained on any dataset, a model trained on data with label noise memorizes the noise and overfits to the noisy labels. Therefore, obtaining accurate classification models through noise-tolerant training on encrypted traffic datasets that contain label noise is a crucial and practical challenge in the field of encrypted traffic classification. This thesis addresses this issue by visualizing the original network traffic data, introducing label noise that simulates a real environment, and applying noise-tolerant learning methods to encrypted traffic through symmetric learning and active-passive learning. The main objective of this approach is to enhance the accuracy and robustness of the model in the presence of label noise. The main contributions of this study are as follows. 1. A new noise-tolerant learning method for encrypted traffic classification is proposed, which uses a symmetric learning-based approach to enhance the model's accuracy and robustness in the presence of label noise. The proposed method uses residual blocks of smaller dimension to alleviate the vanishing gradient problem, thereby enabling deeper mining
of traffic feature information. Additionally, the approach combines the cross-entropy loss function and the reverse cross-entropy loss function through symmetric learning, reducing the overfitting of simple classes by increasing the cross-entropy penalty and reducing the underfitting of difficult classes by increasing the reverse cross-entropy penalty. This method learns effectively under different noise levels and performs better than existing methods. 2. The robust loss function of the symmetric learning-based method depends heavily on the noise type and intensity. To address this, the concepts of active and passive learning are introduced to generalize the effect of symmetric learning to encrypted network traffic datasets with noise. The active function is used for the classification task while the passive function is used for noise-tolerant learning, and the two are combined. Normalizing the loss function enhances the efficiency of the gradient descent algorithm, avoids numerical overflow and instability, and solves the underfitting problem of the current model, thereby improving its classification accuracy on noisy datasets. |
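The loss combinations described in the abstract can be illustrated with a minimal sketch, written in plain Python for a single sample. It is not the thesis implementation. The abstract does not give the exact formulas, so the sketch assumes the forms commonly used in the noisy-label literature: a symmetric loss `alpha * CE + beta * RCE`, where RCE is the term usually called reverse cross-entropy, and an active-passive loss that pairs a normalized cross-entropy (assumed here as the active term) with RCE (assumed as the passive term). The function names, the weights `alpha` and `beta`, the `log_zero = -4.0` clipping constant, and the uniform label-flipping noise model are all illustrative assumptions.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def inject_symmetric_noise(labels, num_classes, rate, seed=0):
    """Assumed noise model: with probability `rate`, replace a label
    with a different class chosen uniformly at random."""
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < rate:
            y = rng.choice([c for c in range(num_classes) if c != y])
        noisy.append(y)
    return noisy


def cross_entropy(probs, label, eps=1e-7):
    """Standard cross-entropy for one sample with a hard label."""
    return -math.log(max(probs[label], eps))


def reverse_cross_entropy(probs, label, log_zero=-4.0):
    """-sum_k p_k * log(q_k) with q the one-hot label.
    log(1) = 0 for the labelled class; log(0) is replaced by `log_zero`."""
    return -log_zero * (1.0 - probs[label])


def symmetric_cross_entropy(probs, label, alpha=1.0, beta=1.0):
    """Weighted sum of cross-entropy and reverse cross-entropy."""
    return (alpha * cross_entropy(probs, label)
            + beta * reverse_cross_entropy(probs, label))


def normalized_cross_entropy(probs, label, eps=1e-7):
    """Cross-entropy divided by its sum over all possible labels,
    which bounds the value to [0, 1]. Assumes at least two classes."""
    denom = -sum(math.log(max(p, eps)) for p in probs)
    return cross_entropy(probs, label, eps) / denom


def active_passive_loss(probs, label, alpha=1.0, beta=1.0):
    """Assumed pairing: normalized CE (active) plus reverse CE (passive)."""
    return (alpha * normalized_cross_entropy(probs, label)
            + beta * reverse_cross_entropy(probs, label))


# Usage: corrupt some labels, then score one prediction under both losses.
clean = [0, 1, 2, 1, 0]
noisy = inject_symmetric_noise(clean, num_classes=3, rate=0.4)
probs = softmax([2.0, 0.5, -1.0])
print(noisy)
print(symmetric_cross_entropy(probs, 0), active_passive_loss(probs, 0))
```

Because the normalized term is bounded and the reverse term is linear in the predicted probability, a confidently wrong label contributes a bounded loss, unlike plain cross-entropy. This is one common argument for why such combinations tolerate label noise.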