| As the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols play an increasingly important role in network communication, more and more malicious software uses SSL/TLS to communicate. Traffic identification is a key part of network security management and a necessary step in detecting attacks. Currently, the field lacks both authoritative, publicly available labeled datasets of malicious encrypted traffic and established labeling methods, so large-scale labeling of encrypted traffic is difficult. In addition, the SSL/TLS traffic generated by malicious software suffers from class imbalance. To address these problems, this thesis focuses on the SSL/TLS encrypted traffic generated by malicious software. The main research contents are as follows: 1. To address the difficulty of obtaining labeled malicious encrypted traffic data, this thesis proposes an encrypted traffic identification method based on self-supervised contrastive learning, which decouples the feature extractor from the classifier. The method has two stages: pre-training and fine-tuning. In pre-training, the feature extractor learns general features from a large amount of unlabeled data; in fine-tuning, the model learns high-level features from a small amount of labeled data. The method builds on BYOL: it takes the TCP payload and the packet-length sequence as input and uses statistical features as the pretext for pre-training the feature extractor. The classifier is then trained with labeled data. Experiments demonstrate the validity of the method. 2. To address the class imbalance of malicious encrypted traffic data, this thesis proposes a detection method based on fine-tuning, which introduces a supervised contrastive learning loss during fine-tuning to widen the distance between different categories. Experimental results show that this method effectively improves the performance of the model. 3. This thesis designs and implements a machine-learning-based prototype system for identifying malicious SSL/TLS encrypted traffic, comprising traffic capture, data processing, traffic detection, and visualization. |
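
The two training objectives named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation, which the abstract does not detail. It assumes the standard BYOL regression loss (2 minus twice the cosine similarity between the online prediction and the target projection), the usual exponential-moving-average update of the target network, and a common formulation of the supervised contrastive loss. The function names (`byol_loss`, `ema_update`, `supervised_contrastive_loss`), the decay rate `tau`, and the `temperature` are hypothetical choices made for illustration, and plain Python lists stand in for the feature vectors a real feature extractor would produce.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def byol_loss(online_prediction, target_projection):
    """BYOL-style loss: 2 - 2 * cosine similarity; 0 when the two views agree."""
    p = l2_normalize(online_prediction)
    z = l2_normalize(target_projection)
    return 2.0 - 2.0 * dot(p, z)


def ema_update(target_params, online_params, tau=0.99):
    """Move target weights slowly toward online weights (tau is an assumed value)."""
    return [tau * t + (1.0 - tau) * o for t, o in zip(target_params, online_params)]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Pull same-label embeddings together and push other labels apart.

    For each anchor, the loss is the mean negative log-probability of its
    positives under a softmax over all other samples. Anchors without any
    positive in the batch are skipped.
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        logits = {j: dot(z[i], z[j]) / temperature for j in range(n) if j != i}
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


if __name__ == "__main__":
    # Toy 2-D embeddings: two flows per class (hypothetical values).
    emb = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]]
    print("pre-training loss:", byol_loss(emb[0], emb[1]))
    print("fine-tuning loss:", supervised_contrastive_loss(emb, [0, 0, 1, 1]))
```

In this sketch the supervised contrastive term is small when flows with the same label already cluster together and large when they are interleaved with other classes, which is the sense in which such a loss widens the distance between categories. How it is weighted against the classification loss during fine-tuning is a design choice the abstract does not specify.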