
Research on Binary Protocol Format Reverse Technology Based on Network Traffic

Posted on: 2022-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Yang
Full Text: PDF
GTID: 2518306731997909
Subject: Cyberspace security
Abstract/Summary:
As the number of network applications grows, network protocols are becoming increasingly complex and private. A considerable number of network applications communicate over private protocols that have no public specification. Private protocols play an important role in protecting user privacy and intellectual property rights, but they also pose severe challenges to network security. Obtaining the protocol specification is an essential step in network security analysis tasks such as vulnerability research, penetration testing, attack detection and malware analysis. To meet security analysts' need for protocol specifications, more and more studies use protocol reverse engineering to analyze private protocols. With the increasing complexity and diversity of private protocols in real networks, achieving automatic and efficient reverse analysis has become the main research topic in protocol reverse engineering.

Protocol format reverse engineering is an important part of protocol reverse engineering: protocol state machine inference and the subsequent applications of protocol reverse engineering depend heavily on the quality of the inferred protocol format. Depending on whether character encoding is used, protocols can be broadly divided into binary protocols and text protocols. Because binary protocols lack character encoding and delimiters, their format reverse engineering is currently both the priority and the main difficulty in this area. Focusing on binary protocol format reverse engineering, this thesis uses network traffic analysis to address three problems: unknown traffic clustering, message type identification and message field segmentation. The main contributions are summarized as follows:

1. To address unknown traffic clustering for binary protocols, an unknown traffic clustering method based on transfer learning is proposed. First, the feature dimension is reduced with a convolutional auto-encoder. Then, a Deep Adaptation Networks model is constructed based on transfer learning theory to improve the quality of the clustering features generated for unknown traffic. Finally, the Canopy algorithm used to determine the number of clusters is improved, and the K-means algorithm is used for clustering. The experimental results show that the features generated by the Deep Adaptation Networks model achieve the highest clustering purity, 99.63%, which is better than both the features generated by the convolutional auto-encoder model and the original features. The mean and median of the number of center points computed by the improved Canopy algorithm are closer to the real number of categories, and the distribution of that number is also more concentrated. This thesis uses transfer learning to learn deep transferable features from known traffic classification for unknown traffic clustering and improves the method of determining the number of clusters before clustering, which provides a new approach to traffic clustering for private protocols.

2. To address message type identification for binary protocols, a message type identification method based on data mining is proposed. First, each message is represented as an n-gram sequence. Second, the Key Continuous Sequence Pattern algorithm is designed to quickly generate position-related candidate key fields. Then, the keyword probability constraint relation is constructed, and the probability of each candidate field is calculated with a factor graph model. Finally, the candidate field with the highest probability is selected as the keyword that distinguishes the message type. The experimental results show that the proposed method can achieve a 100% V-measure in message type identification for various protocols, which is almost the same as the state-of-the-art technique NETPLIER and much higher than the two other techniques, NEMETYL and Netzob. In the same experimental environment, the running time of the proposed method is less than 100 seconds at all three data scales, while the running times of the other three methods are longer and increase sharply as the data scale grows. This thesis proposes a rapid message type identification method based on data mining, which lays a solid foundation for the subsequent steps and the practical application of protocol reverse engineering.

3. To address message field segmentation for binary protocols, a message field segmentation method based on a probability model is proposed. The method first aligns messages of the same type. Second, the statistical characteristics of field boundaries are analyzed along two dimensions: the internal structure of a message and the change of values between messages. Then, a probability model is constructed to aggregate all the characteristics. Finally, the field boundaries are generated and corrected based on the probabilities calculated by the model. The experimental results show that the proposed method can achieve 79% accuracy, 91% recall and an 84% F1 value in identifying message field boundaries, which are higher than the results of the related techniques NEMSYS, Pro Seg and NETPLIER. In the robustness testing, the boundary identification and field segmentation performance of the proposed method decreases as the proportion of confused data increases, but it remains better than that of the other four related methods. This thesis analyzes five statistical characteristics of field boundaries and uses a probabilistic method to combine all of them for boundary judgment, which provides reliable support for inferring the protocol format specification.
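
The n-gram representation mentioned in the second contribution can be illustrated with a minimal sketch. This is not the Key Continuous Sequence Pattern algorithm from the thesis, whose details the abstract does not give; it only shows, under assumptions, how a binary message might be turned into position-tagged byte n-grams and how often each (offset, n-gram) pair recurs across messages. The function names `byte_ngrams` and `positional_ngram_support` and the sample messages are hypothetical.

```python
from collections import Counter


def byte_ngrams(message: bytes, n: int = 2):
    """Return (offset, n-gram) pairs for every window of n consecutive bytes."""
    return [(i, message[i:i + n]) for i in range(len(message) - n + 1)]


def positional_ngram_support(messages, n: int = 2):
    """Count in how many messages each (offset, n-gram) pair occurs.

    Assumption: a pair shared by many messages at the same offset is a
    plausible candidate for a position-related key field. This is a plain
    frequency count, not the mining algorithm described in the thesis.
    """
    support = Counter()
    for msg in messages:
        # set() makes each pair count at most once per message
        support.update(set(byte_ngrams(msg, n)))
    return support


# Hypothetical 4-byte messages, invented for illustration only.
messages = [
    bytes.fromhex("0101000a"),
    bytes.fromhex("0101000b"),
    bytes.fromhex("0102000c"),
]
support = positional_ngram_support(messages, n=2)

# List the pairs shared by at least two messages.
for (offset, gram), count in sorted(support.items()):
    if count >= 2:
        print(offset, gram.hex(), count)
```

In this toy data, the pair at offset 0 with bytes `0101` is shared by the first two messages, so a real system might treat it as one candidate for a type-distinguishing field and then score the candidates, for example with the probabilistic model the abstract describes.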
Keywords/Search Tags: Network traffic analysis, Protocol format reverse, Unknown traffic clustering, Message type identification, Message field segmentation