Research On Unknown Protocol Classification And Analysis Method

Posted on: 2020-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: P H Zhu
Full Text: PDF
GTID: 2428330572473647
Subject: Computer Science and Technology
Abstract/Summary:
With the continued development of Internet technology, the number of unknown application-layer protocols keeps growing. Traffic generated by these protocols accounts for an increasing share of Internet traffic and poses a serious threat to network security. Current protocol analysis tools can classify and analyze known protocols but cannot handle unknown ones, so research on classifying and analyzing unknown protocols is of great significance to network management and to maintaining network security. This thesis studies unknown application-layer protocols for which no protocol documentation is available. It addresses three stages of unknown protocol analysis (protocol classification, message type classification, and format analysis) and proposes a classification and analysis method for unknown protocols.

In unknown protocol classification, the main problems are low classification accuracy and the need to tune parameters manually. To address them, this thesis classifies unknown protocols with a semi-supervised method based on the statistical characteristics of protocol flows. An average precision reduction index is introduced into the neural network classification model to select features, and a Canopy-Kmeans-based clustering method then classifies unknown protocols without manually set parameters. Compared with current advanced methods for classifying unknown protocols, classification accuracy improves by 7%, 5%, and 2%, respectively, under different statistical characteristics of protocol flows, and the average running time of the clustering method is reduced by more than half.

In unknown protocol message type classification, low classification accuracy is the main problem in current research. This thesis uses sliding windows of different lengths and a data mining method to extract frequent sequences of different lengths, which makes up for the shortcomings of previous n-gram-based methods. Feature weights are set according to left and right entropy, and a DPCA-based clustering method is used to cluster unknown protocol message types. Experiments on standard data sets from the field of network intrusion detection show that the average classification recall for unknown protocol message types is 7% higher than that of current message type classification methods.

In unknown protocol format analysis, the main problem is the low accuracy of format inference. This thesis uses multiple sequence alignment to analyze unknown protocol formats. First, the similarity calculation for protocol data is studied: a new similarity calculation method is proposed that adds a length factor to the Levenshtein distance formula, and it is used to calculate the average similarity of protocol data sets. Depending on the similarity of the protocol data, different format analysis steps are proposed based on the DiAlign and T-Coffee algorithms, and the evolutionary tree generation algorithm used in multiple sequence alignment is improved to raise the accuracy of format analysis. Experiments on the DEFCON CTF 2018 data set show that, compared with a Needleman-Wunsch-based method, the proposed method infers the formats of the TFTP and HTTP/2 protocols more accurately.
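
The abstract does not give the formula for the length-aware similarity, so the following Python sketch is only an illustration of the general idea and not the method proposed in the thesis. It assumes one common normalization, dividing the Levenshtein distance by the length of the longer message, and averages that score over all message pairs. The function names `levenshtein`, `similarity`, and `average_similarity` are hypothetical.

```python
from itertools import combinations


def levenshtein(a: bytes, b: bytes) -> int:
    """Classic dynamic-programming edit distance between two byte strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            current.append(min(
                previous[j] + 1,         # delete x from a
                current[j - 1] + 1,      # insert y into a
                previous[j - 1] + cost,  # substitute x with y (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: bytes, b: bytes) -> float:
    """Assumed length normalization: 1 - distance / max(len(a), len(b))."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty messages are treated as identical
    return 1.0 - levenshtein(a, b) / longest


def average_similarity(messages: list[bytes]) -> float:
    """Mean pairwise similarity over a set of protocol messages."""
    pairs = list(combinations(messages, 2))
    if not pairs:
        return 1.0  # fewer than two messages: nothing to compare
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    sample = [b"GET /a", b"GET /b", b"PUT /a"]
    print(average_similarity(sample))
```

An average score of this kind could serve as the switch between alignment strategies that the abstract describes, for example choosing one procedure for highly similar data sets and another for dissimilar ones. Any threshold used for that choice would be an additional assumption, since the abstract does not state one.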
Keywords/Search Tags: unknown protocol, protocol classification, reverse engineering, feature extraction