
Classification Of Bit-stream Unknown Protocol

Posted on: 2017-04-04 | Degree: Master | Type: Thesis
Country: China | Candidate: H C Zhou | Full Text: PDF
GTID: 2308330485486585 | Subject: Computer technology
Abstract/Summary:
Today, the Internet and other forms of telecommunication are increasingly widespread and mature. A growing number of protocols are in use, and for a considerable share of them no development documentation has been published. Such unknown protocols cannot be parsed by existing protocol analysis tools, yet they may pose a greater threat to network and information security. For administrators with strict cyber security requirements, identifying and parsing as many of these unknown protocols as possible is therefore of great importance.

To identify completely unknown protocols, this thesis proposes an unknown protocol classification approach based on data mining. The objects of study are captured bit-stream data frames of unknown protocols, and the goal is to classify, characterize and make use of these frames according to protocol type. Building on clustering and classification theory, the study first pre-processes the unknown protocol data frames and then clusters them with an improved K-Means algorithm and an improved AGNES algorithm. The two algorithms offer different solutions to the protocol clustering problem, and each has its own advantages and suitable application scenarios. The clustering results are then evaluated, and the clusters that score well are labeled and categorized. Finally, a protocol classification model is designed based on Bayesian theory; it learns from the clustering results produced by K-Means or AGNES and can then be used to identify unknown protocols.

To compensate for the shortcomings of the traditional K-Means algorithm, methods for calculating the K value and the initial cluster centers were designed. As a result, the K value no longer has to be specified in advance and the initial cluster centers are no longer selected at random, which greatly improves usability and accuracy.
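The abstract does not say how the K value and the initial cluster centers are calculated. Purely as an illustration of how both can be derived from the data rather than supplied or drawn at random, the sketch below uses farthest-first selection with Hamming distance. This is a swapped-in, well-known technique and not the thesis's method. The function names, the `min_distance` threshold and the assumption that frames are equal-length bit strings are all hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))


def pick_centers(frames: list[str], min_distance: int) -> list[str]:
    """Pick initial centers deterministically; K is len(result), not an input.

    Assumption: frames are bit strings of equal length (for example,
    truncated to a common prefix during pre-processing).
    """
    if min_distance < 1:
        raise ValueError("min_distance must be at least 1")
    centers = [frames[0]]  # fixed starting point instead of a random one
    while True:
        # Distance from each frame to its nearest already-chosen center.
        nearest = [min(hamming(f, c) for c in centers) for f in frames]
        farthest = max(range(len(frames)), key=nearest.__getitem__)
        if nearest[farthest] < min_distance:
            return centers  # every frame is close to some center: stop
        centers.append(frames[farthest])


frames = ["00000000", "00000001", "11111111", "11111110", "00001111"]
centers = pick_centers(frames, min_distance=3)
print(len(centers), centers)
```

The centers returned here could seed an ordinary K-Means run. The threshold replaces K as the tuning parameter, which shifts the choice rather than removing it.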
An evaluation scheme based on information entropy was designed to assess the clustering results. To address the shortcomings of the traditional AGNES algorithm, the algorithm was combined with the characteristics of bit-stream data frames: similarity between data frames and similarity between clusters are defined in two different ways, and clusters that meet the requirements are extracted during the clustering process itself. Protocol data frames can thus be clustered quickly and effectively without specifying the number of clusters, and the resulting clusters carry a similarity evaluation.

The unknown protocol classification model is obtained by learning the clustering results with a Bayesian machine learning algorithm. The model first converts the bit-stream protocol data from binary to hexadecimal, cuts each frame into data units, and counts the frequency of every data unit. It then learns from the training data with a new algorithm based on Bayesian theory to produce a protocol identification model that can identify unknown bit-stream protocols quickly and efficiently.

The experiments use data sets published by Lincoln Laboratory. The results show that the improved clustering and classification models achieve higher accuracy in clustering and classifying unknown protocols and are highly practical.
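The classification steps described above (binary-to-hexadecimal conversion, cutting frames into data units, counting unit frequencies, Bayesian learning) can be sketched roughly as follows. This is a minimal sketch using a standard multinomial naive Bayes with Laplace smoothing, not the thesis's new algorithm, which the abstract does not specify. The two-hex-digit unit size, the function names and the toy frames and labels are assumptions made for illustration.

```python
import math
from collections import Counter


def to_units(bits: str, unit_hex_digits: int = 2) -> list[str]:
    """Convert a bit string to hexadecimal and cut it into data units."""
    padded = bits + "0" * (-len(bits) % 4)  # pad to whole hex digits
    hex_text = "".join(
        f"{int(padded[i:i + 4], 2):x}" for i in range(0, len(padded), 4)
    )
    return [
        hex_text[i:i + unit_hex_digits]
        for i in range(0, len(hex_text), unit_hex_digits)
    ]


def train(labelled_frames):
    """Count data-unit frequencies per cluster label."""
    unit_counts, frame_counts = {}, Counter()
    for bits, label in labelled_frames:
        frame_counts[label] += 1
        unit_counts.setdefault(label, Counter()).update(to_units(bits))
    vocabulary = {u for counts in unit_counts.values() for u in counts}
    return unit_counts, frame_counts, vocabulary


def classify(bits: str, model) -> str:
    """Return the label with the highest naive Bayes log-score."""
    unit_counts, frame_counts, vocabulary = model
    total_frames = sum(frame_counts.values())
    scores = {}
    for label, counts in unit_counts.items():
        total_units = sum(counts.values())
        score = math.log(frame_counts[label] / total_frames)  # prior
        for unit in to_units(bits):
            # Laplace smoothing; the extra +1 leaves room for unseen units.
            score += math.log(
                (counts[unit] + 1) / (total_units + len(vocabulary) + 1)
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training data: (frame bits, cluster label from clustering).
model = train([
    ("0100010100000000", "A"),
    ("0100010100000001", "A"),
    ("1111111100001111", "B"),
    ("1111111111110000", "B"),
])
print(classify("0100010100000010", model))
```

Working in log space avoids numeric underflow on long frames, and smoothing keeps a single unseen data unit from zeroing out a class.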
Keywords/Search Tags:Unknown protocol classification, Bit-stream unknown protocol, K-Means, AGNES, Bayesian theory