Research On Clustering Method For Unknown Protocol Recognition

Posted on: 2021-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Lin
Full Text: PDF
GTID: 2518306107450184
Subject: Computer technology
Abstract/Summary:
Network protocol identification means classifying the traffic generated by an application according to its representative characteristics and determining the application layer protocol to which that traffic belongs. Efficient protocol identification is important for improving network security and the management of network communications. However, unknown and undisclosed communication protocols keep emerging, and most existing identification methods no longer meet practical needs. Identifying unknown protocols has therefore become an urgent problem in the field, and clustering methods from unsupervised learning have become an active research topic for it.

Building on previous work, and combining the traditional payload-based identification approach with application layer protocol data as the research object, the thesis proposes an adaptive clustering method for unknown protocol identification based on the AGNES hierarchical clustering algorithm. The method uses the payload characteristics of the protocol data and groups protocols by similarity. In preprocessing, the unidirectional data flow of the application layer protocol is obtained through flow reassembly. For text-based protocols, a slicing operation filters out the effect of noise data on the similarity calculation, and the similarity between slice sets is used in place of the similarity between the protocol data themselves. The data are then clustered according to this similarity to obtain the final cluster set, and the result includes the similarity value as the evaluation criterion for clustering quality. The thesis also improves the similarity calculation in the AGNES algorithm by dividing it into two stages: one before clustering and one during clustering. This division avoids repeatedly calculating the similarity between application layer protocol data and improves the efficiency of the algorithm.

The method removes the need for prior knowledge that traditional protocol identification algorithms depend on, and it requires no training process, which saves time. It also avoids the limitation of most clustering algorithms that the number of target clusters must be fixed in advance, because it determines the number of clusters adaptively. Experiments show that the improved clustering algorithm can identify unknown protocols and that its efficiency is significantly better than that of the algorithm before the improvement.
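
The general idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not specify the slicing scheme, the similarity measure, the linkage rule, or the stopping criterion, so the choices here are assumptions made only for illustration. Specifically, slices are assumed to be overlapping n-byte substrings, slice-set similarity is assumed to be the Jaccard index, clusters are compared by average linkage, and merging stops when the best similarity falls below a hypothetical `threshold` parameter. All function names are hypothetical.

```python
from itertools import combinations


def slices(payload: bytes, n: int = 4) -> frozenset:
    """Cut a payload into overlapping n-byte slices (assumed slicing scheme)."""
    if len(payload) < n:
        return frozenset([payload])
    return frozenset(payload[i:i + n] for i in range(len(payload) - n + 1))


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard index of two slice sets (assumed similarity measure)."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def precompute_similarity(payloads, n: int = 4):
    """Stage 1, before clustering: compute every pairwise similarity once."""
    sets = [slices(p, n) for p in payloads]
    size = len(sets)
    sim = [[1.0] * size for _ in range(size)]
    for i, j in combinations(range(size), 2):
        sim[i][j] = sim[j][i] = jaccard(sets[i], sets[j])
    return sim


def cluster_similarity(c1, c2, sim) -> float:
    """Stage 2, during clustering: average linkage read from the stored table."""
    return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))


def agnes(payloads, threshold: float = 0.3, n: int = 4):
    """Bottom-up clustering that stops when no pair of clusters is similar enough.

    The number of clusters is not given in advance; it follows from the
    (assumed) similarity threshold. Returns lists of payload indices.
    """
    sim = precompute_similarity(payloads, n)
    clusters = [[i] for i in range(len(payloads))]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for a, b in combinations(range(len(clusters)), 2):
            s = cluster_similarity(clusters[a], clusters[b], sim)
            if s > best:
                best, pair = s, (a, b)
        if best < threshold:
            break  # remaining clusters are too dissimilar to merge
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

Calling `agnes(list_of_payloads)` on reassembled unidirectional flow payloads returns groups of indices, one group per presumed protocol. Because the pairwise table is built once in `precompute_similarity`, each merge step only averages stored values and never compares payloads again, which mirrors the two-stage split the abstract credits for the efficiency gain.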
Keywords/Search Tags:Unknown protocol, Protocol identification, Hierarchical clustering, Similarity calculation, Network management