Research On The Key Identification Techniques For Discrete Sequential Protocol Message Based On Format Signature Extraction

Posted on: 2018-11-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
GTID: 2348330563451269
Subject: Information and Communication Engineering
Abstract/Summary:
Current network traffic identification methods mainly take flows as the identification unit and classify them by their statistical features. In practice, however, receiving conditions are usually limited and complete flows are difficult to capture. The received data are mostly discrete sequential protocol messages or damaged fragments of them. Compared with flows, discrete sequential protocol messages suffer from two deficiencies: a lack of prior information and difficulty of feature extraction. Because they carry no flow information, flow statistical features cannot be used and the relationships between the messages that make up different flows are lost. They are also more chaotic, since the message order is disturbed and some messages are missing, which degrades the quality of the data set. As a result, format signature extraction is more constrained and more difficult. Moreover, the identification granularity shrinks to the single message, which makes the problem harder still. It is therefore necessary to study format signature extraction and message identification methods for discrete sequential protocol messages. This thesis targets discrete sequential protocol messages received from practical network environments, proposes approaches for constructing their protocol format signatures, and identifies them using the message as the identification unit. The main work and innovations are summarized as follows.

1. To solve the problem of fixed field format signature extraction and identification, a Discrete Sequential Protocol Message based Fixed Field Signature Construction (DSFSC) algorithm based on byte support is proposed. DSFSC improves the radius strategy of the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm so that the search radius no longer has to be set from human experience. A frequent pattern mining algorithm based on byte position splicing is proposed to improve mining efficiency. Finally, the fixed field format signature is obtained by applying the proposed filtering rules. Simulation results show that DSFSC does not depend on complete flows. Its average precision over six protocols exceeds 95% when a single message is the identification unit. Compared with the AdapSig algorithm, DSFSC achieves higher average precision, exceeding 90%, when flows are the identification unit.

2. To solve the problem of variable field format signature extraction and identification, a Discrete Sequential Protocol Message based Variable Field Signature Construction (VFSC) algorithm based on byte statistics is proposed. VFSC clusters discrete sequential protocol messages using the improved DBSCAN algorithm from DSFSC. Byte rate and byte discrete degree are introduced into the PrefixSpan algorithm so that variable-value keywords with an adaptive range can be extracted, which removes the need for manual partitioning. Several heuristic filtering rules, each aimed at a different type of redundancy in variable field format signatures, are proposed to obtain the final signature, and VFSC reaches message-level granularity. Simulation results show that the average precision of VFSC over seven protocols exceeds 95% when a single message is the identification unit, which is higher than that of the Apriori algorithm. In the identification of the ACARS protocol, VFSC is shown to discover message types that are not in the training set.

3. To solve the problem of outline signature extraction and identification, a Discrete Sequential Protocol Message based Outline Signature Construction (OSC) algorithm based on character statistical distribution is proposed. Discrete sequential protocol messages are converted into binary images by an established transformation model so that their outline features become prominent. The binary images are clustered with an improved image clustering algorithm that adapts its search range. The outline signature is obtained by a distance-weighted decision algorithm. Finally, discrete sequential protocol messages are identified by computing their cosine similarity with the outline signature. Simulation results show that the average similarity of the outline signatures extracted by OSC for five protocols exceeds 80%, and that the average recall over the five protocols also exceeds 80%. The results further show that OSC has some robustness to noise.
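
The final identification step of the third contribution, matching a message against an outline signature by cosine similarity, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not specify the transformation model or the distance-weighted decision rule, so the bit-unpacking in `to_bit_vector`, the plain averaging in `mean_signature`, the `n_bytes` window, and the `threshold` value are all hypothetical stand-ins chosen only to make the example self-contained.

```python
import math
from typing import Dict, List, Optional, Sequence


def to_bit_vector(message: bytes, n_bytes: int = 8) -> List[int]:
    """Assumed transformation: unpack the first n_bytes into a flat 0/1 vector.

    Shorter messages are zero-padded. This stands in for the thesis's
    (unspecified) message-to-binary-image model.
    """
    padded = message[:n_bytes].ljust(n_bytes, b"\x00")
    return [(byte >> shift) & 1 for byte in padded for shift in range(7, -1, -1)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_signature(messages: Sequence[bytes], n_bytes: int = 8) -> List[float]:
    """Assumed signature: per-pixel mean over one cluster of messages.

    The thesis uses a distance-weighted decision algorithm instead; a plain
    mean is swapped in here for simplicity.
    """
    vectors = [to_bit_vector(m, n_bytes) for m in messages]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def identify(
    message: bytes,
    signatures: Dict[str, List[float]],
    threshold: float = 0.75,  # arbitrary illustrative cut-off
    n_bytes: int = 8,
) -> Optional[str]:
    """Return the label of the most similar signature, or None if none reaches threshold."""
    vector = to_bit_vector(message, n_bytes)
    best_label, best_score = None, threshold
    for label, signature in signatures.items():
        score = cosine_similarity(vector, signature)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label
```

In this sketch a message that resembles no known signature is left unlabeled rather than forced into the nearest class, which is one plausible way to handle unknown or heavily damaged messages; the thesis may treat such cases differently.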
Keywords/Search Tags: Network Traffic Identification, Discrete Sequential Protocol Message, Format Signature Extraction, Fixed Field Extraction, Variable Field Extraction, Outline Signature Extraction