Research And Implementation Of Unknown Format Extraction For Binary Protocol Reverse Engineering

Posted on: 2017-12-18  Degree: Master  Type: Thesis
Country: China  Candidate: S Y Tao  Full Text: PDF
GTID: 2428330596459971  Subject: Information and Communication Engineering
Abstract/Summary:
Without a priori knowledge of protocol specifications, network protocol reverse engineering processes network traces or execution monitoring of undocumented protocols to automatically deduce message formats, semantic information, and state machines. Reverse engineering techniques for private network protocols play important roles in understanding network behavior, maintaining healthy network operations, improving the quality of network services, and building secure network environments. Format extraction is a principal process of protocol reverse engineering. In recent years, however, format extraction of binary fields has become a new challenge, because traditional approaches have dealt with text-based protocols rather than binary protocols. Two properties of binary protocols set them apart from what current methods handle. First, a binary field is often defined over only a few bits and field lengths vary widely, so field extraction requires a bit-oriented precision that traditional methods hardly achieve. Second, binary fields rarely use character encoding and take a wide variety of values, so fields run together transparently with no delimiters. Both the complex features of binary fields and the scarcity of prior information about them therefore make traditional methods inapplicable to field extraction in binary protocol reverse engineering.

Based on network traces, this thesis focuses on binary field extraction for binary protocol reverse engineering. To solve the above problems, it builds a theoretical model of binary field extraction based on statistical distributions to parse the variety of field values, presents approaches based on the dynamic and static characteristics of protocol fields to extract binary field boundaries, and designs a prototype system to demonstrate the performance. The main work and innovations are summarized as follows.

1. Targeting the features of binary fields, a theoretical model based on conditional random fields is introduced to analyze binary field boundaries. The differences between binary and text-based fields are described in a formal language. Then, a statistical model based on conditional random fields is proposed. For parameter determination, an auto-regressive moving average model is used to analyze the feature templates and weight coefficients. For parameter calculation, a forward-backward algorithm is used to compute the posterior probability distribution of field features. For prediction, an optimal estimate of the probability distribution is computed to delimit fields accurately, and the critical problems are summarized.

2. For binary field extraction in binary protocol reverse engineering, an approach called BinPI, based on variable field values, is presented to extract binary field boundaries. First, a silhouette coefficient is introduced into hierarchical clustering to determine the optimal number of clusters of binary frames. Second, a modified multiple sequence alignment algorithm, with redesigned matching and backtracking rules, is proposed to analyze binary field features through gap features. Finally, a Bayes decision model is used to describe field features and determine bit-oriented field boundaries, and the maximum a posteriori criterion is applied to obtain an optimal estimate of the protocol format. Experimental results indicate that BinPI's coverage in discovering actual fields, correctness in associating actual and inferred fields, and closeness of actual and inferred bit positions are at least 70%, 75%, and 85%, respectively.

3. For binary field extraction in binary protocol reverse engineering, an approach called BinFIM, based on invariant field values, is proposed to determine binary field boundaries. First, variance analysis is introduced into frequent item mining, based on the location distributions of ergodic fragments. A strong association with the field boundary characteristics can be retained through key frequent item sets of ergodic lengths. Then, based on a revised bidirectional maximum matching, a vote distribution algorithm is designed to generate vote-position characteristics related to field positions. Finally, a bit-oriented field boundary determination algorithm based on a modified ant colony algorithm is built, and the binary field boundaries are optimally estimated and precisely delimited. Simulations in a real environment show that BinFIM's coverage in discovering actual fields, correctness in associating actual and inferred fields, and closeness of actual and inferred bit positions are at least 70%, 70%, and 80%, respectively.

4. A prototype system of BinPI and BinFIM is implemented to serve network management and application security. Building on the theoretical research and experimental demonstration, a software implementation for binary protocol reverse engineering is obtained that infers the specification of binary protocols from actual traffic traces. Both the module architecture and the functional interfaces of the system are introduced, and the requirements of the prototype system are verified. The results show that the system effectively extracts binary fields and outperforms existing algorithms; it is compatible with traditional protocol reverse engineering and is also capable of binary protocol reverse engineering.
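As an illustration of the forward-backward step mentioned in item 1, the following is a minimal sketch of posterior marginals on a two-label linear chain (label 0 = inside a field, label 1 = field boundary at this position). The scores in `UNARY` and `PAIRWISE` are hypothetical values chosen for the example; the thesis's actual feature templates and weights are not given in the abstract, so this is not the author's model.

```python
def forward_backward(unary, pairwise):
    """Posterior marginals for a linear-chain model.

    unary[t][y]    : non-negative score of label y at position t
    pairwise[a][b] : non-negative score of label a followed by label b
    Returns marginals[t][y] = P(label at t is y | all scores).
    """
    n, k = len(unary), len(unary[0])
    # Forward pass: alpha[t][y] sums the scores of all prefixes ending in y.
    alpha = [[0.0] * k for _ in range(n)]
    alpha[0] = list(unary[0])
    for t in range(1, n):
        for y in range(k):
            alpha[t][y] = unary[t][y] * sum(
                alpha[t - 1][a] * pairwise[a][y] for a in range(k))
    # Backward pass: beta[t][y] sums the scores of all suffixes following y.
    beta = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for y in range(k):
            beta[t][y] = sum(
                pairwise[y][b] * unary[t + 1][b] * beta[t + 1][b]
                for b in range(k))
    z = sum(alpha[n - 1])  # partition function (sum over all labelings)
    return [[alpha[t][y] * beta[t][y] / z for y in range(k)]
            for t in range(n)]


# Hypothetical scores for five bit positions (assumed, not from the thesis).
UNARY = [[0.2, 0.8], [0.9, 0.1], [0.7, 0.3], [0.3, 0.7], [0.8, 0.2]]
PAIRWISE = [[0.6, 0.4], [0.9, 0.1]]  # adjacent boundaries are discouraged

marginals = forward_backward(UNARY, PAIRWISE)
# Positions where "boundary" is the more probable label.
boundaries = [t for t, (inside, boundary) in enumerate(marginals)
              if boundary > inside]
```

This unscaled version is only suitable for short sequences; longer ones would typically need log-space or per-step normalization to avoid underflow.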
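The first BinPI step in item 2, choosing the number of frame clusters by silhouette coefficient, can be sketched as follows. This is an assumption-laden illustration: it uses bitwise Hamming distance between equal-length frames and average-linkage agglomerative clustering, neither of which the abstract specifies, and the names `hamming`, `agglomerate`, `silhouette`, `best_cluster_count`, and the toy `FRAMES` are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def agglomerate(frames, k):
    """Average-linkage agglomerative clustering down to k clusters.

    Returns a list of clusters, each a list of frame indices.
    """
    clusters = [[i] for i in range(len(frames))]

    def linkage(c1, c2):
        total = sum(hamming(frames[i], frames[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        # Pick the closest pair of clusters (ties broken by index order).
        _, a, b = min((linkage(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return clusters


def silhouette(frames, clusters):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for i in cluster:
            if len(cluster) == 1:
                scores.append(0.0)
                continue
            # a: mean distance to the other members of the same cluster.
            a = sum(hamming(frames[i], frames[j])
                    for j in cluster if j != i) / (len(cluster) - 1)
            # b: mean distance to the nearest other cluster.
            b = min(sum(hamming(frames[i], frames[j]) for j in other)
                    / len(other)
                    for cj, other in enumerate(clusters) if cj != ci)
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_cluster_count(frames, k_max):
    """Cluster count in [2, k_max] with the highest mean silhouette."""
    return max(range(2, k_max + 1),
               key=lambda k: silhouette(frames, agglomerate(frames, k)))


# Toy frames (made up): two visibly different message types.
FRAMES = [bytes([0x01, 0x00, 0x10]), bytes([0x01, 0x00, 0x11]),
          bytes([0x01, 0x00, 0x12]), bytes([0xF2, 0xFF, 0x80]),
          bytes([0xF2, 0xFF, 0x81]), bytes([0xF2, 0xFF, 0x83])]

best_k = best_cluster_count(FRAMES, 5)
```

On this toy input the silhouette peaks at two clusters, matching the two message types; real traces would need a distance that tolerates variable-length frames.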
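The location-based frequent item mining that opens BinFIM in item 3 might look roughly like the sketch below. It is a simplification under stated assumptions: fragments are taken at byte rather than bit granularity, a single fragment length is mined at a time, and the variance analysis, voting, and ant colony steps are omitted. `frequent_fragments`, `SAMPLE_FRAMES`, and the support threshold are hypothetical.

```python
from collections import Counter


def frequent_fragments(frames, length, min_support):
    """Mine fragments that recur at the same offset across frames.

    Returns {(offset, fragment): support} for every `length`-byte fragment
    found at the same offset in at least a `min_support` fraction of frames.
    """
    counts = Counter()
    for frame in frames:
        for offset in range(len(frame) - length + 1):
            counts[(offset, frame[offset:offset + length])] += 1
    threshold = min_support * len(frames)
    return {key: n / len(frames)
            for key, n in counts.items() if n >= threshold}


# Toy frames (made up): a constant leading byte, then a mostly stable field.
SAMPLE_FRAMES = [b"\x7e\x01\x00\x10\xaa", b"\x7e\x01\x00\x11\xbb",
                 b"\x7e\x01\x00\x12\xcc", b"\x7e\x02\x00\x13\xdd"]

frequent = frequent_fragments(SAMPLE_FRAMES, length=2, min_support=0.75)
```

Offsets where frequent fragments start or stop recurring are candidate hints for invariant-field boundaries, which is the kind of signal later steps could vote on.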
Keywords/Search Tags: Protocol Reverse Engineering, Network Trace, Binary Protocol Reverse Engineering, Format Extraction, Field Extraction