
Research Of Unknown Binary Protocol Reverse Engineering Based On Network Trace

Posted on: 2019-06-22
Degree: Master
Type: Thesis
Country: China
Candidate: K Lu
GTID: 2428330596460615
Subject: Electronic and communication engineering
Abstract/Summary:
Network protocol reverse engineering recovers a protocol's field format, field semantics, and state machine without access to the protocol specification, either by analyzing captured network data or by monitoring how the server and client behave. Offense is the best defense: protocol reverse engineering matters for network monitoring and management, for network security, and for improving network service quality. With the growing popularity of smart home and wearable devices, including simple UAVs, more and more unknown protocols have appeared. The unknown protocols used by different manufacturers' products show up on the network as unknown text or binary traffic. The goal of this thesis is to classify binary protocol data that is captured from network traffic and whose protocol specification is unknown, and to extract its format, without introducing any prior knowledge of the protocol specification. The main work and contributions of this thesis are as follows:

1) After the traffic data is captured, the algorithms and schemes from the relevant references are reproduced. Traffic captured under different conditions is preprocessed through payload extraction, error checking, feature generation, feature selection, and related steps, which provides the data source for format extraction.

2) An improved protocol format extraction method based on frequent item mining is proposed. Existing schemes construct frequent items byte by byte, which does not fully suit binary protocols, and they lack a format extraction step. The new scheme first constructs the maximum item sets in which the shortest item is a nibble, then filters the frequent items by support and location entropy. After that, voting based on forward maximum matching, a technique from NLP, completes the format extraction. The simulation experiments report two accuracy figures: not less than 75% and not less than 90%.

3) An improved protocol format extraction method based on sequence alignment is proposed. Classical schemes have two problems: hierarchical clustering makes the time complexity too high, and sequence alignment with the Needleman-Wunsch algorithm often inserts many gaps, which causes position slippage. The new scheme first sets up a result sequence and a result function corresponding to it, then computes the similarity of every pair of sequences and, after alignment, merges the results into one mixed sequence. The comparison result is then output to the result function according to the similarity of the two sequences, and the format is extracted from the result function. The simulation experiments report two accuracy figures: not less than 80% and not less than 90%.

4) A B/S (browser/server) prototype system is implemented in Java, using the Spring Boot back-end framework and the Bootstrap front-end framework, to visualize the results of the two schemes above.
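
The support and location-entropy filter described in 2) can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes fixed-length nibble n-grams instead of maximum item sets, takes support to be the fraction of messages that contain an item, and takes location entropy to be the Shannon entropy of the offsets at which the item occurs. The names `to_nibbles` and `frequent_items`, the parameters, and the thresholds are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def to_nibbles(payload: bytes) -> list[int]:
    """Split every byte into its high and low 4-bit halves."""
    nibbles = []
    for byte in payload:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    return nibbles


def frequent_items(payloads, n=2, min_support=0.8, max_entropy=0.5):
    """Return {item: (support, entropy)} for nibble n-grams passing both filters.

    Assumption: an item is a tuple of n consecutive nibbles. A low entropy
    means the item tends to sit at one fixed offset, as a header field would.
    """
    message_count = Counter()          # item -> number of messages containing it
    offsets = defaultdict(Counter)     # item -> histogram of nibble offsets
    for payload in payloads:
        nibbles = to_nibbles(payload)
        seen = set()
        for i in range(len(nibbles) - n + 1):
            item = tuple(nibbles[i:i + n])
            offsets[item][i] += 1
            seen.add(item)
        message_count.update(seen)     # count each item once per message

    kept = {}
    for item, count in message_count.items():
        support = count / len(payloads)
        total = sum(offsets[item].values())
        entropy = sum(-(c / total) * math.log2(c / total)
                      for c in offsets[item].values())
        if support >= min_support and entropy <= max_entropy:
            kept[item] = (support, entropy)
    return kept


if __name__ == "__main__":
    msgs = [bytes.fromhex(h) for h in ("a10110", "a10220", "a10330")]
    print(frequent_items(msgs))
```

With these three toy payloads, the nibble pair `(0xA, 0x1)` appears in every message at offset 0 and is kept, while `(0x1, 0x0)` also appears in every message but at two different offsets, so its entropy exceeds the threshold and it is dropped.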
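
The gap insertion that 3) attributes to Needleman-Wunsch alignment can be seen in a small sketch of the classical algorithm applied to byte sequences. This shows the baseline only, not the improved scheme of the thesis; the scoring values (match +1, mismatch -1, gap -1) and the use of `None` as the gap marker are assumptions chosen for illustration.

```python
GAP = None  # assumed marker for an inserted gap


def needleman_wunsch(a: bytes, b: bytes, match=1, mismatch=-1, gap=-1):
    """Globally align two byte sequences; return (aligned_a, aligned_b, score)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(GAP)
            i -= 1
        else:
            out_a.append(GAP)
            out_b.append(b[j - 1])
            j -= 1
    return out_a[::-1], out_b[::-1], score[-1][-1]


if __name__ == "__main__":
    x, y, s = needleman_wunsch(bytes.fromhex("01020304"), bytes.fromhex("010304"))
    print(x, y, s)
```

Aligning `01 02 03 04` with `01 03 04` places a gap opposite `02`, so the bytes after the gap no longer share an offset with their counterparts in the original message. In a binary protocol with fixed-offset fields, this kind of shift is a plausible source of the position slippage mentioned above.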
Keywords/Search Tags: Network Flow, Binary Protocol Reverse Engineering, Frequent Item Mining, Sequence Alignment, Format Extraction