
Research On Semantic Inference Algorithm And Key Technologies Of Binary Protocol

Posted on: 2022-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
GTID: 2518306605965299
Subject: Communication and Information System
Abstract/Summary:
With the continuous development of network technology, the number of communication protocol types is growing rapidly to meet users' diverse information transmission needs. When no protocol design specification is available, protocol reverse engineering can extract message feature fields, identify their semantics, and finally output the protocol state machine by analyzing network protocol messages. Protocol reverse engineering is of great significance for ensuring network security, preventing network attacks, and improving the efficiency of network resource management and control. To improve communication efficiency and guarantee quality of service in different scenarios, private protocols mostly adopt a bit-oriented binary design and do not include delimiters. The message format is often agreed upon only by the two communicating parties, so field characteristics are fuzzy and field semantics are hard to distinguish. Therefore, when the protocol specification is unknown, accurately extracting the feature fields and identifying the semantics of each field is the main challenge of binary protocol reverse engineering. This thesis addresses the problems of feature field extraction and missing semantic representation rules for binary protocols. The main achievements are as follows:

This thesis summarizes the technical path and development trends of binary protocol reverse engineering, analyzes the research status and limitations of feature field extraction algorithms and semantic inference algorithms, and outlines the main challenges of binary protocol semantic recognition. The challenges include: how to make full use of the prior principles of protocol design during feature field extraction to mine latent features of the message format and improve extraction accuracy; how to represent and recognize the semantics of extracted feature fields and design semantic extension rules to improve the accuracy of semantic recognition; and how to build an integrated protocol reverse engineering platform to improve the accuracy and efficiency of protocol reverse engineering.

A hierarchical search method for "length" feature fields based on message length comparison is proposed, which solves the problem of extracting variable-length feature fields. Following the principles of protocol design, the method uses the length-based framing mechanism to search for "length" feature fields layer by layer, determine sub-frame boundaries, and parse each message layer by layer, which effectively improves the accuracy of feature field extraction. In tests on TLS protocol and TLV protocol packet data sets, the accuracy of the proposed algorithm reaches 90%.

A feature field semantic inference algorithm based on Bayesian classification is proposed to improve the recognition rate of unknown protocol semantics. The main steps are as follows. First, a feature field semantic recognition framework with a self-extending semantic database is designed; it forms a closed loop of protocol semantic representation, semantic storage and extension, and semantic recognition, and supports extension of the unknown protocol library. Second, the properties of the feature fields are identified according to the principles of protocol design. Third, the Bayesian-classification-based semantic inference algorithm is constructed to distinguish the semantics of feature fields. In tests on message samples from multiple protocol types and layers, including TCP and QUIC, more than eight semantic types are identified, and the accuracy of feature field semantic recognition reaches up to 90%.

The main components of the integrated protocol reverse engineering platform include a sample generation module, a protocol reverse engineering module, and a management and monitoring module, which together support feature field extraction and semantic inference. The platform provides a graphical interface and a human-computer interaction model, and integrates a deep learning framework to support the research and development of protocol reverse engineering.
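
The layer-by-layer search for "length" fields can be illustrated with a minimal sketch. This is not the thesis's algorithm, whose details the abstract does not give; it assumes a simplified rule in which a big-endian field of 1 or 2 bytes counts as a "length" field when its value equals the number of bytes that follow it in every captured message. The function names, the candidate widths, and the sample messages are all hypothetical, and the samples must differ in length for the comparison to be meaningful.

```python
from typing import List, Optional, Tuple


def find_length_field(msgs: List[bytes], widths=(2, 1)) -> Optional[Tuple[int, int]]:
    """Return (offset, width) of the first big-endian field whose value equals
    the number of bytes following it in every message, or None if there is none.
    Assumption: the length field counts only the bytes after itself."""
    shortest = min(len(m) for m in msgs)
    for offset in range(shortest):
        for width in widths:
            if offset + width > shortest:
                continue
            if all(
                int.from_bytes(m[offset:offset + width], "big") == len(m) - offset - width
                for m in msgs
            ):
                return offset, width
    return None


def search_layers(msgs: List[bytes], base: int = 0, depth: int = 0,
                  max_depth: int = 4) -> List[Tuple[int, int]]:
    """Search layer by layer: after a length field is found, descend into the
    bytes it frames and repeat. Offsets are absolute within the message."""
    if depth >= max_depth or not msgs or min(len(m) for m in msgs) == 0:
        return []
    hit = find_length_field(msgs)
    if hit is None:
        return []
    offset, width = hit
    inner = [m[offset + width:] for m in msgs]  # sub-frame framed by the field
    return [(base + offset, width)] + search_layers(
        inner, base + offset + width, depth + 1, max_depth
    )


def make_message(payload: bytes) -> bytes:
    """Hypothetical two-layer format: type(1) | len(2) | subtype(1) | len(1) | payload."""
    inner = b"\x01" + bytes([len(payload)]) + payload
    return b"\x16" + len(inner).to_bytes(2, "big") + inner


messages = [make_message(p) for p in (b"abc", b"hello!", b"z" * 10)]
fields = search_layers(messages)
print(fields)  # [(1, 2), (4, 1)]
```

On the hypothetical samples, the outer 2-byte length is found at offset 1 and the nested 1-byte length at offset 4. Real traffic would need more care, for example handling length fields that include the header or little-endian encodings.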
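
Bayesian classification of field semantics can likewise be sketched as a small categorical naive Bayes classifier with Laplace smoothing. The abstract does not specify the field properties or the semantic labels used in the thesis, so the four features (field width, whether the value varies across messages, whether it increases monotonically, whether it equals the remaining length), the labels, and the training rows below are illustrative assumptions only.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical naive Bayes with Laplace smoothing (illustrative sketch)."""

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (label, feature index) -> value counts
        self.values = defaultdict(set)            # feature index -> values seen in training
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[(label, i)][value] += 1
                self.values[i].add(value)
        return self

    def predict(self, row):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # log prior plus the sum of smoothed log likelihoods per feature
            score = math.log(n / total)
            for i, value in enumerate(row):
                count = self.value_counts[(label, i)][value]
                score += math.log(
                    (count + self.alpha) / (n + self.alpha * len(self.values[i]))
                )
            if score > best_score:
                best, best_score = label, score
        return best


# Hypothetical training set: (width in bytes, varies across messages,
# increases monotonically, equals remaining length) -> semantic label.
rows = [
    (2, True, False, True),
    (1, True, False, True),
    (4, True, True, False),
    (2, True, True, False),
    (1, False, False, False),
    (2, False, False, False),
]
labels = ["length", "length", "sequence", "sequence", "type", "type"]

model = NaiveBayes().fit(rows, labels)
print(model.predict((2, True, False, True)))   # length
print(model.predict((4, True, True, False)))   # sequence
```

Newly confirmed field labels could be appended to `rows` and `labels` and the model refitted, which loosely mirrors the idea of a semantic database that extends itself over time.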
Keywords/Search Tags: Binary Protocol, Protocol Reverse Engineering, Feature Fields Extraction, Semantic Inference