| Control and status information in an Industrial Control System (ICS) is transmitted by ICS protocols. Many of these protocols are private protocols defined by manufacturers, and they are increasingly widely used. However, the specifications of private ICS protocols are rarely disclosed, and their designs give little consideration to security, which leads to hidden security problems. Studying the security of private ICS protocols requires an understanding of their specifications, and protocol reverse engineering is an effective way to recover those specifications. The overall goal of this thesis is to extract the protocol format of private ICS protocols. The thesis studies three methods for ICS protocols (network traffic clustering, field format extraction, and semantic information extraction) and combines them to reverse-analyze the network traffic of a private ICS protocol and finally extract its protocol format. Considering the differences between ICS protocols and traditional network protocols, and the problems that traditional traffic-based protocol reverse analysis techniques encounter when applied to ICS protocols, the thesis proposes a research scheme for the key technologies of private ICS protocol reverse engineering and implements a corresponding system. The main work and innovations of this thesis comprise the following four parts: 1. To classify ICS protocol message types, the thesis proposes a K-means clustering algorithm with multi-index evaluation. It integrates several internal clustering evaluation indexes, namely the Dunn validity index (DVI), Davies-Bouldin index (DBI), silhouette coefficient (SC), Calinski-Harabasz index (CH), and sum of squared errors (SSE), to select the optimal number of clusters k. Clustering the network traffic of a single private ICS protocol yields sets of messages grouped by type, and thus the message types of that protocol. Reverse analysis can then be carried out on messages of the same type, which removes the effect that mixing different message types has on sequence alignment and improves the efficiency and accuracy of the alignment. 2. To extract the ICS protocol format, the thesis proposes a field format extraction algorithm based on multiple sequence alignment. For the network traffic set of each message type, the Smith-Waterman algorithm is used to compute the similarity matrix, the UPGMA algorithm is used to build the guide tree, and the Needleman-Wunsch algorithm is used for global sequence alignment, so as to extract the protocol format of each message type. 3. To extract the semantic information of ICS protocol fields, the thesis summarizes the semantic information commonly found in the fields of public ICS protocols and designs a semantic information extraction method that uses the recovered field format to extract the corresponding fine-grained semantic information. 4. The thesis designs and implements a reverse analysis system for private ICS protocols and uses it to reverse-analyze the IEC104, Modbus-TCP, and TPKT protocols and extract their protocol formats. A comparison with other research schemes verifies the effectiveness of the proposed method. |
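
The global alignment step named in part 2 can be illustrated with a minimal sketch. It covers only pairwise Needleman-Wunsch alignment of two messages of the same type, not the full pipeline described above (the Smith-Waterman similarity matrix and UPGMA guide tree are omitted). The scoring values, the function names `needleman_wunsch` and `field_boundaries`, and the sample Modbus-TCP-style byte strings are assumptions made for illustration and are not taken from the thesis.

```python
from typing import List, Optional, Tuple

Aligned = List[Optional[int]]  # byte values, with None marking a gap


def needleman_wunsch(a: bytes, b: bytes, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> Tuple[Aligned, Aligned, int]:
    """Globally align two byte strings; scoring values are illustrative assumptions."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i: int, j: int) -> int:
        # Substitution score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a: Aligned = []
    out_b: Aligned = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append(None); i -= 1
        else:
            out_a.append(None); out_b.append(b[j - 1]); j -= 1
    out_a.reverse()
    out_b.reverse()
    return out_a, out_b, score[n][m]


def field_boundaries(al_a: Aligned, al_b: Aligned) -> List[Tuple[str, int]]:
    """Group aligned columns into runs: 'const' (equal bytes) or 'var' (differing or gap)."""
    runs: List[List] = []
    for x, y in zip(al_a, al_b):
        kind = "const" if x is not None and x == y else "var"
        if runs and runs[-1][0] == kind:
            runs[-1][1] += 1
        else:
            runs.append([kind, 1])
    return [(kind, length) for kind, length in runs]


if __name__ == "__main__":
    # Two hypothetical Modbus-TCP-style requests that differ in two bytes.
    msg1 = bytes.fromhex("000100000006010300000002")
    msg2 = bytes.fromhex("000200000006010300100002")
    al1, al2, s = needleman_wunsch(msg1, msg2)
    print(s, field_boundaries(al1, al2))
```

Runs of `const` columns suggest fixed fields such as protocol identifiers or function codes, while `var` runs suggest fields that change between messages. Aligning many messages, as a multiple sequence alignment does, would give more reliable boundaries than a single pair.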