| Fine-grained, application-layer network protocol identification is widely used in network security applications such as internet behavior management, network traffic analysis and control, and next-generation firewalls. As the number of network applications grows, however, the size of the protocol feature set also increases dramatically. How to perform fine-grained identification of a large number of application-layer protocols in high-speed networks while maintaining identification accuracy is therefore a main research direction in the field of protocol identification. Constructing a regular-expression protocol rule library suffers from state explosion as the number of protocol features grows. To address this problem, this paper studies the matching principles of regular-expression protocol identification algorithms and the causes of the state explosion, extracts and analyzes application-layer protocol features in high-speed networks, and designs, implements, and verifies a new protocol identification algorithm based on an improved AC (Aho-Corasick) algorithm. The research content and contributions of this paper are as follows. Extraction, analysis, and summary of the rules and features of 100 typical application-layer protocols. By capturing and analyzing about 100 network application behaviors and comparing the packets captured in different situations, protocol features in regular-expression form were derived; by further summarizing these features and comparing them with general regular expressions, the distinctive properties of regular protocol features are illustrated. Design of a matching algorithm for regular protocol features based on an improved AC algorithm. 
Based on an analysis of the theory of the AC algorithm, an overall algorithm structure is proposed that performs regular protocol identification using a twice-improved AC algorithm, and the improved algorithms for constructing protocol rules and for matching protocols are designed in detail. Implementation and verification of the improved algorithm in Java. The algorithm was implemented in Java and tested for effectiveness and correctness, and its time and space complexity were analyzed mathematically. Protocol rule libraries built from 100 protocol features with the improved algorithm were then compared with those built with a D2FA-based protocol identification algorithm; the comparison showed that the improved algorithm effectively solves the state explosion problem that arises when constructing a rule library from a large number of regular protocol features. By taking the specific application scenario into account, this paper puts forward a new approach to regular expression matching. The improved string matching algorithm can be used to match regular protocol features, which effectively solves the problem that a protocol rule library could not be constructed from a large number of regular protocol features. It also provides a new method for analyzing similar problems. |
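
For orientation, the classic AC (Aho-Corasick) automaton on which the improved algorithm builds can be sketched as follows. This is a minimal sketch of the textbook algorithm for literal string patterns only; it is not the twice-improved variant described in the abstract, whose details are not given here. The class name `AcSketch`, the method names, and the `pattern@offset` result format are assumptions made purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical illustration of the textbook Aho-Corasick automaton (not the paper's improved algorithm).
class AcSketch {
    // One trie node: outgoing edges, failure link, and ids of patterns that end here.
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        Node fail;
        final List<Integer> out = new ArrayList<>();
    }

    private final Node root = new Node();
    private final List<String> patterns = new ArrayList<>();

    // Insert a literal pattern into the trie; call build() once all patterns are added.
    void add(String pattern) {
        Node cur = root;
        for (char c : pattern.toCharArray()) {
            cur = cur.next.computeIfAbsent(c, k -> new Node());
        }
        cur.out.add(patterns.size());
        patterns.add(pattern);
    }

    // Compute failure links breadth-first, so shallower nodes are finished before deeper ones.
    void build() {
        Queue<Node> queue = new ArrayDeque<>();
        root.fail = root;
        for (Node child : root.next.values()) {
            child.fail = root;
            queue.add(child);
        }
        while (!queue.isEmpty()) {
            Node node = queue.remove();
            for (Map.Entry<Character, Node> e : node.next.entrySet()) {
                char c = e.getKey();
                Node child = e.getValue();
                // Follow the parent's failure chain until a node with an edge on c is found.
                Node f = node.fail;
                while (f != root && !f.next.containsKey(c)) {
                    f = f.fail;
                }
                Node target = f.next.get(c);
                child.fail = (target != null) ? target : root;
                // Inherit patterns that end at the failure target (proper suffixes of this node).
                child.out.addAll(child.fail.out);
                queue.add(child);
            }
        }
    }

    // Scan the text once and report every match as "pattern@startOffset".
    List<String> search(String text) {
        List<String> hits = new ArrayList<>();
        Node cur = root;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            while (cur != root && !cur.next.containsKey(c)) {
                cur = cur.fail;
            }
            Node nxt = cur.next.get(c);
            cur = (nxt != null) ? nxt : root;
            for (int id : cur.out) {
                String p = patterns.get(id);
                hits.add(p + "@" + (i - p.length() + 1));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        AcSketch ac = new AcSketch();
        for (String p : new String[] {"he", "she", "his", "hers"}) {
            ac.add(p);
        }
        ac.build();
        System.out.println(ac.search("ushers")); // prints [she@1, he@2, hers@2]
    }
}
```

Because the trie grows only with the total length of the literal patterns, an automaton of this kind does not suffer the state explosion that can occur when many regular expressions are compiled into a single DFA. That is presumably the property the improved algorithm exploits, though how it handles the regular-expression parts of protocol features is not described in this abstract.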