Font Size: a A A

The Research Of Multi-Pattern Matching Algorithms Based On Regular Expressions

Posted on:2013-09-14Degree:MasterType:Thesis
Country:ChinaCandidate:Z Z YinFull Text:PDF
GTID:2248330371462025Subject:Circuits and Systems
Abstract/Summary:PDF Full Text Request
With the popularity and development of computer and Internet technology, network plays anincreasingly important role in our daily life, but the following problems of network security areoutstanding day by day too. As an important protection measure to guarantee network security,intrusion detection system has been largely popularized and applied. As one of the key technologiesof intrusion detection system, the improvement of the efficiency of pattern matching algorithm isessential to increase the testing ability of the system because that the performance of patternmatching relates to the efficiency of intrusion detection system.In this paper, the intrusion detection system is introduced and the application of patternmatching algorithms in the system is analyzed. Meanwhile, AC algorithm, AC_BM algorithm andWM algorithm are also discussed in detail. However, along with the development of networktechnology and the increment of complexity of rule sets, the traditional string matching engines aregradually replaced by more advanced regular expression engines.The matching engines of regular expressions are mainly based on NFA and DFA. Thematching speed of NFA engine is slow, but needs small storage space. DFA based matching enginehas innate advantage in speed, but consumes too much space, and with the scale of the rule setsincreasing, much greater storage space is demanded. An efficient combination algorithm for DFAsof regular expressions from reducing the state number of the corresponding automaton is proposed,which is called COM_DFA algorithm. Each regular expression is handled separately in theconstruction process of the automaton, which can avoid interaction and overlapping between statesand transitions when combining the automaton, therefore reducing the state number of theautomaton greatly. Meanwhile, combine DFAs byε-transition to make sure that we can obtain anautomaton with minimum state number, and therefore reduce the storage space. Finally, a variablecalled CR (Compressed Rate) is introduced to describe the compressed rate of the combinationalgorithm compared with the traditional algorithm. Experimental results show that the algorithm hasreached good compression effectiveness of the state number of DFAs.At the same time, a regular expression matching algorithm (called D_N algorithm) is proposed,which is based on a DFA-NFA structure finite automaton. In order to solve the contradictionbetween the matching speed and memory requirement existing in NFA and DFA, DFA-NFAstructure is applied to realize the construction of the automaton, handling critical state which maycause space explosion of DFA separately so as to reduce the storage needs. Because that only a fewdata in the network are malicious, while most data are normal, the design that DFA part is before NFA part can realize the function of data filtering, thus speeding up the matching process of thealgorithm. Meanwhile, for the critical state at the boundary of DFA-NFA structure, different priorityis seted for the corresponding transition which labeles the same input character. Therefore, inmatching process, it is easy to determine the next state transferred from current state at the inputcharacter through checking the priority, thus accelerating the matching speed. The test resultsindicate that the memory requirement and matching efficiency of D_N algorithm have been greatlyimproved compared with traditional algorithm.
Keywords/Search Tags:pattern matching, intrusion detection system, regular expression, finite state machine
PDF Full Text Request
Related items