Font Size: a A A

Research On Optimization Of DFA In Regular Expression Matching

Posted on:2009-12-06Degree:MasterType:Thesis
Country:ChinaCandidate:Y K QinFull Text:PDF
GTID:2178360272491874Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
With rapid development of Internet, traditional stateful firewall could not provide enough protection against application-level attacks. More work need to be done on application layer rather than network layer, and finally, DPI (Deep Packet Inspection) techonology appears. DPI examines not only the header but also the contents of packets, so it could find the signatures hidden in the payload.As DPI develops, traditional exact string matching is replaced by regular expression matching. Regular Expression, which is more powerful and flexible than string, could present complex signatures very easily. More and more firewalls and IDS (Instrusion Detection System) start using regular expressions as their patterns.There are two kinds of regular expression matching engine,NFA based and DFA based. Network applications could benefit more from using DFA engine than NFA engine. Although DFA engine has an excellent matching performance, it suffers from two problems. One problem is that it consumes too much memory space; the other is that it has bad performance when there are too many patterns.Based on the research and analysis of current optimization techonology of DFA, this paper addresses EasyDG, a DFA group algorithm, and SmallAlpha, an Alphabet compression algorithm. EasyDG could improve the performance of DFA engine even there are many patterns; SmallAlpha could reduce the space required by DFA engine considerably. Involving the two algorithms, this paper then designs and implements SmartRE, a DFA based regular expression matching engine. SmartRE could archieve faster matching speed and require less memory space than normal DFA engine. Moreover, this paper designs and implements Jupiter system, an application layer protocol analysis system. By using SmartRE as its key module, and using the matching strategy of fast path and slow path, Jupiter is 70 to 80 times faster than L7-filter.
Keywords/Search Tags:lexical analysis, regular expression, optimization of DFA, DFA Group, protocol analysis of application layer
PDF Full Text Request
Related items