Font Size: a A A

Research On Storage Optimization Technology Of Regular Expression Matching

Posted on:2016-02-02Degree:MasterType:Thesis
Country:ChinaCandidate:X Y ShaoFull Text:PDF
GTID:2308330482976814Subject:Information and Communication Engineering
Abstract/Summary:PDF Full Text Request
Internet security issue has become a major event related to our national economy and the people’s livelihood. Intrusion Detection System (IDS) based on Deep Packet Inspection (DPI) technology, which is an important means of network protection, has been widely used in recent years. Regular Expression (Regex) is the key algorithm of IDS because of its powerful function and flexible mode. With the network offence and defence improving, attacks mode complicating, network service and applications rapidly changing, Regular Expression Matching (REM) is no longer suitable. Designing high speed and low storage REM algorithm become more and more important.Regular Expression Matching algorithm based on DFA has advantage of it linear matching speed, but state and storage space explosion may occur when some type of regular expression translate together. To solve this problem, this paper analyzes and improves the current algorithm based on DFA construct processing, and design high speed and low storage REM algorithm by optimizing storage usage. The main work of this paper is as follows:Based on the symmetry of multi-dimensional structure, The MFA algorithm can update the current state with the Axes and position information real time. As a result, it decreases significantly the storage spaces. Because of MFA’s low signature fraction of coverage, we design the eXtend Multi-dimensional Finite Automata (XMFA) algorithm combined with the character of Kleene Closure in IDS. We make structural readjustment for the structure which is not multi-dimensional symmetrical, and we add some auxiliary variable. As a result, we can construct XMFA with more type of signature. We design state indicator to decrease the update time.Theoretical analysis and experiment results show that XMFA has comparative matching time comepared with DFA, and it has better fraction of coverage and linear storage space.As the group number growing, the classical grouping algorithm solves the DFA state explosion with a big decrease on time efficiency. Based on Regex input drive theory, this paper presents a grouping algorithm, Drive Character Finite Automata (DCFA). DCFA divides regular expression sets based on signature type and constructs matching engines in each gather. These are no space explosion by the kind of grouping algorithm. Experiment results show that, compared with classical grouping algorithm, the proposed algorithm solves the badly decrease of the matching efficiency. Compared with other classical DFA improved algorithms, the preprocessing time and storage space are reduced by 2-3 orders of magnitude, and without no obvious decrease on matching efficiency.By analyzing the type features of PCRE, we design a REM algorithm based on DCFA and XMFA, which is Templates based Finite Automata (TFA) and could be used in ture IDS. We design regular grouping temples according to IDS regular type features, and based on signature templates we divide regular expression sets to some sub-sets. Every sub-set constructs a high speed and low storage matching engine. Theoretical analysis and simulation results show that, comparing with Hybrid-FA and D2FA algorithms, the preprocessing time of TFA is reduced by several orders of magnitude, and the matching time of TFA has same level orders of magnitude.
Keywords/Search Tags:Intrusion Detection System, Regular Expression Matching, Deterministic Finite Automata, eXtend Multi-dimensional Finite Automata, Grouping Algorithm, Signature Driven Character, Drive Character Finite Automata, Templates Based Finite Automata
PDF Full Text Request
Related items