Research On Regular Expression Matching Of Network Data Flow

Posted on:2016-12-15

Degree:Master

Type:Thesis

Country:China

Candidate:R Yang

Full Text:PDF

GTID:2348330542975821

Subject:Software engineering

Abstract/Summary:


A regular expression is a pattern string made up of wildcard characters and ordinary characters. It is highly flexible and expressive, and its rich semantics allow it to describe a wide variety of features effectively. This ability gives regular expression matching a core role in web content filtering and analysis systems and in network intrusion detection systems. With the rapid development of the Internet, especially the rise of the mobile Internet and the explosion of new network applications, network information is growing exponentially. Both the amount of data to be inspected and the number of detection rules are growing rapidly, which poses a major challenge to the performance of regular expression matching. Current research on regular expression matching concentrates on two aspects: matching efficiency, and the storage space required once the expressions are converted into an automaton. This thesis focuses on matching efficiency, and its main work consists of two parts.

First, it studies matching with the non-deterministic finite automaton (NFA), which is space-efficient but slow to match. Building on the Glushkov NFA construction, it proposes a regular expression matching algorithm based on active-state filtering: more than one character of the input is loaded into the automaton at a time, which reduces the size of the active set and the number of verifications, and so improves the efficiency of Glushkov NFA-based matching.

Second, it studies the distribution of regular expressions and the structure of real target data on the Internet, and proposes a pre-screening regular expression matching algorithm based on suffix-search string matching. In real Internet traffic, target data makes up only a small proportion and much of the rest is irrelevant, so the original matching algorithm, which feeds all data through the automaton in turn, is inefficient. The proposed algorithm extracts fixed strings from the regular expressions, uses the skip property of suffix-search string matching to locate suspicious data in the input, and then verifies it. Because the search step filters out a large amount of irrelevant data, only the small share of suspicious data is loaded into the automaton for verification, and the efficiency of the algorithm improves greatly.

In summary, this thesis reviews current research on regular expression matching algorithms, concentrates on matching efficiency, proposes optimizations, and verifies their feasibility through experiments. It closes with an outlook on future trends in regular expression matching.
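The pre-screening idea in the second part can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and it rests on several assumptions: the abstract does not name the exact suffix-search algorithm, so Boyer-Moore-Horspool is used here as a representative one; the fixed string is supplied by hand rather than extracted from the expression; and Python's `re` module stands in for the verifying automaton. The names `horspool_find` and `scan_payloads` are hypothetical.

```python
import re


def horspool_find(text: str, needle: str) -> int:
    """Return the index of the first occurrence of needle in text, or -1.

    Boyer-Moore-Horspool: each window is compared from its last character
    backwards, and the text character under the window's last position
    decides how far the window can skip.
    """
    m, n = len(needle), len(text)
    if m == 0:
        return 0
    # Skip distance for each needle character except the final one.
    shift = {ch: m - 1 - i for i, ch in enumerate(needle[:-1])}
    pos = 0
    while pos <= n - m:
        j = m - 1
        while j >= 0 and text[pos + j] == needle[j]:
            j -= 1
        if j < 0:
            return pos
        # Characters absent from the needle allow a full-length skip.
        pos += shift.get(text[pos + m - 1], m)
    return -1


def scan_payloads(pattern: str, literal: str, payloads: list[str]):
    """Pre-screen payloads with a literal search, then verify with the regex.

    Assumption: every match of `pattern` must contain `literal`, so a payload
    without the literal can be discarded without running the regex at all.
    Returns (indices of matching payloads, number of payloads verified).
    """
    regex = re.compile(pattern)
    matched, verified = [], 0
    for index, payload in enumerate(payloads):
        if horspool_find(payload, literal) < 0:
            continue  # irrelevant data: filtered out by the cheap search
        verified += 1  # suspicious data: handed to the full matcher
        if regex.search(payload):
            matched.append(index)
    return matched, verified


if __name__ == "__main__":
    # Illustrative rule and traffic, invented for this example.
    rule = r"GET /admin\.php\?id=\d+"
    payloads = [
        "GET /index.html HTTP/1.1",
        "GET /admin.php?id=42 HTTP/1.1",
        "POST /admin.php HTTP/1.1",
        "GET /images/logo.png HTTP/1.1",
    ]
    print(scan_payloads(rule, "admin.php", payloads))
```

In this example only two of the four payloads contain the literal and reach the regular expression engine, which is the effect the abstract attributes to pre-screening. The gain depends on how rare the literal is in real traffic and on choosing a literal that every match is guaranteed to contain.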

Keywords/Search Tags:

regular expressions, automaton, Glushkov NFA, stateful filtering, pre-screening


Related items

1	Heterogeneous State Transition Method Based On Improved Glushkov Automata
2	Research On Technique Of Application-Layer Protocol Identification Based On Regular Expressions
3	Deterministic automata for streamed XML validation
4	Research On Regular Expression Matching Algorithm
5	Study On Automaton-Based Regular Expression Matching Algorithms
6	Design And Implementation Of Network Traffic Monitoring System
7	The Design And Implementation Of Regular Expression Engines Based On Deterministic Finite Automata
8	Research And Implementation Of Protocol Identification Based On Regular Expression
9	The Properties And Regular Expressions Of Two Types Of Fuzzy Finite Tree Automata
10	The Design And Implementation Of The Netfilter-Based Content Filtering System