
Research On Regular Expression Matching Of Network Data Flow

Posted on: 2016-12-15  Degree: Master  Type: Thesis
Country: China  Candidate: R Yang  Full Text: PDF
GTID: 2348330542975821  Subject: Software engineering
Abstract/Summary:
A regular expression is a pattern string composed of ordinary characters and wildcard characters. Its flexible and rich expressive power lets it describe a wide variety of features effectively, and this ability has given regular expression matching a core position in web content filtering and analysis systems and in network intrusion detection systems. With the rapid development of the Internet, and especially the rise of the mobile Internet and the explosion of new network applications, network information is growing exponentially. Both the volume of data to be inspected and the number of detection rules are increasing rapidly, which poses a huge challenge to the performance of regular expression matching. Current research on regular expression matching focuses mainly on two aspects: matching efficiency, and the storage space required once the expressions are converted into automata.

This thesis studies the matching efficiency of regular expression matching algorithms in depth. The main work consists of two parts.

First, it studies matching algorithms based on nondeterministic finite automata (NFA), which are efficient in space but inefficient in time. Building on the Glushkov NFA construction, it proposes a regular expression matching algorithm based on active-state filtering. By loading more than one input character into the automaton at a time, the algorithm reduces the size of the active set and the number of verifications, which improves the matching efficiency of Glushkov NFA-based regular expression matching.

Second, it studies the structure of regular expressions and the distribution of real target data on the Internet, and proposes a pre-screening regular expression matching algorithm based on a suffix-search string matching algorithm. In real Internet traffic, target data accounts for only a small proportion and most of the data is irrelevant, yet conventional regular expression matching algorithms run all of the data through the matcher in sequence, which lowers matching efficiency. The proposed algorithm extracts fixed strings from the regular expressions, uses the skipping behavior of suffix-search string matching to locate suspicious data in the input, and then verifies it. Because the search for suspicious data filters out a large amount of irrelevant data, only a small percentage of the data is loaded into the automaton for verification, and the efficiency of the algorithm is therefore greatly improved.

In summary, this thesis reviews current research on regular expression matching algorithms, concentrates on their matching efficiency, proposes optimization schemes, and verifies the feasibility of the algorithms through experiments. It closes with an outlook on future trends in regular expression matching algorithms.
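The pre-screening idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the abstract does not name a specific suffix-search string matcher, so Horspool search is used here as a stand-in, the required literal is supplied by hand rather than extracted from the regular expression automatically, and Python's `re` module takes the place of the automaton used for verification. The names `horspool_find` and `PrefilteredRule` are hypothetical.

```python
import re


def horspool_find(text: bytes, pat: bytes) -> int:
    """Return the first index of pat in text, or -1 (Horspool search)."""
    m, n = len(pat), len(text)
    if m == 0:
        return 0
    # For each byte of the pattern except the last one, record the distance
    # from its rightmost occurrence to the end of the pattern.
    shift = {b: m - 1 - i for i, b in enumerate(pat[:-1])}
    pos = 0
    while pos + m <= n:
        # Compare the current window from right to left.
        j = m - 1
        while j >= 0 and text[pos + j] == pat[j]:
            j -= 1
        if j < 0:
            return pos
        # Skip ahead based on the last byte of the window; bytes that do
        # not occur in the pattern allow a jump of the full pattern length.
        pos += shift.get(text[pos + m - 1], m)
    return -1


class PrefilteredRule:
    """A regex paired with a literal that every match must contain.

    Assumption: the caller guarantees that required_literal appears in
    every string the pattern can match; otherwise matches would be missed.
    """

    def __init__(self, pattern: bytes, required_literal: bytes):
        self.regex = re.compile(pattern)
        self.literal = required_literal

    def search(self, payload: bytes):
        # Pre-screening: payloads without the literal are rejected
        # without ever running the (more expensive) regex engine.
        if horspool_find(payload, self.literal) < 0:
            return None
        # Verification: only suspicious payloads reach the full matcher.
        return self.regex.search(payload)


# Illustrative rule and payloads (made up for this example).
rule = PrefilteredRule(rb"cmd\.exe\s+/c\s+\w+", b"cmd.exe")
clean = rule.search(b"GET /index.html HTTP/1.1")
hit = rule.search(b"run cmd.exe /c dir now")
```

In this sketch the filter works at the granularity of a whole payload; a finer-grained variant could verify only a window around each literal hit, at the cost of more bookkeeping.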
Keywords/Search Tags:regular expressions, automaton, Glushkov NFA, stateful filtering, pre-screening