Efficient Representation And Implementation Of Position Automata

Posted on:2019-05-05

Degree:Master

Type:Thesis

Country:China

Candidate:B Y Wang

Full Text:PDF

GTID:2428330548959149

Subject:Computer system architecture

Abstract/Summary:

PDF Full Text Request

The need for regular expression matching has always existed.The general scheme is to construct the corresponding nondeterministic finite automata(NFA).The mainstream nondeterministic finite automata NFA is usually divided into two categories.One is the Thompson architecture,proposed by k.Thompson in the 1960 s.The Thompson architecture has a simple idea,but there are a lot of redundant states,and empty transitions(e-transiton),A typical Thompson architecture includes about m to 2m states(where m is the length of the regular expression).There is also a Glushkov architecture.In contrast,the Glushkov architecture has the advantage that there is no empty transition and the number of states equals the length of the regular expression plus one.There is no empty transition to make Glushkov architecture transfer less than the Thompson architecture,the number of states equal to the length of the regular expression plus one,so that the state of the automaton and the regular expression can easily establish the relationship between the position,Therefore,the Glushkov automaton is also called a position automaton,and bit-parallelism can be used to achieve active state set transfer.In addition,similar to the Glushkov Follow Automaton there is no empty transition,Follow Automaton according to the state of the Follow set element composition of the Glushkov automaton partof the state merger,with fewer state numbers,so in theory have more than the Glushkov automaton Good performance.According to the above properties,these latter two automata are collectively considered as position automata.The main work of this article is divided into two parts:on the one hand improving the bit parallel matching method of Glushkov automaton.Due to the properties of the Glushkov state,this paper divides the state bits of the automaton into several categories for processing,and improves the Navarro-Raffinot method(NR-method)for the bit parallel matching method of general Glushkov automaton.In the NR-method,the update process of status bits has room for improvement.Use the "diffuse-function and extract-function" and "expanded bit-vector" methods to improve the process of updating the set of activation states at each step.Using diffuse and extract functions can complete a word length matching in a constant time,and the bit extension method can also implement bit parallel matching in a constant time for a general position conflict vector.On the other hand,compare the matching efficiency of Glushkov and Follow automata.For some special cases,the Follow Automaton has better performance than the Glushkov Automaton.But from the overall point of view,when the number of regular expression substrings is small,the upper bound of the state number of the states of the Follow Automaton is smaller than that of the Glushkov automaton,which means that the average performance of the Follow Automaton is better than that of the Glushkov automata at this time.With the number of regular expression substrings gradually increases,the upper bound of the state number means of the Follow Automaton is basically the same as the upper bound of the mean value of the Glushkov automata,which means the average performance of the two types of automata is the same at this time.

Keywords/Search Tags:

Bit parallel, Regular expression matching, Position automata

PDF Full Text Request

Related items

1	The Research On Time And Space Efficient Regular Expression Matching Algorithm
2	Research On Multi-dimensional Regular Expression Matching Algorithm For Network Security
3	Research On Storage Optimization Technology Of Regular Expression Matching
4	The Design And Implementation Of Regular Expression Engines Based On Deterministic Finite Automata
5	The Research On Regular Expression Matching Algorithms For Deep Packet Inspection
6	Regular Expression Matching With Multi-step Speculation
7	Research On Storage Oriented Regular Expression Matching Algorithms
8	An Improved Algorithm Of The Regular Expression Based On DPI System
9	Research On Parallelizing Technique Of Regular Expression Matching Algorithms
10	Research On Automatic Generation Methods Of Regular Expression Matching Engines On FPGA