Font Size: a A A

Research On High-throughput String Patterns Matching Algorithm

Posted on:2013-12-25Degree:DoctorType:Dissertation
Country:ChinaCandidate:K P JiangFull Text:PDF
GTID:1228330395980621Subject:Communication and Information System
Abstract/Summary:PDF Full Text Request
Pattern matching not only is the basis of computing theory, but also has been widely appliedin computer system and network processing. With the information explosion and the rapidincrease in network bandwidth, the requirement of both information query and network securityare bound of wire-speed processing of network data. Detailed researches are done about patternmatching in this thesis. The main works and innovations are as the following:1. A series of mathematical formula was established to solve the problem of mathematicalproof for the NFA algorithm studying. On the one hand, the mathematical formula proved thecorrectness of our algorithm, and the equivalence between the proposed algorithm and NFA. Onthe other hand, so far, the proofs of NFA in published literature were illustrative, and had notestablished the appropriate mathematical formula. This paper presented mathematical definitions,theorems which establish the preliminary basis for mathematical proof of the NFA. Finally, dueto the establishment of mathematical formula, the article laid the foundation for theestablishment and derivation of the new NFA.2. The paper proposed a novel static pattern matching algorithm. The algorithm designedtwo operations which are decoding matrix decoding the text and the extraction of patterncomparison results. On the one hand, the operations reduced the resources occupied by theregular expression to achieve the comparison operation between a large number of patternsymbols and text symbols. On the other hand, the operations accelerated the computational speedof the comparison between a large number of pattern symbols and text symbols. The operationsresolved the problem of high resource consumption and running slow of a large number ofcomparison operations.3. The paper proposed the "vector-and" algorithm. The algorithm transformed theconcatenation of regular expressions into a simple bitwise and operation. The most commonlyoperation of regular expressions is the concatenation. Each regular expression includes a lot ofconcatenation operations, and therefore to accelerate the computing speed of the concatenationoperation plays an important role to improve the regular expression operation speed. Theproposed algorithm expression transformed all of the operations of regular expression intobitwise operations, thus greatly improved the overall computing speed.4. The proposed algorithm supports not only the common operations of regular expression,but also the complementation operation. Complementation operation has been the puzzle ofkinds of regular expression algorithms. The article transformed the complementation into thecommon concatenation by the appropriate structure, therefore, resolved the puzzle of the complementation operation in regular expression implementation.5. The paper proposed extended decoding matrix. The scheme completely resolved theunion, intersection, and complementation operations of single symbol of regular expression.Established a set of mathematical formulas, and proposed novel algorithm, and verify thenovel algorithm on the FPGA, the article showed that: The proposed algorithm made full use ofFPGA features, and greatly increased the regular expression throughput. So far, the maximumthroughput in the open literature is less than40Gbps, and the maximum throughput ofcommercial implementation does not exceed20Gbps. The final chapter in this article showed themaximum throughput of the proposed algorithm. The experiment showed that: the throughput ofthe proposed algorithm can be512Gbps on the existing FPGA. The throughput has been far morethan the maximum bandwidth of modern networks. In addition, the proposed algorithm usedonly the FPGA logic resources, do not use other resources, and therefore was not affected by theconnection and the transmission between the FPGA and the external devices, also increased thescalability of the system, and can increase the number of the pattern symbols in the entire systemby using more resources FPGA and increasing the FPGA number.
Keywords/Search Tags:Pattern matching, High throughput, Static pattern matching, Regular expression, Nondeterministic Finite Automata, Field Programmable Gate Array
PDF Full Text Request
Related items