Font Size: a A A

Research On Multi-mode String Matching Algorithm In Real-time Information Processing

Posted on:2018-08-13Degree:MasterType:Thesis
Country:ChinaCandidate:S P ShiFull Text:PDF
GTID:2358330518960437Subject:Computer technology
Abstract/Summary:PDF Full Text Request
The string matching is one of fundamental problems in the computer science and engineer,and has been used in many areas.In fact,the string matching has been the performance bottleneck in many important filed,such as Computational Biology,Security Network,Information Filtering and so on.In the current,the flow of Internet has exceeded the scope of the agreement of Moore's Law with the rapid development of information technology.Thus,it is not realistic that improve the performance of above filed only by the way of hardware improvement.So we have to research the string matching algorithms that own the newer method,faster speed and better performance to increase the speed of matching in the real-time information processing.The article will research from the basic point,namely,the single pattern matching and the multi pattern matching.The single pattern string matching is the basic topic in the pattern matching and has a long history.The famous single pattern string matching algorithms include KMP algorithm?Knuth-Morris-Pratt algorithm?and BM algorithm?Boyer-Moore algorithm?etc.The worst time complexity of both is O?n?and O?mn?respectively.The multi pattern string matching not only increased the amount of pattern,but also increased the scope of application.Especially,the real-time information processing paid more attention to multi pattern string matching.AC algorithm is the most classic multi pattern string matching algorithm.The main work of this paper has been summarized as follows:First,the topics were summarized and analyzed that about the string matching algorithms researching.Advantages and disadvantages of traditional pattern string matching algorithm were summed up in the paper.Including exact string matching,regular string matching,approximate sting matching etc.The space-time complexity is the key performance parameter in the string matching,and has been analyzed.Second,the method of comparison with unit as integer was proposed,and several methods were combined to improve the performance of suffix string matching algorithms.The first is multi-windows mechanism and show the schematic diagram of the double windows and multi-windows for readers.And analyze the reason of increasing performance.The second is unaligned integer read mechanism.The paper analyzed the principle of the method and gave the operator method.The third is q-grams mechanism.And the advantages of the method were summarized.Some of the improvement algorithms have been presented based on above methods.Firstly,the serial of QSMI algorithms and the serial of TBMMI algorithms have been presented.Secondly,the improvement BMHq based on q-grams algorithm was presented.The results show that the best time complexity or space complexity can be reached in the improvement algorithms above.Third,this paper proposed a serial of improved BOM algorithm based on integer comparison.And refers to the construction method of the suffix based FO automata and UnrollingFO automata.The method of q-grams and mechanism of unaligned integer read be applied in the method of improvement.The q-grams method used to increase the distance of jumping,and the unaligned integer read mechanism simplify the manipulation of look-up table.The BOMq algorithm when q=3 was be presented.And a serial of EBOMsb algorithms and a serial of BOMqsb algorithms were presented respectively.The pseudo code of construction automata and BOM3,BOM32sb have been given.And show that the corresponding experiment results.Fourth,the article introduced the classical multi pattern matching AAC algorithm.The AACA?Advanced AC with Additive State?algorithm and AACN?Advanced AC with Negative State?algorithm were presented because of the large consumption of space and complex look-up table operation when mismatch of AAC.Although with the AAC algorithm in constructing the automaton way is the same,however,the AACA algorithm and AACN algorithm respectively to shift and change the corresponding state through to omit the look-up table operation,and finally confirmed by experiment,two kinds of improved multiple pattern matching algorithm can got better performance in the small scale pattern.Finally,the experimental results and experimental results are given to demonstrate the performance of the improved algorithm.
Keywords/Search Tags:Pattern Matching, Algorithm Design, Multi-Windows, q-grams, Integer Comparison, AAC Algorithm
PDF Full Text Request
Related items