Font Size: a A A

The Design And Implement Of Two Efficient Multiple Pattern Matching Algorithms

Posted on:2016-03-19Degree:MasterType:Thesis
Country:ChinaCandidate:T LinFull Text:PDF
GTID:2348330488974409Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Pattern matching algorithms are one of the core technologies in the field of computer science, and have been widely used in network security, searching engine and biological computing. The performance of pattern matching algorithms can influence the whole performance of the network security system directly. With the rapid development of computer network technology, the data generated by the Internet grow explosively, which leads many new challenges to the research of pattern matching algorithms. The large-scale pattern sets greatly degrade the performance of pattern matching algorithms and become the bottleneck of the network security system. So most of the classical pattern matching algorithms can not be directly applied to the large scale pattern sets, and the research of pattern matching algorithms for large scale pattern sets is becoming more and more important.This paper proposed two efficient pattern matching algorithms :ELSM algorithm and SBT algorithm to deal with the large scale pattern sets.The main contributions are as follows:1. An efficient large-scale multi-pattern matching algorithm called ELSM algorithm is proposed. Firstly, this thesis discusses the properties of MASM algorithm which has low space complexity and is suitable for large scale pattern sets, as well as the characteristic of the pattern compression algorithm--Leaf-Attaching algorithm which is used in the preprocessing stage in MASM algorithm. Though the space complexity of MASM algorithm is low, it's time complexity is high and is proportional to the number of pattern sets. To reduce the time complexity, this thesis proposes the ELSM algorithm which is based on the framework of MASM. The ELSM algorithm has both low space complexity and time complexity. It uses the improved Leaf-Attaching algorithm in the preprocessing stage, which greatly reduces the memory consumption when compressing large scale pattern sets. In the matching stage, ELSM algorithm improves the MASM algorithm in three aspects:(1)Using array to implement complete binary search tree,which reduces the amout of memory footprint and speeds up the accessing of tree nodes.(2)Using the AC algorithm as a preliminary algorithm to filter out lots of unmatched positions in the text, which reduces the matching time.(3)Using the hash table to divide the whole binary search tree into smaller ones. This reduces the scale of the binary search tree, therefore the searching time in the tree can be reduced. Experimental results show that ELSM algorithm improves the matching performance of MASM algorithm while retains its advantages. These make ELSM algorithm more suitable for large scale pattern sets.2.An multi-pattern matching algorithm which is based on Burst Tries called SBT algorithm is proposed. Firstly this thesis introduces the Burst Tries data structure and its insertion and searching process which are used in the SBT algorithm. The SBT algorihtm uses two strategies to decrease the space consumption and improve matching efficiency of Burst Tries:(1)SBT algorithm reduces the memory consumption of Burst Tries by using a new technology of space compression.(2) By exploiting the result of the already matched part, the SBT algorithm can skip a lot of positions in the text string, which improves the efficiency greatly. A serie of experiments demonstrate the superiority of SBT over other algorithms.
Keywords/Search Tags:multi-pattern matching, high matching performance, MASM algorithm, ELSM algorithm, SBT algorithm
PDF Full Text Request
Related items