
Algorithm Based On Chinese For Matching Multiple Patterns And Its Application Research

Posted on: 2013-07-04
Degree: Master
Type: Thesis
Country: China
Candidate: G Y Zhang
Full Text: PDF
GTID: 2248330377960806
Subject: Computer software and theory
Abstract/Summary:
Pattern matching is an important research direction in the field of computer applications and is widely used in network security, information retrieval, biomedical computing, and other areas. With the rapid development of computer network technology, the volume of information on networks has exploded, placing new requirements on network security. The performance of the pattern matching algorithm is therefore of great significance for improving network security systems.

This dissertation describes the status and applications of pattern matching technology and introduces several classical pattern matching algorithms: the single-pattern algorithms BF, KMP, BM, and QS, and the multi-pattern algorithms AC, AC_BM, and WM. It analyzes the AC algorithm's inability to skip characters during matching, as well as the problems of Chinese multi-pattern matching.

Because Chinese character encodings are variable-length, this dissertation proposes split matching, in which Chinese characters are processed byte by byte. This method removes the limitations imposed by different encoding formats and greatly reduces the space needed to convert the pattern strings into a pattern tree. To address the shortcoming of AC, we propose an improved, Chinese-oriented algorithm, AC_BJT (AC Based on the Jump Table), which supports jump matching. The algorithm first builds a jump table in the initialization phase; in the matching phase, it consults this table to skip over characters directly without matching them. The proportion of jump values in the table is large, so the efficiency of the algorithm improves significantly.

This dissertation also describes the design of a UTM gateway prototype system and implements the gateway's content filtering module based on the AC_BJT algorithm. Finally, it builds a test environment to stress-test the Web content filtering module of the UTM gateway. The experimental results show that the AC_BJT algorithm has better time performance.
Keywords/Search Tags: Multi-pattern Matching, Content Filtering, Chinese Character, UTM Gateway