Font Size: a A A

Research And Implementation On Efficient String Match For Ten Millions Of Patterns

Posted on:2013-10-12Degree:MasterType:Thesis
Country:ChinaCandidate:X L CaoFull Text:PDF
GTID:2268330392469334Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
String matching has always been a hot and difficult direction in computerscience. In the field of information security, rules is being more large size, theInternet traffic is increasing, so the string matching algorithm is becoming theperformance bottleneck of the network security system.This paper does some research on the exact matching algorithm of threeclassic multi-mode, in-depth summary on the advantages and disadvantages ofthese three algorithms. On the research of the existing rules set, this articlesummarizes the characteristics of thousands of model ensemble. To resolve tenmillion rule matching requirements, it designs ACVL and CDWH algorithm.ACVL algorithm is using AVL trie to optimize the memory usage of the ACalgorithm, and then the AC algorithm can support millions of mode of collectionrequirements, and also gives ACVL algorithm optimization, cut off the exist rules,state compression and path compression are three ways to further compress thememory. Finally, ACVL algorithm use2.5G memory to load ten million rules.Using the classification ideas, CDWH algorithm divides the rules into the longstring set and the short set. CDWH algorithm Uses DAT algorithm to match theshort set and WM algorithm to resolve the long set. To optimize the CDWHalgorithm, DAT algorithm compresses the memory and shorts the initializationtime. WM algorithm reduced Hash conflict and speed up the speed of searchingthe fully match.This paper used two ways to experience ACVL algorithm and CDWHalgorithms. Off-line experiment first tests the correctness of the algorithm, andthen tests the algorithm performance in the pattern set of different rules,including the initialization time, memory and scan time in three aspects. Theonline experiment tests the ACVL algorithm’s and CDWH algorithm’s scan speed.In the offline and online experience environment, ACVL algorithm and CDWHalgorithm were compared and found that CDWH algorithm is superior than theACVL algorithm, it can improve the system a21.4%performance.
Keywords/Search Tags:string matching, ten million rules, AC algorithm, Wu-Manberalgorithm, DAT algorithm, memory compression
PDF Full Text Request
Related items