Font Size: a A A

Research Of High-Performance String Matching Technology For Large Scale String Set

Posted on:2009-09-12Degree:MasterType:Thesis
Country:ChinaCandidate:X LiFull Text:PDF
GTID:2178360245469411Subject:Computer applications
Abstract/Summary:PDF Full Text Request
The rapid developing of computer network and applications bring us a great of benefit, we can get information very quickly and easily. Nowadays, the Internet has become one indispensable part of our daily life. While enjoying the great benefit of internet, we have to endure persecute of intrusion, attack, virus, spam and all type of threat behaviors from the network. The network security is affecting the networks' normally running and continuously develop, so it becomes more and more important.Various techniques were introduced to protect different threat behaviors from network. It included firewall in early days, then IDS/IPS, Virus scan system, spam filter system, and now UTM system appears. The core and key technology of all these devices and systems is string matching.All these systems should configure a large amount of rules according to various threat behaviors which are growing rapidly. The number of rules is increasing from tens of thousands to hundreds of thousands and even more. This makes the large scale string matching technique become increasingly importance. Further more the real time processing in network security system require a high performance large scale string matching algorithm to adapt the increasingly rate of network. The purpose of this paper is to introduce high performance string matching and approximate matching algorithms for large scale string set.In this paper, we are doing a lot of research on classical multiple string matching algorithms, including the exact matching and approximate matching algorithms. First of all, we find out the bottle-neck of these classic algorithms. Then addressed a new large scale string matching algorithm-SRS algorithm (Shift-Relay-Shift algorithm), this algorithm has a good performance, even the number of the string set is up to 100 thousand, the matching speed still have 10 times higher than that of the classic algorithm. This paper also addressed a new high speed approximate string algorithm for large scale string set—SrsPex algorithm, the searching speed of this algorithm is up to 100 MB/s, when the approximate string set is 10 thousands, and the fastest speed is nearly 400 MB/s. Finally, we apply SRS algorithm to virus library of an open source anti-virus software named ClamAV.
Keywords/Search Tags:string matching, large scale string matching, large scale approximate string matching, SRS algorithm, SrsPex algorithm
PDF Full Text Request
Related items