Font Size: a A A

Research Of String Matching Algorithm Optimization

Posted on:2007-09-23Degree:MasterType:Thesis
Country:ChinaCandidate:Y B LiuFull Text:PDF
GTID:2178360185954133Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
String matching is a classic research area in computer science and plays an important role in different kinds of network security systems. Nowadays the classic string matching algorithms have been faced with great challenges because of the rapidly growing of information and the emergence of new application requirements in engineering.This thesis was dedicated to the research of large scale exact string matching and dynamic string matching. With respect to large scale exact string matching, two optimization strategies were proposed: compressing the memory space of automata to achieve a better cache performance and rearranging the states of automata to improve its locality. With respect to dynamic string matching, classic algorithms are extended to provide the ability of inserting and deleting a keyword.In this thesis, the following results have been achieved:1. Performance analysis of string matching algorithm: Through experimental and theoretical analysis, the linear correlation between the execution time and level 2 cache misses of algorithm was illustrated. This thesis analyzed the time complexity of the classic algorithm (eg. AC.Wu-Manber, SBOM) and proposed two optimization strategies: compressing the string matching automata's memory space and improving its locality.2. A string matching algorithm based on memory space compression: Based on the idea of compressing automata's memory space, this thesis proposed a strategy to compress SBOM's memory space, and proved that the improved algorithm is O(mr log|∑| mr) in space complexity. Experiments showed that this improvement not only effectively compressed the automata's memory space, but also greatly improved the searching speed.3. Memory layout optimization of string matching automata: Based on the idea of improving automata's locality, firstly, this thesis formulated the memory layout problem into a mathematical optimization problem. And then, linear programming method was employed to solve it. Memory layout optimization is instructive both in practice and theory.4. Extension of classic algorithms: This thesis extended the classic algorithms, provide the ability of inserting and deleting a keyword without reconstructing the whole automata. The inserting and deleting operation is approximately linear in time complexity.5. Universal string matching library: Based on the research of pioneers and the work of this thesis, a universal string matching library was build. The library is thread-safety and provide continuous stream searching. It can be used in different projects for further development.
Keywords/Search Tags:Pattern Matching, Exact Pattern Matching, Dynamic Pattern Matching, Space Compression, Memory Layout Optimization
PDF Full Text Request
Related items