Font Size: a A A

The Research And Implementation Of Bit-parallel Based Dictionary Searching

Posted on:2017-02-26Degree:MasterType:Thesis
Country:ChinaCandidate:C D YeFull Text:PDF
GTID:2308330482489990Subject:Computer system architecture
Abstract/Summary:PDF Full Text Request
String matching also known as pattern matching, and it is one of the important research directions in computer science. String matching has been widely used in text retrieval, music retrieval, data compression and DNA sequence detection. With the rapid development of computer network, it has produced a series of data search aspects of application requirements. Among some of the problems in network security, string matching plays a key role. This topic based on the bit-parallel technology to research the multi-mode matching Aho-Corasick automaton(AC automaton).Dictionary tree is a kind of data structure which is widely used in pattern matching. Bit-parallel technology is a powerful new technology in the design and implementation of the algorithm in recent years. Bit-parallel technique is combined with the classical string matching algorithm and the data structure, which gives a new impetus to the research of pattern matching. Recently, Cantone and Faro proposed the idea of multi pattern string matching algorithm based on bit-parallel technology is a kind of innovation. The research on Cantone and Faro proposed using bit-parallel technology to simulate the non deterministic finite state AC automaton, this kind of automaton is based on the dictionary tree(trie) to build, so that to achieve high performance pattern matching. However, their work focuses on the theoretical analysis, and the key research content of this topic is based on their algorithm, and design and implementation of high performance algorithms, and the performance of the algorithm is tested by experiment. The performance of bit-parallel algorithm is studied by experiment, and this will provide a valuable reference for the follow-up study. The main contents of this paper include:First of all, the research background and significance of this topic is briefly discussed, and a brief review of domestic and foreign research scholars in the field of string matching to do research work, and combined with the problems encountered in algorithm implementation, and points out the deficiency of the current pattern matching algorithm. This paper summarizes the existing various excellent and classic and has a good performance of string matching algorithm in practice, and the classification of the system is briefly carried out. A review of important bit-parallel ideas in the field of computer is described, and the study used in this study is a non deterministic finite automata. In this paper, we will introduce how to use bit-parallel technology to simulate the non deterministic finite automata. Of course, it will give some of the concepts and definitions that are unified before the work is carried out.Secondly, the structure of tire tree is introduced, while based on previous theoretical bedding and related preparations, the theory of pattern matching algorithm combining non deterministic AC automata and bit-parallel technology is described. At last, bit-parallel algorithm for multi pattern matching problem is introduced. Subsequently, the paper describes how to fast index the "1" in a word.Finally, the theory of the combination of bit-parallel technology and AC automata is realized by using C language, and through experiments with other excellent string matching algorithm for a performance comparison. Experiments show that the algorithm has certain advantages in the experiment by using the bit-parallel algorithm.
Keywords/Search Tags:String matching, Bit-parallel, Trie, Suffix NFA, Aho-Corasick automaton
PDF Full Text Request
Related items