Font Size: a A A

Research On Wu-Manber Mutiple String Matching Algorithm

Posted on:2009-12-30Degree:MasterType:Thesis
Country:ChinaCandidate:D M MoFull Text:PDF
GTID:2178360248954302Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
String matching is an old classic and being widely studied research area in computer science. It's one of the crucial techniques in inform retrieve area and computer biology area. Nowadays the requirements of pattern algorithms change with each passing day at network era. The needs of real time process are growing. All of these mentioned above bring up new challenge to the original algorithms. It's necessary to make some improvements to the existing algorithms.This thesis concentrates on the efficient multiple patterns algorithm——Wu-Manber algorithm. The thesis produce an algorithm based onε-free subsuffix pattern; and then combined it with the existing technique to produce a much more efficient one; and finally produce an improved algorithm in large scale patterns.The main results are listed below:1. An improvement based on the£-free subsuffix of patterns: Realizing there are rooms to improve when some patterns have common subsuffix, the paper produces an improved Wu-Manber multiple patterns matching algorithm, based on theε-free subsuffix patterns. The algorithm collects patterns with common£-free subsuffix and then reduces the amount of comparisons during the matching process . The paper's experiments are based on documents from http ://www.sogou.com/lab/. Compare the new algorithm with the origin Wu-Manber algorithm and the algorithm introduced by, the results are that the new algorithm can effectively reduce the amount of comparisons and then work more efficiently. 2. An improvement based on combination techniques: Produce a newly modified Wu-Manber multiple patterns matching algorithm which make an improvement of algorithm produced by 1. When we classify patterns by theε-free subsuffix of the patterns in the next links , we drag those patterns having commonε-free subsuffix to the front of the links. And make full use of merits of algorithm produced by and the technique mentioned above. Tests show that the new algorithm can further reduce the amount of words matching to at least 4.6 %.3. An improvement algorithm in large scale patterns: Produces a modified Wu-Manber multiple patterns matching algorithm, baseed on the idea of the Wu-Manber algorithm. The algorithm replaces the Same_Subsuffix link used in 1 with two links: Left_Subsuffix and Right_Subsuffix. When it come to compare the characters in the Same_Subsuffix link, what we need to do is to compare them in one of the links mentioned above. And then reduce the amount of characters matched. The new algorithm works particularly well when the patterns scale is large.
Keywords/Search Tags:Wu-Manber Algorithm, Multiple Pattern Matching, Pattern Matching, String Matching, Information Retrieval
PDF Full Text Request
Related items