Font Size: a A A

The Research And Analysis Of The Single Pattern Matching Algorithm Based On Character Comparison

Posted on:2017-05-21Degree:MasterType:Thesis
Country:ChinaCandidate:L LiFull Text:PDF
GTID:2348330512962132Subject:Software engineering
Abstract/Summary:PDF Full Text Request
KMP algorithm and BM algorithm are classical single pattern matching algorithms. In KMP, the pointer in the text can only move one character at one time, the moving distance of the pattern pointer j is less than the current value of j at each time, hence, the comparison is not efficient. As for BM algorithm, the current BM algorithm and its variant did not consider the relationship of precursor and successor between good suffix and bad character rules, thus their overall matching efficiency is not high enough. There are many BM variant algorithms at present, QS algorithm and F.1S algorithm are the most well-known improved algorithms of it, however, there is still much room for improvement. To the above three problems, we propose three novel single pattern matching algorithms based on character comparison.Firstly, the KMPP, an improved algorithm based on the KMP, is proposed. It combines the advantages of the KMP algorithm with BM algorithm together. When a mismatch occurs at pattern position j, the pattern moves to right for the distance of the value of next[j]. If at this time, the last character of pattern does not match with the corresponding text position, then the bad character rule is used. The moving distance of pointer i in the text is a combination of the above two-steps jumping distance, thus the pointer can move farthest at each time. It is an efficient pattern matching algorithm.Secondly, the QSP algorithm, an improvement to the QS algorithm, is proposed. The core idea of QSP algorithm is to find all characters that appear more than once in the pattern string from left to right, and obtain the maxPos value and two arrays in the preprocessing phase. During the matching phase, in order to move the pointer farthest at each time, it moves to right by using the two arrays smartly. Because the preprocessing of the pattern string before the character comparisons, the unnecessary character comparisons in the matching phase are avoided, thus the matching speed is improved greatly.Finally, the third new algorithm is FQSP algorithm which is based on FQS algorithm. The FQS algorithm is an improved algorithm of QS algorithm. In order to move farthest, it gets the value of pos from FQS algorithm, and considers the character of P[pos] with the next character in the pattern. Its efficiency is higher than the FQS algorithm.In the three proposed novel algorithms, the datasets of "gene.txt"?"bible.txt" and "100M.txt" are used in experimental sections. The experimental results are compared with the state-of-the-art algorithms. The efficiency of these three algorithms is presented.
Keywords/Search Tags:Character Comparison, Single Pattern Matching, KMPP Algorithm, QSP Algorithm, FQSP Algorithm
PDF Full Text Request
Related items