Font Size: a A A

Semi-Approximate Pattern Matching Algorithm Based On BPM-BM

Posted on:2011-01-25Degree:MasterType:Thesis
Country:ChinaCandidate:Y H LiuFull Text:PDF
GTID:2178330338981942Subject:Computer technology
Abstract/Summary:PDF Full Text Request
This paper want to solve one type of problem on pattern matching: some parts of the pattern should be matched exactly, while others should be matched approximately. We bring a concept of Semi-Approximate pattern for this kind of problem. This problem is practical in many fields, however there are few studies on it, especially for large text. We construct the layout of Semi-Approximate pattern, so we can translate the problem to regular approximate pattern matching. Then we studied the property of Semi-Approximate pattern, and analyzed the usages and functions of various types of characters in it. Next we explored several bit-parallelism and/or filtration approach on existing approximate pattern matching algorithm. After studied the special property of Semi-Approximate pattern, we remarkably improved the performance of BPM-BM algorithm, which combined BPM bit-parallelism and BM filtration and adaptable for large alphabet, and present a single Semi-Approximate pattern matching algorithm named BPM/SA-BM. Further, we extent BPM/SA-BM to MBPM/SA-BM, a multiple Semi-Approximate pattern matching algorithm, built on MBPM-BM, a multi-pattern matching algorithm derived from BPM-BM. We overcome several negative factors in algorithm performance when design the presented algorithm, such as large text, long pattern length, large difference permitted or difference ratio. We show that the resulting algorithms is optimal than previous algorithms at all situation. Moreover, we can even get more better performance by using the resulting algorithms for regular approximate pattern matching, with little limitation on the regular approximate pattern. Additionally, we also present a solution to cut the space overhead of MBPM-BM.
Keywords/Search Tags:"Semi-Approximate pattern", "Approximate pattern matching", "Multiple pattern matching", "Semantic based search", Bit-parallelism filtration, BPM-BM
PDF Full Text Request
Related items