Font Size: a A A

Approximate Pattern Matching With Non-overlapping Constraints

Posted on:2016-05-11Degree:MasterType:Thesis
Country:ChinaCandidate:S S LiFull Text:PDF
GTID:2348330536987053Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Pattern matching(also called string matching)is one of the essential problems in computer science.Pattern matching has been widely used for biological research,music information retrieval,sequential pattern mining and other fields.One of the cases is called loose matching,which means only considering the matching position of the last pattern substring in the sequence.One more challenging problem is considering the matching positions of each character in the sequence,called strict pattern matching which is one of the essential tasks of sequential patterns mining with gap constraints.The matching result of approximate pattern matching can exist for the sequence character does not match with the pattern substring.Approximate pattern matching provides the better matching flexibility.In this paper,we study the approximate pattern matching with Hamming distance constraints.Non-overlapping constraints requires using the different characters in the same location for any of the two occurrence.A non-overlapping set is a subset of all the occurrences of pattern P in sequence S and any two occurrences in the set are non-overlapping.The problem of pattern matching under the non-overlapping condition refers to finding the maximum non-overlapping set C,i.e.a non-overlapping set with the largest number of occurrences.The present study of pattern matching is exact match more.Exact matching requires each character must match,to some extent,exact matching limit the flexibility of the match.Therefore,we address Approximate Pattern matching with Non-Overlapping constraints,named APNO(Approximate Pattern matching with Non-Overlapping constraints).We use nettree structure to solve the problem and we propose NETLOR and NETROL.We also propose NETLORE and NETROLE by improving NETLOR and NETROL.The algorithm transforms the pattern matching problem into a Nettree and iterates to find the root-leaf-path,to prune the useless nodes in the Nettree after removing the root-leaf-path.Extensive experimental results not only validate the correctness of the four algorithms proposed in this paper,but also suggest NETLORE is more efficient than others.
Keywords/Search Tags:Pattern Mining, Non-Overlapping, Approximate, Nettree
PDF Full Text Request
Related items