
Research On Pattern Matching In Trees In Multi-dimensional Space

Posted on: 2019-11-17
Degree: Master
Type: Thesis
Country: China
Candidate: W J Ge
GTID: 2428330548985958
Subject: Computer technology
Abstract/Summary:
Patterns are one form of knowledge representation in computer science. In data mining, patterns are a standard way to represent the hidden knowledge mined from a data set. Compared with traditional patterns, those used in data mining are more complex, because they usually represent complex knowledge schemas, especially knowledge mined from data in multi-dimensional space. In practical applications, operations such as matching, mining, and combination are often performed on patterns. However, while many effective matching algorithms exist for traditional patterns, there are currently no effective algorithms for this kind of pattern. This thesis studies pattern matching in trees defined in multi-dimensional space, with a view to making tree patterns practically applicable. The main contributions of this thesis are as follows:

(1) Based on observations of the characteristics of trees defined in multi-dimensional space, the plain multi-dimensional tree matching algorithm, derived directly from traditional tree pattern matching algorithms, is studied. The study shows that probabilities differ among dimensions and that information entropy differs within dimensions. A new matching strategy and algorithm for multi-dimensional space are then proposed. By taking the probability of a successful match into account, the algorithm finds unmatched nodes as early as possible and thereby prunes early. Experimental results show that the new algorithm is more efficient than the plain one, especially for trees with wildcards.

(2) Our studies show that directly applying the algorithm proposed above to the multi-pattern matching problem is inefficient, because a set of multiple tree patterns commonly contains many repeated substructures, and pruning repeated matching is the key to designing a multi-pattern matching algorithm. Therefore, a new multi-pattern matching algorithm for multi-dimensional trees is proposed. The algorithm merges the repeated substructures so that most repeated matching is eliminated. Experimental results show that the proposed algorithm is more efficient than the plain one.
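The two ideas summarized above can be illustrated with a minimal Python sketch. It is not the thesis's algorithm, whose details the abstract does not give. It assumes that each node carries one label value per dimension, that `None` in a pattern label is a wildcard, that a childless pattern node matches any subtree whose root label matches, and that children are otherwise matched positionally. The names `Node`, `dimension_order`, `labels_match`, and `find_matches`, and the per-dimension `mismatch_prob` estimates, are hypothetical. Dimensions most likely to mismatch are checked first so that a failed match is detected early. Repeated subpatterns are handled by plain memoization, used here as a stand-in for the merging step.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

WILDCARD = None  # assumption: None in a pattern label matches any value


@dataclass(frozen=True)
class Node:
    label: Tuple[Optional[str], ...]      # one value per dimension
    children: Tuple["Node", ...] = ()


def dimension_order(mismatch_prob):
    """Return dimension indices, most likely to mismatch first."""
    return sorted(range(len(mismatch_prob)), key=lambda d: -mismatch_prob[d])


def labels_match(p_label, t_label, order):
    """Compare labels dimension by dimension in the given order."""
    for d in order:
        if p_label[d] is not WILDCARD and p_label[d] != t_label[d]:
            return False  # early exit: remaining dimensions are skipped
    return True


def find_matches(patterns, root, mismatch_prob):
    """Return (pattern index, target node) pairs for every match."""
    order = dimension_order(mismatch_prob)
    cache = {}  # (pattern subtree, id of target node) -> bool

    def match(p, t):
        # Structurally equal subpatterns hash equally, so a substructure
        # shared by several patterns is evaluated once per target node.
        key = (p, id(t))
        if key not in cache:
            cache[key] = labels_match(p.label, t.label, order) and (
                not p.children
                or (
                    len(p.children) == len(t.children)
                    and all(match(pc, tc)
                            for pc, tc in zip(p.children, t.children))
                )
            )
        return cache[key]

    hits = []
    stack = [root]
    while stack:
        t = stack.pop()
        for i, p in enumerate(patterns):
            if match(p, t):
                hits.append((i, t))
        stack.extend(t.children)
    return hits
```

In this sketch the early exit in `labels_match` plays the role of early pruning, and the cache removes repeated work only for subpatterns that are structurally identical. A real implementation would need to estimate `mismatch_prob` from the data.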
Keywords/Search Tags: pattern, multi-dimensional tree, pattern matching, multi-pattern matching, frequent pattern