Font Size: a A A

Research Of XML Query Algorithm Based On Twig Pattern Matching

Posted on:2015-05-24Degree:MasterType:Thesis
Country:ChinaCandidate:Q P ZhangFull Text:PDF
GTID:2298330422980964Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
The XML technology has the advantages of self-descriptive characteristic, flexible datastructure and rich representation ability. So it’s widely used in the web applications. With thegrowth of XML data quantity, the twig pattern matching algorithm has become a hot problem forthe XML database researches. Coding scheme can determine the structure relationships betweennodes in the XML documents, so it can help us to greatly improve the efficiency of XML queryoperations. Since the early query algorithms of XML results in useless intermediate results,based on the PathStack algorithm and TwigStack algorithm, the improved algorithms are deeplyresearched in the paper.Firstly,Traditional twig pattern matching algorithms for XML document have theirdefects.For an example, the algorithms put all the nodes whose label name is the same with thatof pattern tree into memory, resulting in high space complexity. To improve the performance oftwig pattern matching, a novel algorithm without merging called ListFWM(List Filter WithoutMerging) is proposed in the paper. ListFWM algorithm uses the method of the region code andfilters useless intermediate nodes by the structural relationship between nodes. Experimentalresults show that the algorithm ListFWM improves performance of query processing comparedwith TwigStack. Secondly, in view of the traditional algorithm need to encode the wholedocument, resulting in the greatly reduced query efficiency,so a novel algorithm calledLocalPathStack is proposed,which is based on PathStack algorithm and is the core content ofTwigStack algorithm.On this basis, a novel algorithm called PathList applicable to XML streamdata is proposed, and according to the characteristics of the XML stream data processed, thisalgorithm was optimized by methods of filtering query root. Finally, based on the comparativeexperiments of documents with the different file size and of different query path length, it showsthat although PathList algorithm and its optimization algorithm’s time complexity is slightlybetter than LocalPathStack algorithm and its optimization algorithm, their space complexity isgreatly reduced.Ultimately a novel Twig pattern matching algorithm called StreamTwigListapplicable to XML stream is proposed, and simultaneously an application case of algorithm thepaper researched in the ammunition design software is given.
Keywords/Search Tags:XML, region encoding, DOM, path pattern, twig pattern, stream model
PDF Full Text Request
Related items