Font Size: a A A

Research On Query Processing Technology For XML Data Based On HoListic Twig Pattern

Posted on:2009-04-05Degree:MasterType:Thesis
Country:ChinaCandidate:L Z FuFull Text:PDF
GTID:2178360272463521Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
As an actual standard of Internet data exchange and representation, XML is being widely accepted and adopted. With its ever-growing popularity, domestic and foreign scholars pay more and more attentions to XML data management and query processing. In XML database, twig pattern matching is the core of XML query processing and is important for improving the query efficiency.In recent years, many twig pattern matching algorithms have been proposed, such as TwigStack, TJFast, TwigList and Twig2Stack. These algorithms are efficient for queries only with ancestor-descendant relationship, but are unefficent for queries with parent-child relationship and a few return nodes.After deeply studing the remaining difficult issues in XML query processing, two based on twig pattern efficient algorithms TwigNM and TwigPC are proposed to resolve these difficult problems. The main contributions are summarized as follows:1. Aiming at the query with a few return nodes:(1). In this thesis, a new twig pattern is proposed, which is more general and meets more standard query languages than the twig pattern proposed in [15].(2). In this thesis, non-merging TwigNM and TwigNME are proposed, which are efficient twig pattern matching algorithms. Using simple Stack, they can skip a large number of non-extension nodes, avoid a lot of intermediate results, need not merge Path matching results, and deal with the query with return node efficiently. But the algorithm efficiency will be greatly affected by parent-child relationship.(3). In this thesis. TwigNM. TwigNME, TwigStack and TwigList are realized in my experimental system TwigNM. The experimental results show that TwigNM and TwigNME are much more efficient than typical algorithm TwigStack and the latest algorithm TwigList for the querie only with ancestor-descendant relationship.2. Aiming at the query with parent-child relationship:(1). To further improve the efficiency of processing the query with parent-child relationship, Zhang Encoding is extended in this thesis.(2). A new data structure List-Stack is proposed in this thesis. It is composed of List and Stack, and it is simple and easy to operate. (3). In this thesis, non merging TwigPC is proposed. Using simple List-Stack and Extended-List, it can avoid a lot of intermediate results, need not merge Path matching results, and can process the query with parent-child relationship efficiently. In addition, the algorithm can process the query with a few return nodes.(4). In this thesis, TwigNM, TwigNME, TwigStack and TwigList are realized in my experimental system TwigNM. The experimental results show that TwigPC are much more efficient than TwigStackList and TwigList for queries with parent-child relationship.
Keywords/Search Tags:XML document, query processing, hoListic twig pattern, twig pattern matching, parent-child relationship
PDF Full Text Request
Related items