Font Size: a A A

Research On Query Processing For XML Twig Pattern Based On Ordered Pair

Posted on:2011-01-05Degree:MasterType:Thesis
Country:ChinaCandidate:R WangFull Text:PDF
GTID:2178360305995672Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
With the development of Internet, Semi-structured data are increasingly important in information exchange. How to accurately and efficiently query XML data has become a hot issue. XML documents can use a nested document tree to represent and the query path can be expressed as a twig pattern tree. Therefore, the query XML data from the XML document tree is to find out all the XML data fragments to meet the twig pattern. This process is called the twig pattern query.In recent years, researchers made a lot of inquiries twig pattern matching algorithms:TwigStack algorithm and the recently proposed TwigList and TwigNM algorithm. Twig pattern contains parent-child edges and ancestor-descendant edges. These algorithms are very effective when the twig pattern contains only ancestor-descendant edges, but when the twig pattern contains only parent-child edges or both ancestor-descendant edges and parent-child edges, these algorithms may still generate a lot of intermediate results, in particular, the size of the input and output is large.Aiming at shortcomings of present algorithm, by combining ViST algorithm idea using string matching query does not require the structural join, and Twig2Stack algorithm idea using bottom-up and without the merger, in this paper, two kinds of twig pattern matching algorithms PCTwig and OPTwig based on ordered pair. The main contributions are summarized as follows:(1) A new idea based on ordered pair is proposed, by the establishment of ordered pair node and node can link better. Matching query by ordered pair in the query tree and document tree.(2) Aiming at three nodes of twig pattern:root node, intermediate nodes and leaf nodes, three different matching methods are proposed. Also under the two kinds of relationships between nodes in the twig pattern:parent-child relationship and ancestor-descendant relationship, MatchPC and MatchAD function is constructed.(3) Two new algorithms PCTwig and OPTwig are proposed, and document tree and query tree structures for the storage are required. Query tree bottom-up storage, tagging when confronted with the branch node. This query process can be judged on the branch in order to avoid useless node generation.(4) Two new algorithm with the classical algorithm TwigStack and TwigStackList algorithm are compared in my experimental system. The experimental results show that PCTwig and OPTwig algorithm are efficient.
Keywords/Search Tags:XML, Twig pattern matching, Ordered pair, Parent-child relationship, Ancestor-descendant relationship
PDF Full Text Request
Related items