Research On Query Processing Technology For XML Data Based On Holistic Twig Pattern

Posted on: 2012-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: W Y Wu
GTID: 2218330338951650
Subject: Computer application technology
Abstract/Summary:
Since the eXtensible Markup Language (XML) appeared in 1998, this semi-structured markup language has been widely used in many fields, and it has emerged as a standard data exchange format on the Internet. With the increasing popularity of XML, the effective management of XML data, including storage mechanisms, encoding, query processing, query optimization, and indexing, has attracted the attention of many researchers around the world. Query processing over XML data has become an important issue in XML data management.

In order to query XML data efficiently, a variety of XML encoding schemes and query matching algorithms have been proposed. Existing methods for XML query matching can be divided into two categories: containment join and holistic twig pattern matching. Containment join decomposes the twig pattern into a series of simple sub-trees, each composed of two nodes connected by a parent-child (P-C) or ancestor-descendant (A-D) edge; it matches every simple sub-tree and then joins the intermediate results to answer the twig pattern query. Holistic twig pattern matching treats the twig pattern as a whole, so it can greatly reduce the number of intermediate results. In this paper, we focus on twig pattern query processing algorithms.

Considering that existing query processing algorithms generate many useless intermediate results, we propose a new twig pattern matching algorithm, called PSBDirect, which is based on the PSB encoding and makes full use of the properties of prefix and prime encoding to quickly determine the positional relationship between nodes.
PSBDirect prunes nodes that cannot participate in the final results while it traverses the nodes, so it can markedly improve the efficiency of twig pattern query processing.

In addition, XML database systems often need to execute many twig pattern queries at the same time, and these queries may share similarities. Motivated by this, some researchers have studied multiple XML twig pattern queries. In this paper, by restructuring the multiple twig patterns, we improve the TJFast algorithm so that it evaluates multiple twig pattern queries simultaneously. The improved algorithm greatly reduces the time spent traversing the XML documents and thereby improves query efficiency.
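
The abstract does not define the PSB encoding itself, so the following is only a generic sketch of how prefix-style and prime-style node labels can decide ancestor-descendant (A-D) and parent-child (P-C) relationships without touching the document. It assumes Dewey-like prefix labels (tuples of child positions) and a top-down prime labelling in which a node's label is its parent's label multiplied by a unique prime; the function names and the tiny example tree are hypothetical and are not taken from the thesis.

```python
from typing import Tuple

Dewey = Tuple[int, ...]  # assumed prefix label: path of child positions from the root


def prefix_is_ancestor(a: Dewey, d: Dewey) -> bool:
    """a is a proper ancestor of d iff a is a strict prefix of d."""
    return len(a) < len(d) and d[:len(a)] == a


def prefix_is_parent(p: Dewey, c: Dewey) -> bool:
    """p is the parent of c iff p is a prefix of c and c is exactly one level deeper."""
    return len(p) + 1 == len(c) and c[:len(p)] == p


def prime_is_ancestor(a_label: int, d_label: int) -> bool:
    """With label(node) = label(parent) * unique_prime(node), a proper
    ancestor's label always divides the descendant's label."""
    return a_label != d_label and d_label % a_label == 0


# Hypothetical tree: bib -> book1 -> title, and bib -> book2
bib, book1, title, book2 = (1,), (1, 1), (1, 1, 1), (1, 2)
bib_p, book1_p, title_p, book2_p = 2, 2 * 3, 2 * 3 * 5, 2 * 11
```

Both tests need only the two labels being compared, which is the kind of property that lets a matching algorithm discard non-contributing nodes early.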
Keywords/Search Tags: XML, Twig Pattern, Query Processing