Based On The Serialized Efficient Xml Query Algorithm

Posted on:2009-12-06

Degree:Master

Type:Thesis

Country:China

Candidate:Y K Huang

Full Text:PDF

GTID:2208360272958964

Subject:Computer software and theory

Abstract/Summary:

PDF Full Text Request

With the adoption of XML in a wide range of applications, XML query processing has become an important research topic. Evaluating twig queries efficiently is at the core of structured query processing. To evaluate a twig query, most current approaches disassemble the query into multiple root-to-leaf simple paths; with the assistance of some indexing structures, these simple path queries are evaluated independently; finally, the results of these sub queries are joined to generate the final answer. However, joining intermediate results is an expensive operation. Indeed, it has become the most significant cost in evaluating twig queries.Eliminating join operations dramatically reduces the cost of query evaluation. To avoid expensive joins, sequence-based methods convert both XML documents and queries into sequences, and answer twig queries as a whole through subsequence matching. However, due to the incompatibility between structured data and sequential data, sequence-based methods are under optimized for performance, as their first priority is given to ensuring representation and query equivalences.In this paper, we remove the obstacle hindering the true potential of sequence-based methods. We introduce a novel sequence-based approach, LEO, for XML query processing. In this approach, we present a flexible XML sequencing strategy, this strategy enables us to choose a sequencing method that maximizes the sharing of sequences to maintain a minimal index structure. With the help of a super tree, we can enhance the range label method to get unified 3-tuple labels for document encoding. This is the requirement of repensentation equivalence and the fundation of sequencing flexibility. We also present a flexible query processing method. It performs subsequence matching based on the selectivity of the elements in the sequence. With the data statistics, our algorithm picks the optimal matching order to evaluate a query request. Thus, the sequenced-based approach introduced in this paper can fully and independently optimize both index size and query response time.We implement the algorithms in LEO and conduct extensive experiments. The study of extensive experiments verify the efficiency of this approach. By comparing with both sequence-based and non-sequence-based approaches, the results show that our approach gain the improvements on both the index size and the query response time, and our method has higher query efficiency and substantially lower storage cost than previous approaches.

Keywords/Search Tags:

XML Query Processing, Index Technology, Sequence-based Approach

PDF Full Text Request

Related items

1	Research And Implementation Of XML Query Processing Technology
2	Research On Optimization Of Max-Score Query Processing Technology
3	Research And Implementation Of Spatial Text Data Query Processing Technology
4	Index Compression and Efficient Query Processing in Large Web Search Engines
5	Research On The Key Techniques For XML Index And Query
6	Index Compression And Query Processing In Search Engines
7	Research And Implementation Of Query Technology For SVG
8	Research On The Index Technology Of Semi-structured Data
9	Research On Sequence-based Indexing And Query Processing Technology For Uncertain XML
10	Research On Multidimensional Top-k Query Processing In IoT Sensing Networks