Efficient XPath query processing in native XML databases

Posted on: 2009-03-06
Degree: Ph.D
Type: Thesis
University: The Chinese University of Hong Kong (Hong Kong)
Candidate: Tang, Nan
Full Text: PDF
GTID: 2448390002494487
Subject: Computer Science
Abstract/Summary:
As XML (eXtensible Markup Language) becomes a universal medium for data exchange over the Internet, efficient XML query processing has become the focus of considerable research and development activity. This thesis describes work toward efficient XML query evaluation and optimization in native XML databases.

An XML query can be decomposed into a sequence of structural joins (e.g., parent/child and ancestor/descendant) and content joins. Structural join optimization is therefore key to improving join-based evaluation. We optimize structural joins with two orthogonal methods: the partition-based method exploits the spatial properties of XML encodings by projecting them onto a plane, and the location-based method improves structural joins by accurately pruning all irrelevant nodes, that is, nodes that cannot produce results.

XML indexes are widely studied as a means to evaluate XML queries and, in particular, to accelerate join-based approaches. Index-based approaches outperform join-based approaches (e.g., holistic twig join) if the queries match the index. Existing XML indexes support only a small set of XML queries because of the variety of XML query forms: an XML query may involve the child axis only, both the child axis and branches, or an additional descendant-or-self axis that appears only at the query root. We propose novel indexes that efficiently support a much wider range of XML queries (with /, //, [], *).

A general XML index can itself be sizable, leading to low efficiency. To alleviate this problem, a database system can index frequently asked queries, which are referred to as views. Answering queries using materialized views is always cheaper than evaluating them over the base data. Traditional techniques consider only a single view when answering a query. We instead exploit the potential relationships among multiple views, which can be used together to answer a given query. Experiments show that significant performance gains can be achieved by using multiple views.
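To make the notion of a structural join concrete, the following is a minimal sketch of a baseline ancestor/descendant join. It assumes a region encoding in which each node carries (start, end, level) numbers and one node contains another exactly when its interval encloses the other's. It uses a generic stack-based merge, not the partition-based or location-based methods of the thesis, and the names `Node` and `structural_join` as well as the sample document are illustrative assumptions.

```python
from typing import List, NamedTuple, Tuple


class Node(NamedTuple):
    # Assumed region encoding: start/end come from a document-order
    # traversal, level is the depth of the node in the XML tree.
    start: int
    end: int
    level: int
    tag: str


def structural_join(
    ancestors: List[Node],
    descendants: List[Node],
    parent_child: bool = False,
) -> List[Tuple[Node, Node]]:
    """Join two node lists, each sorted by start.

    Returns (a, d) pairs where a is an ancestor of d, or a parent of d
    when parent_child is True.
    """
    results = []
    stack = []  # chain of nested candidate ancestors, outermost first
    i = 0
    for d in descendants:
        # Push every candidate ancestor that starts before d.
        while i < len(ancestors) and ancestors[i].start < d.start:
            a = ancestors[i]
            # Drop candidates whose region closed before a begins.
            while stack and stack[-1].end < a.start:
                stack.pop()
            stack.append(a)
            i += 1
        # Drop candidates whose region closed before d begins.
        while stack and stack[-1].end < d.start:
            stack.pop()
        # Every node left on the stack encloses d.
        for a in stack:
            if not parent_child or a.level + 1 == d.level:
                results.append((a, d))
    return results


# Hypothetical document: <a><b><c/></b><c/></a>
a = Node(1, 8, 1, "a")
b = Node(2, 5, 2, "b")
c1 = Node(3, 4, 3, "c")
c2 = Node(6, 7, 2, "c")

anc_desc = structural_join([a], [c1, c2])                        # //a//c
parent_only = structural_join([a], [c1, c2], parent_child=True)  # //a/c
```

In this sketch each input list is scanned once, and nodes are discarded as soon as their region has closed. Skipping nodes that cannot contribute to any result is the kind of work that the pruning described above aims to reduce further.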
Keywords/Search Tags: XML, Query, Efficient, Views