
The role of structural aggregation for query processing over XML data

Posted on: 2008-10-04
Degree: Ph.D
Type: Dissertation
University: University of California, Riverside
Candidate: Moro, Mirella Moura
Full Text: PDF
GTID: 1448390005458923
Subject: Computer Science
Abstract/Summary:
With the advent of XML as the basis for many data-centric applications, issues regarding the effective retrieval of XML data have become prevalent. In this context, XML query evaluation presents unique challenges, mainly because existing relational query algorithms cannot be applied directly to XML data, for several reasons: XML data conform to a tree format rather than a tabular one, do not follow a strict schema, and are typically textual with repetitive information. A number of data structures, known as structural summaries, have been defined to compensate for the repetition and lack of schema in XML data. So far, these summaries have been explored mainly as secondary indexes that identify the nodes reachable through specific path patterns. This dissertation shows that such summaries can also suggest new data clustering and partitioning policies that are very beneficial for XML processing. Even though this aspect has started to receive some attention, there is not yet a comprehensive study of summaries as a data clustering technique or of their partitioning properties with respect to XML query processing. Furthermore, various questions about the behavior of structural summaries when processing both stored data and data streams remain open.

Therefore, this dissertation examines query processing over XML data by exploring and extending the role of the structural aggregation properties provided by the summaries. Specifically, it evaluates and proposes algorithms for processing path queries over the partitions defined by the summaries. It shows how the summaries can be employed as access methods and discusses the advantages and drawbacks of using them in that role. It considers the typical query evaluation scenario of processing stored documents and returning the document nodes that satisfy a query (XPath semantics). Finally, it takes the role of structural aggregation one step further and shows how the summaries can improve the performance of stream processing in the context of XML filtering. The overall objective is to show that structural aggregation methods can be employed efficiently in a variety of scenarios that are far more complex than traditional secondary path indexing.
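As a rough illustration of what a structural summary does, the sketch below groups the elements of a small XML document by their root-to-node label path, so that a simple path query becomes a lookup into one partition instead of a traversal of the whole tree. This is not the dissertation's algorithm; it is a minimal path-summary sketch, and the function name `build_path_summary` and the sample document are hypothetical, introduced only for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_summary(root):
    """Group every element by its root-to-node label path (hypothetical helper)."""
    summary = defaultdict(list)
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        summary[path].append(node)
        # Push children in reverse so they are popped in document order.
        for child in reversed(list(node)):
            stack.append((child, path + "/" + child.tag))
    return summary


# Sample document (an assumption for illustration only).
DOC = """<library>
  <book><title>A</title><author>X</author></book>
  <book><title>B</title><author>Y</author><author>Z</author></book>
  <journal><title>C</title></journal>
</library>"""

root = ET.fromstring(DOC)
summary = build_path_summary(root)

# The path query /library/book/title is answered from a single partition.
titles = [e.text for e in summary.get("/library/book/title", [])]
authors = [e.text for e in summary.get("/library/book/author", [])]

for path, nodes in sorted(summary.items()):
    print(path, len(nodes))
```

Each distinct label path forms one partition, so repeated structure in the document collapses to a single summary entry, and the partitions together cover every element exactly once.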
Keywords/Search Tags: XML, Structural aggregation, Processing, Query, Role, Over, Summaries