On XML query processing

Posted on: 2009-04-15
Degree: Ph.D
Type: Dissertation
University: Southern Illinois University at Carbondale
Candidate: Jiang, Zhewei
Full Text: PDF
GTID: 1448390005950131
Subject: Computer Science
Abstract/Summary:
XML (Extensible Markup Language) has been widely accepted as the standard for data storage, exchange, and integration over the Internet. As applications in XML query analysis and optimization emerge rapidly, XML query processing has become an interesting and challenging topic in XML database research. Finding all occurrences of a given query pattern is the core operation of XML query processing, and it typically involves tree-structured navigation and pattern matching.

Currently proposed query processing algorithms, such as TwigStack, iTwigList and TJFast, provide an effective way to locate and retrieve query matches, but their time and space efficiency still needs improvement. Two major deficiencies of the current algorithms, the potentially large size of the intermediate results and the cumbersome path-merging phase, slow down query processing. The situation is further exacerbated in online applications that require the evaluation process to be reported in a timely and flexible manner.

Our research improves XML query processing in two main respects. First, we propose a novel single-phase holistic query matching method that avoids generating and processing the intermediate results of traditional two-phase algorithms. Because it does not generate individual path matches as intermediate results, our method avoids storing, writing, and reading them, and it eliminates the potentially time-consuming merging operation entirely. Experimental results demonstrate the applicability and advantages of our approach.

Second, we enhance XML query processors with the ability to process queries progressively or incrementally and to continually report partial or estimated results along with the progress of query evaluation. The methodology is founded on sampling. We shed light on how effective samples can be drawn from semi-structured XML data, as opposed to flat-table relational data, and we design several innovative sampling schemes for XML data. The proposed methodology advances XML query processing to the next level, making it more flexible, responsive, user-informed, and user-controllable, to meet emerging needs and future challenges.
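To make the idea of progressive, sampling-based reporting concrete, the following is a minimal sketch and not the dissertation's own sampling scheme, which the abstract does not specify. It assumes the simplest possible design: subtrees are visited in uniformly random order (sampling without replacement), and after each batch the hit rate seen so far is scaled up to the whole document. The names `progressive_count` and `matches_twig`, the toy document, and the pattern `book[author][title]` are all hypothetical.

```python
import random
import xml.etree.ElementTree as ET


def progressive_count(root, unit_tag, predicate, batch_size=2, seed=0):
    """Yield (fraction_processed, estimated_total_matches) after each batch.

    Assumption: sampling units are the subtrees rooted at `unit_tag`,
    visited in uniformly random order.
    """
    units = list(root.iter(unit_tag))
    random.Random(seed).shuffle(units)  # random order = sampling without replacement
    total, seen, hits = len(units), 0, 0
    for start in range(0, total, batch_size):
        batch = units[start:start + batch_size]
        seen += len(batch)
        hits += sum(1 for unit in batch if predicate(unit))
        # Scale the hit rate observed so far up to the whole document.
        yield seen / total, hits / seen * total


def matches_twig(book):
    # Hypothetical twig pattern: a book with both an author and a title child.
    return book.find("author") is not None and book.find("title") is not None


doc = ET.fromstring(
    "<lib>"
    "<book><title>A</title><author>X</author></book>"
    "<book><title>B</title></book>"
    "<book><title>C</title><author>Y</author></book>"
    "<book><author>Z</author></book>"
    "</lib>"
)

reports = list(progressive_count(doc, "book", matches_twig))
for progress, estimate in reports:
    print(f"{progress:.0%} processed, estimated matches: {estimate:.1f}")
```

Early estimates depend on which subtrees happen to be drawn first, but once every unit has been processed the estimate coincides with the exact count. A user watching the reports could stop the evaluation early if the running estimate is already good enough.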
Keywords/Search Tags: XML, Data