On XML query processing

Posted on:2009-04-15

Degree:Ph.D

Type:Dissertation

University:Southern Illinois University at Carbondale

Candidate:Jiang, Zhewei

GTID:1448390005950131

Subject:Computer Science

Abstract/Summary:

XML (Extensible Markup Language) has been widely accepted as the standard for data storage, exchange, and integration over the Internet. With the rapid emergence of applications in XML query analysis and optimization, XML query processing has become an interesting and challenging topic in XML database research. The core operation of XML query processing is finding all occurrences of a given query pattern, which typically involves tree-structured navigation and pattern matching.

Although currently proposed query processing algorithms, such as TwigStack, iTwigList, and TJFast, provide an effective way to locate and retrieve query matches, their time and space efficiency still needs improvement. Two major deficiencies of current algorithms, the potentially large size of the intermediate results and the cumbersome path-merging phase, slow down query evaluation. The situation is further exacerbated in online applications, which require timely and flexible reporting on the evaluation process.

Our research improves XML query processing in two main respects. First, we propose a novel single-phase holistic query matching method that avoids the generation and processing of intermediate results required by traditional two-phase algorithms. Because it does not generate individual path matches as intermediate results, our method avoids storing, writing, and reading those path matches, and it eliminates the potentially time-consuming merging operation entirely. Experimental results demonstrate the applicability and advantages of our approach. Second, we enhance XML query processors with the ability to process queries progressively or incrementally and to continually report partial or estimated results together with the progress of query evaluation. This methodology is founded on sampling. We shed light on how effective samples can be drawn from semi-structured XML data, as opposed to flat-table relational data, and design several innovative sampling schemes for XML data. The proposed methodology advances XML query processing to the next level: more flexible, responsive, user-informed, and user-controllable, to meet emerging needs and future challenges.
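
To make the two-phase cost that the abstract refers to concrete, the following is a minimal Python sketch of a naive decompose-and-merge twig matcher. It is an illustration only: it is not TwigStack, TJFast, or the single-phase method proposed in the dissertation, and the sample document, the function names (`path_matches`, `two_phase_twig`), and the restriction to a root tag with ancestor-descendant leaf branches are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from itertools import product

# Hypothetical sample document: only book b1 has both a title and an author.
DOC = """
<lib>
  <book id="b1"><title/><author/><author/></book>
  <book id="b2"><title/></book>
  <book id="b3"><author/></book>
</lib>
"""

def path_matches(root, top_tag, leaf_tag):
    """Phase 1: every (top, leaf) pair where leaf is a descendant of top."""
    pairs = []
    for top in root.iter(top_tag):
        for leaf in top.iter(leaf_tag):
            if leaf is not top:  # iter() also yields the element itself
                pairs.append((top, leaf))
    return pairs

def two_phase_twig(root, top_tag, leaf_tags):
    """Match the twig top_tag[//leaf_1][//leaf_2]... in two phases.

    Returns the full twig matches and the number of intermediate
    path matches that phase 1 had to materialize.
    """
    # Phase 1: match each root-to-leaf path of the twig independently.
    per_path = [path_matches(root, top_tag, tag) for tag in leaf_tags]
    intermediate = sum(len(pairs) for pairs in per_path)

    # Phase 2: merge path matches that share the same top element.
    matches = []
    for top in root.iter(top_tag):
        branches = [[leaf for t, leaf in pairs if t is top] for pairs in per_path]
        # product() yields nothing if any branch is empty, so path matches
        # for tops that lack another branch are discarded here.
        for combo in product(*branches):
            matches.append((top, *combo))
    return matches, intermediate

root = ET.fromstring(DOC)
matches, intermediate = two_phase_twig(root, "book", ["title", "author"])
print(len(matches), "twig matches from", intermediate, "intermediate path matches")
# prints: 2 twig matches from 5 intermediate path matches
```

In this toy input, the path matches found under `b2` and `b3` are stored in phase 1 and then thrown away in phase 2. That is the kind of wasted intermediate result and merging work a single-phase method aims to avoid.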

Keywords/Search Tags:

XML, Data

Related items

1	Seismic Achievement Data ETL Platform Architecture Design And Software System Implementation
2	The Research And Application Of Data Preprocessing In XML Data Warehouse
3	Research On Related Issues Of Unstructured Data
4	The Data Integration, Analysis And Utilization For Hospital Information Based On The Data Warehouse
5	Design And Implementation Of Data Mining Support Subsystem Based On Big Data Of Power
6	Design And Implementation Of Environmental Monitoring Data Management System
7	Research On The Problems And Countermeasures Of Domestic Data Journalism Practice
8	Study On Data Dependency-Based Data Quality Processing Techniques In Data Integration
9	Big Data And Research Of Big Data In Modern Internet Applications
10	Design And Implementation Of The Bayonet Data Integration Platform