
Application semantics driven query processing

Posted on: 2007-02-24
Degree: Ph.D.
Type: Dissertation
University: University of California, Santa Barbara
Candidate: Li, Huagang
Full Text: PDF
GTID: 1448390005972486
Subject: Computer Science
Abstract/Summary:
Traditionally, query processing in relational database systems depends primarily on syntactic analysis, which may not generate efficient solutions for diverse database applications. Nevertheless, many database applications carry semantic knowledge that can be exploited to customize query processing and achieve greater performance gains. In this dissertation, we study how to integrate application semantics to improve query processing in three prevalent database applications. (i) Data warehouses with multi-append-only-trend data: Data warehouses maintain historical data for the discovery of trends to support decision-making activities. In a data warehouse, multiple time-related attributes are usually used to describe data items, and their values tend to increase over time. This tendency is referred to as the multi-append-only-trend property. We show how taking advantage of this property can improve the performance of range aggregate queries, an essential approach for summarizing large data sets. (ii) XML databases with tree-shaped data: XML has become the standard for data exchange across different enterprises due to its expressive tree-structured data. An XPath expression, a fundamental building block of queries over XML databases, often involves content and structure predicates. We focus on XPath expression queries with content range predicates and study how to build an efficient summarization index to facilitate query processing by exploiting the semantic relationship between the range attributes and the corresponding path structures of XML data. (iii) Streaming databases with punctuation data: Punctuation has recently been introduced as a new streaming semantics to address the stateful problem of relational operators when adapted to data stream processing systems. We consider how a given continuous join query (CJQ) can benefit from a set of available punctuation schemes. In other words, if one can identify that a CJQ requires unbounded storage, then this query can be flagged as unsafe and prevented from running. We provide necessary and sufficient conditions for checking whether a CJQ can be safely executed under a given set of punctuation schemes by introducing a novel graph construct, the punctuation graph. We show that the safety checking problem can be solved in polynomial time based on this punctuation graph construct.
Keywords/Search Tags: Query processing, Data, Punctuation, Semantics, XML