Query Optimization Based On XML Index And Cache Technology

Posted on: 2009-01-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Zhang
Full Text: PDF
GTID: 1118360272458844
Subject: Computer software and theory
Abstract/Summary:
XML was introduced with the development of the Internet and has become one of the standards for data exchange and representation on the Web. It is widely applied in many fields, and much research has been done on it, with significant progress in XML data storage, XML query processing, XML indexing and related areas. However, current XML query engines suffer from inefficient adaptation and poor query reuse, and they lack support for answering a set of queries simultaneously.

In this dissertation, we survey XML indexing technologies. We introduce the concepts of XML indexing, describe typical methods for constructing XML indexes, analyze the problems of existing indexes, and summarize the research problems for the different kinds of XML index, which include text indexes, element indexes, path indexes and sequence indexes. We also summarize the background and development of these indexes and discuss the future directions and challenges of XML indexing. In addition, we discuss XML caching technology, analyze existing caching methods and identify their advantages and disadvantages.

Motivated by these problems in XML indexing and caching, we study the following topics: creating a novel adaptive index for XML documents, introducing an efficient XML index for answering multiple queries, and designing an efficient cache system for client-server environments. We propose effective algorithms for each topic and test their correctness and efficiency on different kinds of datasets and query workloads. Our research results are valuable both in theory and in practical applications. The main contributions of the dissertation are as follows:

1) Design an adaptive index with efficient adaptation and query performance. An adaptive index adapts its structure to the query workload. The purpose of adaptation is to improve the query performance of frequent queries.
In this dissertation, we propose a novel adaptive index that differs from existing ones in three respects. First, its adaptation process is efficient: the adaptation granularity is a set of element nodes, whereas existing adaptive indexes adapt one element node at a time. Second, by exploiting the containment relation between queries, it triggers only local adaptation, which reduces the adaptation scope and avoids adapting the whole index. Finally, we design efficient query algorithms, in particular for answering infrequent queries using the frequent queries already in the index.

2) Design an efficient index for answering a set of queries simultaneously. Current indexes execute queries one by one. In a client-server environment, several clients send queries to the server, which executes them and sends the answers back. Different clients often ask the same queries, and distinct queries often share common parts. Executing these identical queries or common parts repeatedly causes unnecessary work on the server side. Another problem is that some navigation steps during query processing yield no result, which also adds to the server's load. We first build an index for the XML document that clusters identical paths in the document and improves the ability to filter out no-result queries. We then build an index over the set of queries that clusters identical queries and the common parts among queries. Based on these two indexes, we provide a novel evaluation method that executes a set of queries simultaneously. The evaluation uses hash-based joins instead of navigation, which avoids redundant operations and filters out no-result queries.
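The idea of clustering identical document paths and identical queries can be illustrated with a minimal sketch. It assumes queries are simple child-axis paths such as `/lib/book/title`, and it answers them with plain dictionary lookups; it does not reproduce the dissertation's index structures or its hash-based join. The names `build_path_index` and `evaluate_batch`, the sample document and the client labels are hypothetical.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def build_path_index(xml_text):
    """Cluster element nodes by their root-to-node label path.

    Returns a dict mapping a path such as '/lib/book' to the list of
    node ids (pre-order positions) that share that path.
    """
    root = ET.fromstring(xml_text)
    index = defaultdict(list)
    next_id = 0
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        index[path].append(next_id)  # ids follow document (pre-order) order
        next_id += 1
        # Push children in reverse so the first child is visited first.
        for child in reversed(list(node)):
            stack.append((child, path + "/" + child.tag))
    return dict(index)


def evaluate_batch(path_index, client_queries):
    """Answer a batch of (client, query) pairs, evaluating each distinct query once."""
    # Query index (assumed form): group clients by identical query text.
    clients_by_query = defaultdict(list)
    for client, query in client_queries:
        clients_by_query[query].append(client)

    answers = {}
    evaluated = 0
    for query, clients in clients_by_query.items():
        # One hash lookup replaces navigation; a missing key means the
        # query has no result and is filtered out without touching the document.
        result = path_index.get(query, [])
        evaluated += 1
        for client in clients:
            answers[(client, query)] = result
    return answers, evaluated


# Hypothetical sample data for illustration only.
doc = ("<lib><book><title/><author/></book>"
       "<book><title/></book><journal><title/></journal></lib>")
path_index = build_path_index(doc)
batch = [("c1", "/lib/book/title"),
         ("c2", "/lib/book/title"),
         ("c3", "/lib/book/price")]
answers, evaluated = evaluate_batch(path_index, batch)
```

Here the two clients asking `/lib/book/title` share one evaluation, and `/lib/book/price` is rejected by a single lookup because no such path exists in the document.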
Furthermore, we propose a set of optimization methods to boost the query performance of the index.

3) Design an efficient cache system for XML query processing. Caching is one of the important techniques for accelerating query processing. We design a novel caching system. We first provide more relaxed answerability criteria, which improve the hit rate of the system. We then propose efficient view selection and view evaluation methods based on these criteria. View selection needs to scan the cache only once to find a suitable view among millions of views, and view evaluation proceeds in an upward and downward manner with the help of a compact XML summary. Finally, we provide a set of optimization methods for the caching system.

In short, the XML indexing and caching techniques we propose are valuable for query optimization in XML databases.
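To illustrate how a relaxed answerability criterion can raise the hit rate, the sketch below treats a cached view as usable whenever its path is a step-wise prefix of the query, instead of requiring an exact match. This criterion is an assumption chosen for illustration; the abstract does not state the dissertation's actual criteria, and the sketch covers only single-scan selection and downward evaluation, without the XML summary. The names `select_view` and `answer_from_cache` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET


def steps(path):
    """Split '/a/b/c' into ['a', 'b', 'c']."""
    return [s for s in path.split("/") if s]


def select_view(cache, query):
    """Scan the cache once and return the longest view path that is a
    step-wise prefix of the query, or None on a cache miss."""
    q = steps(query)
    best = None
    for view in cache:
        v = steps(view)
        if q[:len(v)] == v and (best is None or len(v) > len(steps(best))):
            best = view
    return best


def answer_from_cache(cache, query):
    """Answer a child-axis path query from cached view results, if possible."""
    view = select_view(cache, query)
    if view is None:
        return None  # cache miss: the query must go to the server
    # Evaluate the steps not covered by the view, downward from the cached nodes.
    nodes = cache[view]
    for tag in steps(query)[len(steps(view)):]:
        nodes = [child for node in nodes for child in node if child.tag == tag]
    return nodes


# Hypothetical sample data for illustration only.
root = ET.fromstring(
    "<lib><book><title>A</title></book>"
    "<book><title>B</title><year>2009</year></book></lib>"
)
cache = {"/lib": [root], "/lib/book": root.findall("book")}
titles = answer_from_cache(cache, "/lib/book/title")
years = answer_from_cache(cache, "/lib/book/year")
miss = answer_from_cache({"/lib/book": root.findall("book")}, "/lib/journal")
```

With exact-match caching, `/lib/book/title` would miss. Under the prefix criterion it is answered from the cached `/lib/book` view, which is chosen over `/lib` because it covers more of the query.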
Keywords/Search Tags: XML index, XML cache, query processing, query optimization