Font Size: a A A

Research On Database Query Optimization Techniques Based On XML

Posted on:2015-01-20Degree:MasterType:Thesis
Country:ChinaCandidate:Y LuoFull Text:PDF
GTID:2268330428976599Subject:Software engineering
Abstract/Summary:PDF Full Text Request
With the rapid development of Internet and wide range application of e-commerce, XML has become one of the standard of data storage and exchange on Internet. Because of the characteristics of XML data, how to store, query and optimize XML data is still a topic worth in-depth research, while XML coding and structure connection of XML data are two major parts of the query optimization process of XML data.At present, a variety of XML data encoding can support the query of structure. This thesis analyzed the advantages and disadvantages of these coding schemes, and proposed a new code based on sequence of complete binary tree. The new code has simple form, using only an integer to express the code for a node. Utilizing the properties of a complete binary tree, it can quickly determine ancestor/descendant relationships. Because the new code needs to convert the XML document into a binary tree, in order to accommodate the child nodes of a node to the same level, the level of child nodes has to be moved down, which leads not to well support the determination of paternity and sibling relationship. For this, the thesis optimized the new coding scheme by using a tuple with two elements to represent information of a node, instead of an integer. Although this optimization will increase a few of storage costs, it can efficiently determine the parent-child relationships and sibling relationships. Experiments show the optimized new coding scheme has a good time performance while maintaining a simple encoding form.In addition, the thesis analyzed vertical partitioning algorithm of PBiTree coding, and the new method of dividing encoding space, and put forward an expanded structure join algorithm based on the new coding scheme. The main idea of the new algorithm is to divide the sets of ancestors and descendants into subsets such that the subsets can be processed simultaneously in the random-access memory (RAM), and to eliminate duplication of set partitioning. If any of the subsets is empty, since the result of structure connection must also be empty, will not to be processed. Finally, by joining the results of the subsets, the results of the entire connection can be generated. The novel algorithm does not require nodes of the sets to be ordered or indexed, thus reduces the time and space overhead. Experiments show that the new algorithm has high efficiency.
Keywords/Search Tags:XML, XML querying, XML encoding, Structure join, Complete binary tree
PDF Full Text Request
Related items