
Key Technologies for Storage and Indexing in Native XML Databases

Posted on: 2010-08-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Wang
GTID: 1118360302457734
Subject: Computer application technology
Abstract/Summary:
XML has become the de facto standard for representing and exchanging data on the Web, and how to manage large repositories of XML documents effectively is now an urgent research topic in the database community. Traditional relational databases are poorly suited to large-scale XML data management because of the differences between the relational and XML data models. A native XML database is built around the XML data model: it provides a storage scheme, index structures and a query engine tailored specifically to XML, so it can manage XML documents naturally and avoids the inherent drawbacks of handling XML with a traditional RDBMS. The primary problem in developing a native XML database is redesigning and reimplementing the storage scheme and index structures so that they embody the XML data model and support efficient XML query processing and data update. That problem is the subject of this thesis.

This thesis first presents a novel storage scheme for native XML databases, called XN-Store. The scheme stores XML nodes directly as records in a paged file, which forms the primary index of the native XML store and implements a persistent document object model, thus retaining the original tree structure of the XML data. XN-Store not only reduces the storage overhead of XML documents but also provides fast export of and access to XML nodes. Moreover, as a general-purpose native XML storage scheme, XN-Store supports the creation of various secondary indexes to improve the efficiency of XML query processing.
Our experimental results show that XN-Store is a high-performance storage scheme for native XML databases.

Based on the XN-Store storage scheme, this thesis presents a basic framework of index structures for native XML databases, comprising XML structural indexes, XML value indexes and XML full-text indexes, which together meet the basic requirements of XML query processing. Structural indexes accelerate the structural relationship constraints of an XML query, while value indexes and full-text indexes accelerate its content predicate constraints. The structural summary index is an important kind of XML structural index. This thesis proposes a new structural summary index, rs_index, which stores the steps of a label path as a key in reverse order, so that a path query with an initial "//" axis can be converted into an efficient prefix match on a B+-tree. With the support of rs_index, the evaluation of a simple path expression can take full advantage of consecutive parent-child axes as query context information to prune a large amount of unnecessary search space.

This thesis then presents an algorithm for generating a reduced query tree. Based on the basic framework of native XML index structures, the algorithm simplifies the query tree to reduce the number of query nodes, which effectively decreases the overhead of structural join operations. The algorithm also enables a unified approach to evaluating structural relationship constraints and content predicate constraints. The experiments show that the algorithm improves the average evaluation efficiency of XPath path expressions by one order of magnitude.

This thesis also proposes an update strategy for the XN-Store storage scheme and the various types of index structures.
The dynamic XML tree numbering scheme BSC uses the properties of binary fractional numbers to solve the node insertion problem. XN-Store's update mechanism not only maintains document order but also confines each update operation to a single page to keep updates efficient. When XML data is updated, the system automatically maintains the various types of XML indexes. In addition, to validate XML updates, this thesis gives a storage format for DTDs in a native XML database and proposes a new incremental validation method.

In this thesis, a series of experiments is conducted to compare and verify the performance of the XN-Store storage scheme and the XML index framework. The experiments show that efficient XML query processing and XML data update can be achieved with the proposed storage scheme and index structures.
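
The reverse-order key idea behind rs_index, as described in the abstract, can be illustrated with a small sketch. This is not the thesis implementation: a sorted Python list stands in for the B+-tree, and the names `reversed_key`, `ReversePathIndex` and `match_descendant_query`, the trailing "/" terminator and the sample label paths are all assumptions made for illustration. It only shows why reversing the steps of a label path turns a query with an initial "//" axis followed by parent-child steps into a prefix range scan.

```python
from bisect import bisect_left


def reversed_key(label_path):
    """Turn '/site/people/person/name' into 'name/person/people/site/'.

    The trailing '/' (an assumption of this sketch) closes every step,
    so the prefix 'name/' cannot accidentally match a step called 'names'.
    """
    steps = [s for s in label_path.split("/") if s]
    return "/".join(reversed(steps)) + "/"


class ReversePathIndex:
    """Toy stand-in for a B+-tree: a sorted list of reversed label-path keys."""

    def __init__(self, label_paths):
        # Each distinct root-to-node label path is indexed once.
        self._entries = sorted((reversed_key(p), p) for p in set(label_paths))
        self._keys = [key for key, _ in self._entries]

    def match_descendant_query(self, query):
        """Answer a query such as '//person/name' with one prefix range scan.

        Only an initial '//' followed by parent-child steps is supported.
        """
        if not query.startswith("//") or "//" in query[2:]:
            raise ValueError("expected an initial '//' followed by child steps")
        prefix = reversed_key(query[2:])
        # Keys sharing a prefix are contiguous in sorted order, so seek to
        # the first candidate and scan until the prefix stops matching.
        start = bisect_left(self._keys, prefix)
        matches = []
        for key, path in self._entries[start:]:
            if not key.startswith(prefix):
                break
            matches.append(path)
        return matches


# Hypothetical label paths, loosely modeled on an auction-style document.
paths = [
    "/site/people/person/name",
    "/site/regions/item/name",
    "/site/people/person/names",
    "/site/people/person/address/city",
]
index = ReversePathIndex(paths)
person_names = index.match_descendant_query("//person/name")
all_names = index.match_descendant_query("//name")
```

In this sketch, each additional parent-child step in the query lengthens the prefix and narrows the scanned key range, which is one way to read the abstract's remark that consecutive parent-child axes serve as context for pruning the search space.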
Keywords/Search Tags:native XML database, XML storage scheme, XML index structure, XML query processing, XML data update