
Storage, Updating And Retrieval Of XML Data In Relational Database Systems

Posted on: 2004-05-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z C Xu
Full Text: PDF
GTID: 1118360095462826
Subject: Computer software and theory
Abstract/Summary:
XML has become the de facto standard for information publication and exchange on the Web, taking the place of HTML. Compared with HTML, XML is simple and self-describing, and the content, structure and representation of XML documents are independent of one another, which makes XML more suitable for data representation and exchange on the Internet. Recently, XML has been widely used in various applications, and very large volumes of XML data have appeared on the Web. To organize and manage XML data efficiently, different query languages and storage approaches have been proposed. As a viable and promising approach, using an RDBMS to manage XML data has been extensively studied in recent years. However, due to the differences between the two data models, RDBMS-based XML data processing poses numerous challenges for traditional database techniques.

This dissertation studies the storage, updating and retrieval of XML in an RDBMS. In particular, it focuses on the adaptation of the XML storage schema, the normalized storage of XML, constraint-preserving XML updating, the efficient retrieval of XML data, and optimal path index selection for XML. Various new algorithms and techniques are proposed and implemented in a prototype XML-relational database system. A large number of experiments are conducted, and the experimental results show the effectiveness of the proposed approaches. The results of this dissertation can be used in the research and development of XML database products. The contributions of this dissertation can be summarized as follows:

1) We propose techniques for XML storage schema adaptation. The efficiency of an XML management system depends on its storage schema. When the user's queries are given or can be anticipated, designing the storage schema around those queries can improve the efficiency of the system significantly. Based on the query history, the storage schema can be adjusted automatically to improve query-processing efficiency. Four kinds of schema adaptation strategies are given, two of which are used for automatic schema adaptation. Experimental results validate the practicability and effectiveness of the proposed approaches. The techniques can be integrated into the system optimizer.

2) Based on XML keys, which define the semantic constraints of an XML document, we present a storage method that stores XML documents while preserving their key constraints, and we implement the normalized storage of XML in relational databases. Needless redundancy is eliminated, and update anomalies are reduced. The proposed methods are of some significance for future XML processing, and they are the basis for key-constraint-preserving XML updating.

3) Based on XML keys and the constraint-preserving normalized storage of XML in relational databases, we study a novel method for updating XML while preserving key constraints. By propagating XML keys to relations as functional dependencies, we update the XML data and its storage in the relational database at the same time and preserve the consistency between them. We give an annotation technique that can be used to locate the positions of updates in the original document and to update documents efficiently. With this updating technique, XML can fully evolve into a universal data representation and sharing format.

4) We study keyword search for XML in relational databases and present two new inverted list indexes: an extended inverted index based on the containment relationship, and an inverted index based on the schema. The former greatly reduces the space cost by considering the containment relationship between elements. The latter reduces the space cost further by considering the schema of the XML, and at the same time it significantly improves the efficiency of XML data search. Experiments indicate that the schema-based inverted index achieves the best trade-off between space cost and query efficiency. This result is useful for the design of future XML search engines.

5) We study a new index structure, Structural Map, for efficient evaluation of path expressi...
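
As a rough illustration of the idea in contribution 4 that containment between elements can be used to shrink a keyword index, the following minimal Python sketch stores XML elements in a relational table and indexes each word only at the element that directly holds it. This is not the dissertation's index design, which the abstract does not specify. It is a generic parent-pointer layout chosen for illustration, and the table names `node` and `posting` and the functions `shred` and `search` are hypothetical. Ancestors that contain a word are recovered at query time by following parent pointers, so they need no postings of their own.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(xml_text: str, conn: sqlite3.Connection) -> None:
    """Store every element as a row and index only the words in its own text.

    Hypothetical layout: node(id, parent_id, tag, text), posting(word, node_id).
    Text that follows a child element (ElementTree's `tail`) is ignored here.
    """
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, "
        "tag TEXT, text TEXT)"
    )
    conn.execute("CREATE TABLE posting (word TEXT, node_id INTEGER)")
    next_id = 0

    def visit(elem, parent_id):
        nonlocal next_id
        next_id += 1
        node_id = next_id  # ids are assigned in document (pre-order) order
        text = (elem.text or "").strip()
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?)",
            (node_id, parent_id, elem.tag, text),
        )
        # One posting per distinct word, only for the element holding the text.
        for word in set(text.lower().split()):
            conn.execute("INSERT INTO posting VALUES (?, ?)", (word, node_id))
        for child in elem:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), None)


def search(conn: sqlite3.Connection, word: str) -> list:
    """Return the tags of all elements containing `word`, in document order.

    Direct hits come from `posting`; containing ancestors are derived by
    walking parent pointers with a recursive query.
    """
    rows = conn.execute(
        """
        WITH RECURSIVE hit(id) AS (
            SELECT node_id FROM posting WHERE word = ?
            UNION
            SELECT node.parent_id FROM node JOIN hit ON node.id = hit.id
            WHERE node.parent_id IS NOT NULL
        )
        SELECT node.tag FROM node JOIN hit ON node.id = hit.id
        ORDER BY node.id
        """,
        (word.lower(),),
    ).fetchall()
    return [tag for (tag,) in rows]


if __name__ == "__main__":
    doc = (
        "<library><book><title>XML Storage</title><author>Xu</author></book>"
        "<book><title>Relational Systems</title></book></library>"
    )
    db = sqlite3.connect(":memory:")
    shred(doc, db)
    print(search(db, "storage"))
```

In this sketch a word is stored once per element that directly holds it, whereas an index that listed every containing element would also need one posting for each ancestor. The price is the recursive lookup at query time, which is the kind of space versus query-efficiency trade-off the abstract refers to.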
Keywords/Search Tags:Relational