XML Storage, Query and Reconstruction Based on Relational Databases

Posted on: 2011-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Zhou
Full Text: PDF
GTID: 2208360308966191
Subject: Computer system architecture
Abstract/Summary:
Extensible Markup Language (XML) provides a convenient and effective format for transmitting data on the Internet. It is a self-describing markup language that offers a uniform way to describe data and the logical relationships within it. XML is so widely used that it is regarded as a standard for data representation, integration and exchange on the Internet. The rapid growth in the number of XML documents on the Internet brings a new research challenge: XML data management. Using RDBMSs to manage XML is an active research topic because of the mature technologies RDBMSs offer, such as memory management, query services, concurrency control, data recovery, access control and security. However, XML data is inherently far more complex than flat, two-dimensional relational data, and it is difficult to find a lossless way to store XML documents in an RDBMS.

The purpose of this thesis is to design and implement a general-purpose XML data management system based on RDBMSs, so that it can be applied effectively to e-commerce and other areas. RDBMS-based methods can generally be divided into three phases, and the work in this thesis is organized accordingly.

Schema mapping: a database schema is generated from a DTD. The shared inlining technique is first improved by adding DTD simplification rules. Then a new DTD graph model and a new inlined DTD graph model are defined. The schema mapping algorithm, DTD2RSchema, takes these models as input and outputs the relational schema and the σ-mapping corresponding to the DTD.

Document mapping: an XML document is shredded into relational tuples, which are then inserted into the relational database whose schema was generated in the schema mapping phase.
An XML tree model is first defined to represent an XML document. The document mapping algorithm, SAXDocMap, then uses an efficient top-down tree traversal to encode the nodes of the XML tree and shred them into relational tuples.

Query mapping: an XML query is translated into its relational equivalent, and the XML subtrees rooted at the matching nodes are reconstructed if needed. In the path matching phase, the algorithm PathMatching takes as input the unfolded DTD graph, which handles all cycles in the DTD, and generates simple path expressions when there is recursion both in the XML query and in its underlying DTD. All simple path expressions generated in the path matching phase are then translated into their relational equivalents by the query translation algorithm Convert2SQL. To reconstruct XML subtrees, the algorithm ReconXML rebuilds each subtree from its structure-encoded sequence, a list of XML tree nodes generated by the algorithm SESGen.

All of the algorithms above have been implemented in X2R, a prototype XML storage and query system. The thesis concludes with a variety of experiments run on MySQL. The results show that X2R can store XML documents without any loss and that it has good scalability and efficient query performance.
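
To make the document mapping and query mapping phases more concrete, the following is a minimal sketch of the general idea of shredding XML into relational tuples during a top-down SAX traversal and then answering a simple path query in SQL. It is not the thesis's DTD2RSchema, SAXDocMap or Convert2SQL: it swaps in a much simpler schema-oblivious design with a single hypothetical `node` table and a stored root-to-node path, and it uses SQLite instead of MySQL so that it is self-contained. All class, function, table and column names are assumptions made for illustration.

```python
import sqlite3
import xml.sax


class ShredHandler(xml.sax.ContentHandler):
    """Gives each element a pre-order id and records one tuple per element."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished tuples: (id, parent, tag, path, text)
        self.stack = []    # open elements: [id, parent_id, tag, path, text_parts]
        self.next_id = 1   # ids are assigned in document (pre-order) order

    def startElement(self, name, attrs):
        parent_id = self.stack[-1][0] if self.stack else None
        parent_path = self.stack[-1][3] if self.stack else ""
        self.stack.append(
            [self.next_id, parent_id, name, parent_path + "/" + name, []]
        )
        self.next_id += 1

    def characters(self, content):
        # Character data belongs to the innermost open element.
        if self.stack:
            self.stack[-1][4].append(content)

    def endElement(self, name):
        node_id, parent_id, tag, path, parts = self.stack.pop()
        self.rows.append((node_id, parent_id, tag, path, "".join(parts).strip()))


def shred(xml_text, conn):
    """Parse xml_text and insert one row per element into the node table."""
    handler = ShredHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, "
        "tag TEXT, path TEXT, text TEXT)"
    )
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?, ?)", handler.rows)


def path_to_sql(path):
    """Translate a simple child-axis path such as /a/b/c into SQL plus parameters."""
    return "SELECT id, text FROM node WHERE path = ? ORDER BY id", (path,)


DOC = """<library>
  <book><title>XML Basics</title><year>2009</year></book>
  <book><title>SQL Notes</title><year>2010</year></book>
</library>"""

conn = sqlite3.connect(":memory:")
shred(DOC, conn)
sql, params = path_to_sql("/library/book/title")
titles = [text for _id, text in conn.execute(sql, params)]
print(titles)  # ['XML Basics', 'SQL Notes']
```

Storing the full path with every node keeps the query translation trivial for non-recursive child-axis paths, at the cost of redundancy. A schema-aware mapping such as the shared inlining approach described in the abstract would instead derive several tables from the DTD and translate paths into joins over them.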
Keywords/Search Tags: XML, RDBMS, mapping, storage, query