The Study Of XML Document Storage Technique In Relation Database

Posted on: 2004-06-28
Degree: Master
Type: Thesis
Country: China
Candidate: H Yue
Full Text: PDF
GTID: 2168360095956634
Subject: Computer system architecture
Abstract/Summary:
XML (eXtensible Markup Language) is a new markup language for web applications, published by the W3C as a general-purpose language standard in February 1998. XML is in fact a metalanguage: users can define their own tags to describe the data in an XML document, which makes it more flexible than HTML. The large volume of XML data on the Web raises many problems, one of which is how to store and manage that data efficiently. Storing XML documents in a relational database lets us use the database management system to control the data effectively, so relational storage has become a hot topic in research on XML storage strategies. However, storing an XML document in a relational database requires splitting the semi-structured XML into the rows and columns of relations. This is time-consuming and breaks the original structure of the document, which reduces the speed of data processing.

In this thesis, we focus on storing XML documents in relational databases. We introduce the related XML technologies and compare the existing storage methods for XML documents. We analyse the basic semantic information of XML Schema, propose the concept of B_Schema (Basic XML Schema), and give a storage method for XML documents based on it. A B_Schema is an equivalent expression of an XML Schema that can be mapped directly into a relational schema, so the original XML Schema must first be converted into its equivalent B_Schema. We use a DOM tree to describe the B_Schema, and we introduce classified nodes, which can be mapped directly into relations. A cost model combines statistics of the B_Schema to estimate the storage cost. We propose a set of rewriting rules for B_Schemas; these rules focus on two aspects, inline rewriting and choice rewriting. We introduce a search algorithm that finds the minimum-cost B_Schema in the set of equivalent B_Schemas. The search algorithm applies the rewriting rules to obtain a set of equivalent B_Schemas, uses the cost model to estimate the cost of each, and iteratively searches for the minimum-cost B_Schema. To speed up the search, we introduce a cost-optimization parameter as the termination condition.
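The cost-driven search described in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm, whose details are not given here. It is a plain greedy descent under loud assumptions: a schema is reduced to the set of elements stored in their own table (every other element is inlined into its parent), the only rewriting rule modeled is an inline/outline toggle (choice rewriting is omitted), and the statistics, the cost formula, and the names `STATS`, `TABLE_OVERHEAD`, `cost`, `rewrites`, and `search` are all hypothetical. The `min_gain` parameter plays the role of the cost-optimization parameter used as the termination condition.

```python
from typing import FrozenSet, Iterator, Tuple

# A toy "schema": the set of element names stored in their own table.
# Every element not in the set is inlined as a column of its parent table.
Schema = FrozenSet[str]

# Hypothetical statistics: for each element, the number of parent rows
# that lack it (each such row would hold a NULL if the element is inlined).
STATS = {"author": 100, "review": 900, "isbn": 0}

# Hypothetical fixed cost of keeping one extra table (e.g. join overhead).
TABLE_OVERHEAD = 150


def cost(schema: Schema) -> int:
    """Assumed cost model: table overhead if outlined, NULL waste if inlined."""
    return sum(
        TABLE_OVERHEAD if name in schema else missing
        for name, missing in STATS.items()
    )


def rewrites(schema: Schema) -> Iterator[Schema]:
    """Inline rewriting only: toggle one element between inlined and outlined."""
    for name in STATS:
        yield schema ^ frozenset({name})


def search(start: Schema, min_gain: int = 1) -> Tuple[Schema, int]:
    """Greedy descent; stops when the best rewrite gains less than min_gain.

    min_gain must be positive so that the cost strictly decreases each step.
    """
    current, current_cost = start, cost(start)
    while True:
        best = min(rewrites(current), key=cost)
        best_cost = cost(best)
        if current_cost - best_cost < min_gain:
            return current, current_cost
        current, current_cost = best, best_cost


if __name__ == "__main__":
    everything_outlined = frozenset(STATS)
    print(search(everything_outlined))                # runs to a local minimum
    print(search(everything_outlined, min_gain=100))  # stops earlier
```

Starting from the schema in which every element has its own table, the default call ends with only `review` outlined at a cost of 250, while `min_gain=100` stops one step earlier at a cost of 300. That is the trade-off a termination parameter makes: fewer search steps in exchange for a possibly more expensive schema.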
Keywords/Search Tags: XML, B_Schema, Cost Model, Search algorithm