Research On High Efficiency Update-supporting XML Data Encoding Methods

Posted on: 2012-10-07, Degree: Master, Type: Thesis
Country: China, Candidate: J Huang, Full Text: PDF
GTID: 2178330332467391, Subject: Computer application technology
Abstract/Summary:
With the development of computer and network technology, XML has been extended continuously and has become the de facto standard for data exchange and a foundation of service-oriented architecture (SOA). As a special kind of semi-structured data, XML differs from data in the relational model. An XML database is a collection of XML data that is persistent and operable. How to store and query XML data effectively is a central concern of an XML database and directly affects the efficiency of the system. Because of their wide applicability, storage and query techniques based on special encoding methods have become one of the hot topics in data research.
Interval-based (region) encoding and path-based (prefix) encoding are the two main types of encoding for the XML document tree. The XML document tree must be updated dynamically according to application requirements, and current XML data encoding methods do not support this well. This paper investigates that problem.
After comparing several popular encoding methods, this paper proposes a new, efficient XML data encoding method that supports dynamic updating. The method can determine the relationship between any two nodes and compute the level of a node in the XML document tree quickly and accurately, and it keeps the reencoding rate at zero when nodes are inserted into the XML document tree.
The main work is as follows:
(1) By comparing several popular encoding methods, this paper points out their deficiencies and limitations in supporting dynamic updates of the XML document tree.
(2) This paper proposes a new, efficient encoding method that supports dynamic updating. The new method extends and adapts the prefix encoding method and the parity encoding method. A node's code consists of three parts: a prefix-code, a parity-code and an order-code. A node's prefix-code is its parent's code with the "." characters removed, so nodes that share a parent have the same prefix-code. The prefix-code plays an important role in determining parent/child, sibling and ancestor/descendant relationships between nodes, and it retains the advantages of prefix encoding. The parity-code has the initial value "1". When nodes are inserted into the XML document tree, the encoding rule for the parity-code distinguishes two cases, high-frequency updating and large-volume updating; the interval factor a is introduced to tell them apart and to support efficient dynamic updating of XML documents. The order-code is composed of letters. In the static encoding algorithm, the order-code indicates the order of sibling nodes in the XML document. In the dynamic encoding algorithm, it also distinguishes nodes that have the same prefix-code and parity-code.
(3) The definitions and algorithms of the proposed encoding scheme are explained in detail, and the characteristics of the encoding method are discussed thoroughly. An encoding update algorithm is proposed, and the code updates that follow node insertion are illustrated with examples.
(4) Building on the proposed encoding scheme, a more efficient XML data storage strategy is proposed, and the XML data query process for this storage mode is explained in terms of the proposed encoding method.
(5) Through experiments, the proposed encoding method is compared with existing encoding methods in four aspects: time performance, space performance, reencoding rate and query performance.
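The prefix-code rule described in item (2) can be illustrated with a minimal Python sketch. It is not the thesis's algorithm. The abstract does not state how the three parts are written together, so the "."-joined layout, the empty prefix for the root, and the names `NodeCode`, `is_parent` and `is_sibling` are all assumptions made for illustration. The parity-code update rule, the interval factor a, and the level computation are not specified in the abstract and are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCode:
    """Hypothetical three-part node code; the layout is an assumption."""
    prefix: str  # parent's full code with "." removed ("" assumed for the root)
    parity: str  # parity-code; the abstract gives "1" as its initial value
    order: str   # order-code, composed of letters

    def text(self) -> str:
        # Assumed written form: the non-empty parts joined by "."
        return ".".join(p for p in (self.prefix, self.parity, self.order) if p)

    def child(self, parity: str, order: str) -> "NodeCode":
        # Rule from the abstract: prefix-code = parent's code with "." removed
        return NodeCode(self.text().replace(".", ""), parity, order)


def is_parent(p: NodeCode, c: NodeCode) -> bool:
    # c is a child of p when c's prefix-code equals p's code without "."
    return c.prefix == p.text().replace(".", "")


def is_sibling(a: NodeCode, b: NodeCode) -> bool:
    # Distinct nodes with the same parent share the same prefix-code
    return a != b and a.prefix == b.prefix


root = NodeCode("", "1", "a")
first = root.child("1", "a")
second = root.child("1", "b")
grandchild = first.child("1", "a")

print(root.text(), first.text(), second.text(), grandchild.text())
# prints: 1.a 1a.1.a 1a.1.b 1a1a.1.a
print(is_parent(root, first), is_sibling(first, second), is_parent(root, grandchild))
# prints: True True False
```

Both checks are plain string comparisons on the prefix-code, which is consistent with the abstract's claim that node relationships can be determined from the codes alone.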
Keywords/Search Tags: XML storage, XML query, XML encoding, dynamic updating, reencoding rate