Research On Queryable XML Data Compression

Posted on: 2015-05-02
Degree: Master
Type: Thesis
Country: China
Candidate: C C Xu
GTID: 2308330503475334
Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet technology, more and more people exchange and obtain information over the Internet, and XML has become the de facto standard for data exchange. XML is self-describing, semi-structured, extensible, and easy to exchange. However, XML documents contain a great deal of redundant information, which increases the cost of storing, querying, and transmitting data, so compressing XML documents has become an important problem. Compression shortens the time needed to transfer XML data and reduces the storage space required.

Compression is not the only goal, however: the research target of this thesis is querying the compressed document. If a query can be executed only after the compressed document has been fully decompressed, the cost is high; querying the compressed XML document directly reduces the burden on the system. Many existing XML compression methods support direct queries on the compressed document, but they trade compression performance against query capability. We aim for a win-win: a method that improves both compression and query performance. To this end, we propose two methods for querying the compressed document directly.

XML documents contain many duplicate paths. The first method merges subtrees of the XML document that stand in a containment relationship. This eliminates many duplicate paths, so the compressed document occupies less storage and a certain degree of data compression is achieved. The method scans the XML document only once to complete the compression. Experiments show that it achieves good compression performance and query capability.

The second method is designed to better support complex twig queries. It encodes the nodes of the XML tree with a prefix encoding, from which the relationship between nodes can be determined. The method does not need to store many labels. It uses the paths in the content data table to build an index, similar to an XML schema tree, that accelerates queries. This method is well suited to complex queries on the compressed document.
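The abstract does not specify which prefix encoding the second method uses. As a minimal sketch, assuming a Dewey-style scheme in which each node's code is its parent's code extended by the node's position among its siblings, the relationship between two nodes can be decided from their codes alone. The function names and the sample document below are hypothetical and are not taken from the thesis.

```python
import xml.etree.ElementTree as ET


def prefix_codes(root):
    """Map a Dewey-style code (tuple of sibling positions) to each element's tag.

    Assumed scheme: the root gets (1,), and the i-th child of a node with
    code c gets c + (i,). This is an illustration, not the thesis's encoding.
    """
    codes = {}

    def visit(node, code):
        codes[code] = node.tag
        for position, child in enumerate(node, start=1):
            visit(child, code + (position,))

    visit(root, (1,))
    return codes


def is_ancestor(a, b):
    """True if the node coded a is a proper ancestor of the node coded b."""
    return len(a) < len(b) and b[:len(a)] == a


def is_parent(a, b):
    """True if the node coded a is the parent of the node coded b."""
    return len(b) == len(a) + 1 and b[:len(a)] == a


# Hypothetical sample document.
doc = "<lib><book><title/><author/></book><book><title/></book></lib>"
codes = prefix_codes(ET.fromstring(doc))
```

With codes of this kind, the ancestor-descendant and parent-child tests that a twig query needs reduce to prefix comparisons, without walking the tree.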
Keywords/Search Tags: XML data, XML data compression, query, redundant structure, Twig Query