
Research On Structural Join Oriented XML Data Compression

Posted on: 2014-03-28
Degree: Master
Type: Thesis
Country: China
Candidate: X L Wei
Full Text: PDF
GTID: 2298330452962718
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Internet technology, people can readily exchange information with each other over the Internet. XML has become an important standard for information representation and exchange, so more and more Web data are expressed in XML format. XML is widely used because of its scalability and cross-platform nature. However, one of its biggest flaws is its high redundancy, especially when a document contains a large amount of duplicated structural information. This redundancy wastes bandwidth and disk space and reduces query efficiency, which makes the compression of XML data an important task. Considering only how to compress XML data as much as possible does not completely solve the problem, however: how to query the compressed data quickly and efficiently is also an important issue. In addition, structural join is the core operation of XML query processing, and complex queries such as twig queries and uncertain queries are built on it. A compression method that supports efficient join operations is therefore of research significance.

We carried out in-depth research on XML data compression and query techniques. Because existing XML data compression methods cannot perform structural join operations on the compressed data, we propose a structural join oriented XML data compression method named SJXC. The method first encodes each node of the XML document with a region encoding, which makes structural joins possible. It then compresses the structural data with dictionary encoding and merges identical subtrees to compress the XML document tree structure effectively. To improve the compression ratio further, it compresses the code values that are stored together as a result of merging identical subtrees. Finally, we propose a storage structure for the compressed data and study a method of query processing on this storage model.
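The role of region encoding in structural joins can be illustrated with a minimal sketch. It assumes a (start, end, level) region code assigned by a depth-first walk, and a stack-based merge join in the style of well-known structural join algorithms. The names `Region`, `region_encode`, and `structural_join` are hypothetical, and the abstract does not state which encoding variant or join algorithm SJXC actually uses; dictionary encoding and subtree merging are not shown.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple


class Region(NamedTuple):
    """Hypothetical region code: node a contains node d iff
    a.start < d.start and d.end < a.end."""
    tag: str
    start: int
    end: int
    level: int


def region_encode(root):
    """Assign a region code to every element, in document order."""
    regions = []
    counter = 0

    def visit(elem, level):
        nonlocal counter
        counter += 1
        start = counter
        slot = len(regions)
        regions.append(None)  # reserve the slot to keep document order
        for child in elem:
            visit(child, level + 1)
        counter += 1
        regions[slot] = Region(elem.tag, start, counter, level)

    visit(root, 0)
    return regions


def structural_join(ancestors, descendants):
    """Stack-based ancestor-descendant join.

    Both inputs must be sorted by start. Returns (ancestor, descendant)
    pairs grouped by descendant.
    """
    stack, out = [], []
    i = j = 0
    while j < len(descendants):
        d = descendants[j]
        if i < len(ancestors) and ancestors[i].start < d.start:
            a = ancestors[i]
            # Drop stacked ancestors that end before this one begins.
            while stack and stack[-1].end < a.start:
                stack.pop()
            stack.append(a)
            i += 1
        else:
            # Drop stacked ancestors that end before d begins.
            while stack and stack[-1].end < d.start:
                stack.pop()
            # Every node still on the stack contains d.
            out.extend((a, d) for a in stack)
            j += 1
    return out


doc = ET.fromstring("<a><b><c/></b><c/><b/></a>")
codes = region_encode(doc)
b_nodes = [r for r in codes if r.tag == "b"]
c_nodes = [r for r in codes if r.tag == "c"]
pairs = structural_join(b_nodes, c_nodes)
```

In this sketch the join only compares integer codes and never touches the document tree, which is why a compressed representation that preserves region codes can still answer ancestor-descendant queries.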
This thesis presents the architecture of the structural join oriented XML data compression method, its critical processes, and the related algorithms.

To verify the effectiveness of SJXC, we carried out a large number of experiments on typical datasets such as XMark, DBLP, and Shakespeare. We also analyze the compression ratio, the compression/decompression time, and the query performance of different methods. The results show that SJXC achieves better compression and query performance.
Keywords/Search Tags: XML, data compression, structural join, same subtree