Research And Implementation On XML Compression Technology

Posted on: 2007-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y K Wei
GTID: 2178360242961947
Subject: Computer application technology
Abstract/Summary:
Despite disputes over some of its technical standards, XML is used in many areas of the Internet. For instance, it can be used to exchange data, supersede conventional Electronic Data Interchange, integrate various kinds of data sources, and present data in diverse formats. Because it is self-describing and platform-independent, XML is well suited to data exchange and to the integration of data sources in diverse formats. Nevertheless, the irregularity and redundancy of data in XML format waste disk space and network bandwidth. As a result, compressing XML data is essential for storing and exchanging XML documents.
Building on conventional data compression algorithms and taking the structural features of XML documents into account, we propose an effective XML compressor for data exchange and data storage, named XCfde. To obtain a relatively high compression ratio, XCfde adopts four schemes. First, XCfde separates an XML document into structure information and content information with an XML parser, by overriding the SAX interface, and encodes the structure information with dictionary encoding. Second, XCfde automatically recognizes data types such as integers and floating-point numbers with a data type classification engine, and assigns the content information to different data containers according to data type and tag path. Third, it applies a preliminary encoding to each data container, using a type-specific strategy derived from classical encoding methods. Finally, it compresses the preliminarily encoded structure and content data together with 7-Zip.
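The separation and container-assignment steps described above can be sketched roughly as follows. This is a minimal illustration and not XCfde's actual implementation: the class name `Separator`, the helper functions `classify` and `compress`, the type rules, and the byte layout are all assumptions made for the example. Attributes are ignored, the type-specific preliminary encoding step is omitted, and Python's `lzma` module stands in for 7-Zip as the back-end compressor. The output is not meant to be a reversible format.

```python
import lzma
import xml.sax
from collections import defaultdict


def classify(text):
    """Guess a coarse data type for a text value (hypothetical rules)."""
    try:
        int(text)
        return "int"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        return "text"


class Separator(xml.sax.ContentHandler):
    """Splits a document into a structure stream and typed content containers."""

    def __init__(self):
        super().__init__()
        self.tag_ids = {}                    # dictionary: tag name -> integer code
        self.structure = []                  # tag codes; 0 marks an end tag
        self.containers = defaultdict(list)  # (tag path, type) -> list of values
        self.path = []                       # stack of currently open tags
        self.buffer = []                     # pending character data

    def _flush(self):
        # Move buffered text into the container for the current path and type.
        text = "".join(self.buffer).strip()
        self.buffer = []
        if text:
            key = ("/".join(self.path), classify(text))
            self.containers[key].append(text)

    def startElement(self, name, attrs):
        # Attributes are ignored in this sketch.
        self._flush()
        code = self.tag_ids.setdefault(name, len(self.tag_ids) + 1)
        self.structure.append(code)
        self.path.append(name)

    def endElement(self, name):
        self._flush()
        self.structure.append(0)
        self.path.pop()

    def characters(self, content):
        self.buffer.append(content)


def compress(xml_bytes):
    """Separate structure from content, then compress both streams together."""
    handler = Separator()
    xml.sax.parseString(xml_bytes, handler)
    parts = [" ".join(map(str, handler.structure))]
    for (path, kind), values in sorted(handler.containers.items()):
        parts.append(path + "|" + kind + "|" + "\x00".join(values))
    blob = "\n".join(parts).encode("utf-8")
    return handler, lzma.compress(blob)


doc = (b"<orders><order><id>1</id><price>9.5</price><note>rush</note></order>"
       b"<order><id>2</id><price>12.0</price><note>gift</note></order></orders>")
handler, packed = compress(doc)
for key, values in sorted(handler.containers.items()):
    print(key, values)
```

Grouping values by tag path and type places similar data side by side, which is what gives the back-end compressor more redundancy to exploit. On a document as small as the one above the container overhead dominates, so any size benefit would only show on larger inputs.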
This paper describes the key techniques behind XCfde, including the data type classification engine and an improved compression algorithm for Chinese text data. Two experiments are carried out: a compression ratio experiment over several typical XML documents, and a transmission performance evaluation of XCfde in data exchange. The first experiment demonstrates that XCfde achieves a clearly better compression ratio than both general-purpose compression tools and several other XML-oriented compression tools. The second demonstrates that XCfde speeds up data exchange in XML format and improves the utilization of network bandwidth and disk space.
Keywords/Search Tags:Data Exchange, Data Compression, eXtensible Markup Language Compression, Sliding Window Compression