Research on Data Compression for Information Resource Building of Digital Library

Posted on: 2005-09-16 | Degree: Master | Type: Thesis
Country: China | Candidate: N N Cui | Full Text: PDF
GTID: 2168360122986563 | Subject: Mining engineering
Abstract/Summary:
The Digital Library is the main information management mode for the next-generation Internet, and data compression plays an increasingly important role in building its information resources. We study four topics in data compression technology for the Digital Library.

First, we analyze lossless compression techniques. Text is a very common resource in a digital library, and lossless techniques play an important role in compressing it. Starting from Shannon's entropy theory, we analyze lossless compression algorithms and implement an arithmetic coding algorithm in C. In the experiments, we compare four lossless compression algorithms on compression ratio, the trend of compression ratio with data length, stability, and complexity, using 35 groups of data series of 4 different lengths.

Second, we study context modeling in lossless image compression. Context modeling is a key point in compression algorithm design. In digital library building, lossless compression is mainly used for digitized classical books, paintings, and calligraphy. We analyze and compare several frequently adopted context models in lossless image compression algorithms.

Third, an image compression framework based on the integer wavelet transform is proposed and implemented. The integer wavelet is widely applied because it overcomes shortcomings of the conventional discrete wavelet transform, such as unsuitability for high-speed processing and the complexity of hardware implementation. We first analyze the lifting-based integer wavelet and propose a selection criterion. We then introduce EZW and SPIHT. Finally, we propose and implement a novel SPIHT coder based on the integer wavelet, and give experimental results.

Fourth, we study document image compression and propose a compression scheme based on region segmentation. The method takes advantage of the layout properties of the document image: after segmentation, the image is divided into text regions and non-text regions. Text regions are compressed with a pattern matching and substitution method, and non-text regions with wavelet-based methods. Experimental results show that, at the same compression ratio, document images recovered with this method have higher visual quality than those compressed with standard JPEG.
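The lossless analysis summarized above starts from Shannon's entropy theory. As a rough illustration only (this is not the thesis code, which was written in C, and the function names `shannon_entropy` and `entropy_bound_bytes` are hypothetical), the following Python sketch computes the zeroth-order entropy of a byte string. Under the assumption that symbols are independent and identically distributed, this gives the size floor that an order-0 coder such as a simple arithmetic coder can approach.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Zeroth-order Shannon entropy of `data`, in bits per symbol (byte)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)  # frequency of each byte value
    # H = -sum(p * log2(p)) over the symbols that actually occur
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_bound_bytes(data: bytes) -> float:
    """Order-0 size floor in bytes, assuming independent symbols (an assumption)."""
    return shannon_entropy(data) * len(data) / 8


if __name__ == "__main__":
    sample = b"abracadabra"
    print(f"entropy: {shannon_entropy(sample):.3f} bits/symbol")
    print(f"order-0 bound: {entropy_bound_bytes(sample):.2f} bytes of {len(sample)}")
```

Coders that model context, as in the second topic of the abstract, can go below this order-0 figure, because they exploit dependencies between neighboring symbols that the independence assumption ignores.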
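The abstract does not say which lifting wavelet the thesis selects, so the sketch below uses the simplest one, the integer Haar (S) transform, purely as a stand-in. It illustrates why lifting-based integer wavelets suit lossless coding: each lifting step uses only integer additions and shifts, and the inverse undoes the steps in reverse order, so reconstruction is exact. The function names are hypothetical, and the code handles a single one-dimensional level only.

```python
def forward_s_transform(x):
    """One level of the integer Haar (S) transform via lifting.

    Returns (approx, detail) as lists of integers. `x` must have even length.
    """
    if len(x) % 2:
        raise ValueError("input length must be even")
    even, odd = x[0::2], x[1::2]
    # Predict step: the detail is the difference between the odd sample and its even neighbor.
    detail = [o - e for e, o in zip(even, odd)]
    # Update step: approx = even + floor(detail / 2), i.e. the floored pair average.
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    return approx, detail


def inverse_s_transform(approx, detail):
    """Exactly invert forward_s_transform by undoing the lifting steps in reverse."""
    even = [s - (d >> 1) for s, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    x = [0] * (2 * len(even))
    x[0::2] = even
    x[1::2] = odd
    return x


if __name__ == "__main__":
    row = [10, 12, 200, 3, 7, 7, 0, 255]
    approx, detail = forward_s_transform(row)
    print(approx, detail)
    print(inverse_s_transform(approx, detail) == row)
```

A coder such as SPIHT would then operate on the integer coefficients produced by applying a transform like this repeatedly to the approximation band, along rows and columns.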
Keywords/Search Tags: Digital Library, Data Compression, Lossless Compression, Context Model, Wavelet Transform