An Adaptive Compression Algorithm Supporting Query

Posted on:2021-04-01

Degree:Master

Type:Thesis

Country:China

Candidate:C J Zhao

Full Text:PDF

GTID:2428330623469208

Subject:Computer technology

Abstract/Summary:

PDF Full Text Request

With the development of computer technology,the amount of data is exploding,data compression as a technique to reduce the data storage space,can effectively ease the pressure of storage and transmission,has been widely used in various systems.Simultaneously,the mining of valid information on massive data is a current research hotspot,and data mining and analysis usually require operations based on random access to the data and data queries.However,existing traditional data compression algorithms are usually designed only to eliminate data redundancy and do not support random access,and other retrieval operations on compressed data and efficient index structures are also usually designed only for raw data and cannot be applied to the compressed data.Therefore,this thesis proposes an adaptive lossless compression framework supporting data queries: segmenting the data according to pre-determined delimiters,estimating the probability distribution of the data using the context model,compressing the data by arithmetic coding,and finally storing the encoding results in chunks.Among them,the adaptive,contextual model improves the compression by training on cold data and updating the model periodically to obtain an accurate probability distribution.In terms of storage strategy,this thesis chunked the encoded data based on the arithmetic encoding feature to support sequential queries without decompression,with query types including exact,prefix,and range queries.And,for more efficient querying,the algorithm in this thesis provides a convenient interface to support the integration of mainstream index structures,and the query effect verification is performed using a b+ tree as an example.The experimental results show that the compression effect of the algorithm in this thesis reaches the level of mainstream compression algorithms,and the sequential query speed is 24.8 times that of LZO,the fastest algorithm in comparison algorithms.At the same time,this thesis implements the integration of b + tree index.On the compressed data,the index query speed is 172.5 times that of the sequential query.

Keywords/Search Tags:

lossless compression, arithmetic encoding, random access

PDF Full Text Request

Related items

1	Research And Applocation Of The Image Lossless Compression Technology
2	Research Of Lossless Audio Encoding Algorithm
3	The Research Of Lossless Compression Arithmetic Based On Integer Wavelet Transform's Still Image
4	Lossless Image Coding Technology
5	Design And Implementation Of On-Board Image Compression System With Interframe Lossless And Near-Lossless Method
6	Lossless Data Compression,Algorithms Comparisons And Implementation.
7	PCIe Data Compression Accelerator Card Based On FPGA
8	Research Of Audio File Compression Methods Based On Arithmetic Coding
9	FPGA Implementation Of A Lossless Image Compression Algorithm JPEG-LS
10	Design Of A Lossless JPEG Compression Circuit