Time Series Data Parallel Compression Algorithm On GPU

Posted on: 2022-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: C Cheng
GTID: 2518306575472284
Subject: Computer technology
Abstract/Summary:
With the rapid development of internet services and the industrial Internet of Things, the scale of time-series data is constantly expanding, which puts tremendous pressure on the storage and management of time-series data in databases. In real production environments, disk throughput has gradually become the overall performance bottleneck of traditional databases. To preserve database performance, caching recent data in memory on top of persistent disk storage has recently become a mainstream solution. However, as data volume keeps growing, memory overhead keeps increasing and the cluster must be expanded frequently, which raises the cost of running and maintaining the system. Data compression is a common technique for reducing storage overhead and can achieve a good compression ratio or throughput, but time-series databases typically restrict the compression process in order to prioritize read and write performance, and therefore cannot support real-time compression and decompression of time-series data. To address these shortcomings, this paper designs a General Framework for Parallel Compression (GFPC) based on CUDA, which uses the GPU to compress time-series data in parallel and accelerates both compression and decompression through data parallelism. GFPC is highly decoupled from the compression algorithm and can accommodate a variety of algorithms. In addition, based on an analysis of the data characteristics of the various types of numerical sequences found in time-series data, we present several lossless compression algorithms for time-series data built on the GFPC framework. Compared with traditional compression schemes, our algorithms significantly improve both compression ratio and throughput. Experimental results show that 1) our scheme can compress the test dataset to as little as 1/30 of its original size, 2) our scheme improves the compression ratio by up to 50% and achieves 3.5 times the throughput of Gorilla, and 3) GFPC supports compression or decompression throughput on the order of GB/s on a personal computer.
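The data-parallel idea described in the abstract can be illustrated with a minimal sketch. This is not the GFPC implementation: the function names, the fixed block size, and the choice of plain delta encoding as the pluggable codec are all assumptions made for illustration, and the code runs sequentially on the CPU. The point it shows is that once a series is split into independent blocks, each block can be encoded without reference to any other, which is the property a GPU framework could exploit by assigning blocks to separate thread blocks.

```python
from typing import Callable, List, Sequence

# Hypothetical codec signature: any block-local encoder/decoder can be plugged in.
Codec = Callable[[Sequence[int]], List[int]]


def delta_encode(block: Sequence[int]) -> List[int]:
    """Keep the first value, then store successive differences."""
    if not block:
        return []
    out = [block[0]]
    for prev, cur in zip(block, block[1:]):
        out.append(cur - prev)
    return out


def delta_decode(encoded: Sequence[int]) -> List[int]:
    """Invert delta_encode with a running sum."""
    out: List[int] = []
    total = 0
    for i, d in enumerate(encoded):
        total = d if i == 0 else total + d
        out.append(total)
    return out


def split_blocks(values: Sequence[int], block_size: int) -> List[List[int]]:
    """Cut the series into fixed-size blocks (the last one may be shorter)."""
    return [list(values[i:i + block_size])
            for i in range(0, len(values), block_size)]


def compress_blocks(values: Sequence[int], block_size: int,
                    encode: Codec) -> List[List[int]]:
    # Blocks share no state, so this loop is the part that could run
    # in parallel (assumed here: one block per GPU thread block).
    return [encode(block) for block in split_blocks(values, block_size)]


def decompress_blocks(blocks: Sequence[Sequence[int]],
                      decode: Codec) -> List[int]:
    """Decode each block independently and concatenate the results."""
    out: List[int] = []
    for block in blocks:
        out.extend(decode(block))
    return out
```

For example, calling `compress_blocks(timestamps, 3, delta_encode)` on a list of integer timestamps yields one small delta list per block, and `decompress_blocks(..., delta_decode)` restores the original series. Passing the codec as a parameter mirrors, in a loose way, the decoupling between framework and algorithm that the abstract describes.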
Keywords/Search Tags:Time series, Data compression, GPU, CUDA