Research On Compress And Storage Method Of Mass Time Series Data

Posted on: 2017-10-05
Degree: Master
Type: Thesis
Country: China
Candidate: T C Zou
GTID: 2348330503487054
Subject: Computer technology
Abstract/Summary:
Against the backdrop of big data, time series data is generated in every area of production and daily life. Analyzing the patterns and value in this data helps reveal trends and underlying regularities, which in turn supports scientific decision-making. As the volume of time series data grows rapidly, the efficiency and reliability of storage and retrieval have become a prominent problem. Current compression storage solutions typically rely on an extra index to speed up queries, and they still have shortcomings such as low storage speed and low utilization of computing resources. Research on efficient storage of massive time series data is therefore of considerable importance.

Time series data generally carries a timestamp. Its volume is enormous, it is written far more often than it is read, it is seldom modified, and values whose timestamps are close to each other differ only slightly. Queries are usually restricted to a time range, and they often ask for the average, maximum, or minimum value over that range. Based on these characteristics, this dissertation designs a compression algorithm and a storage structure.

For compression, this dissertation proposes TODACs, a new algorithm with a high compression ratio that is based on the DACs algorithm. TODACs takes decompression efficiency into account and can dynamically optimize the compression effect. It extends the data types that DACs can store to include negative numbers and floating-point numbers. It also adds an improved gap coding step, which raises the compression ratio by exploiting the fact that values with nearby timestamps differ only slightly.

For storage, following the query characteristics of time series data, this dissertation stores data in time-slice blocks, within which the data is sorted by primary key. A heap-table organization is used to handle the very large data volume, and column storage is used to improve disk input/output performance.

Experiments on compression ratio, compression speed, and decompression speed were designed to verify the high compression ratio of TODACs. Experimental analysis confirms that TODACs reduces the total time for storing and fetching data and lowers data-fetch latency.
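The abstract does not give the details of TODACs, so the following is only an illustrative sketch of the general ideas it names, not the thesis algorithm. It assumes that floating-point readings are turned into integers by a fixed decimal scale, that gaps between neighboring values are made non-negative with zigzag mapping, and that the results are stored in a simplified DACs-style layout of 4-bit chunks with continuation bitmaps. All function names and parameters (`compress`, `decompress`, `scale`, `b`) are hypothetical.

```python
def zigzag(n):
    """Map a signed integer to a non-negative one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return (n << 1) if n >= 0 else ((-n) << 1) - 1


def unzigzag(z):
    """Inverse of zigzag."""
    return (z >> 1) if z & 1 == 0 else -((z + 1) >> 1)


def dac_encode(values, b=4):
    """Split non-negative integers into b-bit chunks, one list of chunks per level.

    Each level also keeps a bitmap: 1 means the value continues on the next level.
    """
    mask = (1 << b) - 1
    levels = []
    current = list(values)
    while current:
        chunks = [v & mask for v in current]
        cont = [1 if (v >> b) else 0 for v in current]
        levels.append((chunks, cont))
        current = [v >> b for v in current if v >> b]
    return levels


def dac_access(levels, i, b=4):
    """Read the i-th value directly, without decoding the values before it."""
    value, shift = 0, 0
    for chunks, cont in levels:
        value |= chunks[i] << shift
        if not cont[i]:
            break
        # rank1(cont, i): position of this value on the next level.
        # A real implementation would use a constant-time rank structure.
        i = sum(cont[:i])
        shift += b
    return value


def compress(readings, scale=100, b=4):
    """Fixed-point scaling (assumed), gap coding, zigzag, then DACs-style chunks."""
    ints = [round(r * scale) for r in readings]
    gaps = [ints[0]] + [cur - prev for prev, cur in zip(ints, ints[1:])]
    return dac_encode([zigzag(g) for g in gaps], b)


def decompress(levels, scale=100, b=4):
    """Rebuild the readings by undoing each step of compress in reverse order."""
    if not levels:
        return []
    count = len(levels[0][0])
    out, total = [], 0
    for i in range(count):
        total += unzigzag(dac_access(levels, i, b))
        out.append(total / scale)
    return out
```

Because neighboring readings differ only slightly, most gaps fit in a single 4-bit chunk and only the first value and occasional jumps spill into higher levels. Note that gap coding trades away some of the direct access that plain DACs provides: `dac_access` returns a single gap, and recovering the original value still requires a running sum from the start of the block.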
Keywords/Search Tags:time series data, DACs, compression, storage structure