
Multi-version Compression Upon Array Data Model

Posted on: 2020-09-05  Degree: Master  Type: Thesis
Country: China  Candidate: Y Zhang  Full Text: PDF
GTID: 2428330596468181  Subject: Software engineering
Abstract/Summary:
With the continuous development of technology, data collection, monitoring, and processing have advanced rapidly. In some scientific disciplines such as astronomy, data acquired at regular intervals usually remain relatively stable and do not change dramatically between acquisitions. If every snapshot is stored in full, a large amount of storage is wasted. Compression technology can greatly improve storage performance. Over the past few decades, multiple data models have been developed for different demands, including the relational data model, the semi-structured data model, the data stream model, and so on. Among them, the array data model suits scenarios where data is multi-dimensional and strongly structured and where complex aggregation queries must be supported. Compressed storage management and efficient query support are therefore important for multi-version data series.

This paper studies multi-version compression technology and query processing upon the array data model. Multi-version compressed storage trades off storage against recreation: a storage strategy determines which versions in a multi-version data series are materialized and which are stored as deltas. More storage leads to less recreation, and vice versa. Three metrics characterize a multi-version compression strategy: total storage, total recreation, and the maximum recreation. Existing studies focus only on the trade-off between total storage and either total recreation or the maximum recreation, so the storage strategies they produce do not perform well on the remaining metric. This paper studies the trade-off between storage and recreation considering all three metrics. As for queries, structural aggregation is important and common in the array data model and includes grid aggregation, sliding aggregation, hierarchical aggregation, and circular aggregation. However, structural aggregation has been studied only upon a single array, not upon multiple arrays under multi-version compression.

The main contributions of this paper are summarized as follows:
· 1 Study the trade-off between storage and recreation in multi-version compression, considering total storage, total recreation, and the maximum recreation at the same time. Design two heuristic algorithms to obtain multi-version storage strategies.
· 2 Formally define two circular aggregation queries upon multiple arrays under multi-version storage strategies, then combine the version recreation link, equi-length circular aggregation, array partialization, and buffer structure technology to process these queries efficiently.
· 3 Conduct a series of experiments using real and synthetic data to test our proposals. For storage, the experiments evaluate the multi-version compression storage strategies on the three metrics. For queries, our proposals are considerably more efficient than the algorithms they are compared against.

This paper studies compression technology upon the array data model from both the storage and the query points of view.
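The three metrics named in the abstract can be illustrated with a small sketch. It is not the thesis's algorithm: it only evaluates a given storage strategy, and it assumes a simple cost model in which recreating a version costs the sum of the stored sizes along its delta chain back to a materialized version. The names `Strategy`, `recreation_cost`, and `metrics`, and all sizes, are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical representation: version -> (parent, stored_size).
# parent is None for a materialized version; otherwise the version is
# stored as a delta against its parent.
Strategy = Dict[str, Tuple[Optional[str], int]]


def recreation_cost(strategy: Strategy, version: str) -> int:
    """Assumed cost model: sum of stored sizes along the delta chain."""
    cost = 0
    current: Optional[str] = version
    while current is not None:
        parent, size = strategy[current]
        cost += size
        current = parent
    return cost


def metrics(strategy: Strategy) -> Tuple[int, int, int]:
    """Return (total storage, total recreation, maximum recreation)."""
    total_storage = sum(size for _, size in strategy.values())
    costs = [recreation_cost(strategy, v) for v in strategy]
    return total_storage, sum(costs), max(costs)


# Every version materialized: most storage, cheapest recreation.
all_full: Strategy = {"v1": (None, 100), "v2": (None, 100), "v3": (None, 100)}

# One materialized version followed by a chain of small deltas:
# least storage, but later versions cost more to recreate.
chain: Strategy = {"v1": (None, 100), "v2": ("v1", 10), "v3": ("v2", 10)}

print(metrics(all_full))  # (300, 300, 100)
print(metrics(chain))     # (120, 330, 120)
```

Under these assumed sizes, the delta chain cuts total storage from 300 to 120 while raising total recreation from 300 to 330 and the maximum recreation from 100 to 120, which is the kind of trade-off a storage strategy has to balance across all three metrics.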
Keywords/Search Tags: multi-version compression, array data model, structural aggregation, circular aggregation