Research And Implementation Of Histogram Cube Compressed Storage And Incremental Updating And Query Under Cloud Environment

Posted on: 2015-09-21  Degree: Master  Type: Thesis
Country: China  Candidate: C N Chen  Full Text: PDF
GTID: 2308330473953717  Subject: Computer application technology
Abstract/Summary:
In the era of big data and cloud computing, workloads with heavy computing and storage overhead, such as on-line analytical processing (OLAP), have become much easier to handle. With massive, high-dimensional data, however, OLAP technology still faces serious computing and storage challenges, and processing in a distributed environment only alleviates them. This thesis presents a compression architecture for the histogram data cube that optimizes three aspects: the underlying storage structure, content compression, and overall compression. For the underlying storage structure, the thesis combines histogram data cube and closed data cube techniques to improve the storage structure of the histogram data cube, proposing an underlying storage structure of closed tuple + histogram. For content compression, it proposes a count-invert compression method based on the statistical structure information of the histogram data cube. For overall compression, it applies file compression to compress the histogram data cube further.
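The closed tuple + histogram idea can be illustrated with a minimal Python sketch. It relies on the standard definition of a closed cell (a cell with no more specific cell covering the same number of rows), not on the thesis's actual storage format. The function names, the `*` roll-up marker, and the toy rows are all assumptions made for illustration, and the brute-force enumeration is only suitable for tiny inputs.

```python
from collections import Counter
from itertools import product

STAR = "*"  # hypothetical marker for a rolled-up dimension


def build_cells(rows, n_dims):
    """Map every group-by cell to a histogram (Counter) of its measure values."""
    cells = {}
    for row in rows:
        dims, measure = row[:n_dims], row[n_dims]
        # Each row feeds 2**n_dims cells: every dimension is kept or rolled up.
        for mask in product([False, True], repeat=n_dims):
            key = tuple(d if keep else STAR for d, keep in zip(dims, mask))
            cells.setdefault(key, Counter())[measure] += 1
    return cells


def is_descendant(child, parent):
    """True if child specialises parent, i.e. fills in at least one '*'."""
    return child != parent and all(
        p == STAR or p == c for c, p in zip(child, parent)
    )


def closed_cells(cells):
    """Keep only cells that have no descendant with the same row count."""
    closed = {}
    for key, hist in cells.items():
        total = sum(hist.values())
        redundant = any(
            is_descendant(other, key) and sum(h.values()) == total
            for other, h in cells.items()
        )
        if not redundant:
            closed[key] = hist
    return closed


# Toy fact table: two dimensions followed by one measure value.
rows = [("a", "x", 10), ("a", "x", 20), ("b", "y", 10)]
cells = build_cells(rows, n_dims=2)
closed = closed_cells(cells)
print(len(cells), "cells in the full cube,", len(closed), "closed cells")
```

In this toy input the full cube has seven cells, while only `("*", "*")`, `("a", "x")`, and `("b", "y")` are closed; each surviving cell keeps its histogram, which is the pairing the storage structure's name suggests.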
Combining these three compression techniques, the thesis achieves efficient compression of the histogram data cube. Building a data cube is expensive in time, and previous work mostly focused on constructing a complete data cube in as little time as possible. A data cube is an enterprise application, however, and enterprises continuously produce new data that must be accumulated into it. This thesis analyzes the benefits and costs of incrementally updating a data cube and explores incremental update methods. It proposes that a closed cube undergoes only additions and updates, and no deletions, during an incremental update, and it implements the MRC-increUp algorithm in a MapReduce distributed environment. The ultimate goal of OLAP is to query the closed data cube. The thesis presents direct query based on the query key and class query based on the closed tuple code. To support interactive queries, it introduces the Impala real-time big data query system and proposes an interactive query architecture and query optimization strategies for use with Impala. Experiments on the TPC-DS test data set demonstrate the compression of the data cube, the advantage of incremental updating over recomputing the data cube, and the improved query efficiency relative to previous query algorithms and implementations.
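The "add and update, never delete" property of incremental updating can be sketched as a merge of a delta cube into an existing cube. This is a single-process illustration under stated assumptions, not the MRC-increUp algorithm: `merge_delta` and the dictionary layout (cell key mapped to a histogram) are hypothetical, the delta is assumed to be computed from newly appended rows only, and the sketch does not re-check whether cells remain closed after the merge.

```python
from collections import Counter


def merge_delta(base, delta):
    """Merge a delta cube into the base cube in place.

    Existing cells are updated by adding histogram counts and unseen
    cells are added; nothing is ever deleted from the base cube.
    Returns the number of (added, updated) cells.
    """
    added = updated = 0
    for key, hist in delta.items():
        if key in base:
            base[key].update(hist)  # Counter.update adds counts per bucket
            updated += 1
        else:
            base[key] = Counter(hist)
            added += 1
    return added, updated


# Existing cube and a delta built from newly arrived rows (toy data).
base = {("a", "x"): Counter({10: 1})}
delta = {
    ("a", "x"): Counter({10: 2, 20: 1}),
    ("b", "y"): Counter({30: 1}),
}
added, updated = merge_delta(base, delta)
print(added, "cell added,", updated, "cell updated")
```

Because histogram counts are additive, merging costs time proportional to the delta rather than to the full data set, which is the kind of saving over recomputation that the abstract attributes to incremental updating.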
Keywords/Search Tags:OLAP, Closed Data Cube, Histogram Cube, MapReduce, Incremental Updating, Query, Compress