Computation And Storage Of PrefixCube

Posted on:2005-03-27

Degree:Master

Type:Thesis

Country:China

Candidate:Q Fang

GTID:2168360152469129

Subject:Computer software and theory

Abstract/Summary:

To answer complex multidimensional queries promptly in Online Analytical Processing (OLAP) applications, the data cube usually needs to be pre-computed and physically stored. However, the huge size of a data cube causes a series of problems for its computation and storage. Large amounts of disk space are needed to store cube tuples, and the I/O cost of writing cube result tuples dominates the overall cost of cube computation. To address these problems at the root, efficient data cube computation methods and cube storage structures are urgently needed.

The condensed data cube has been proposed as an effective approach for reducing a data cube's size. BST (base single tuple) condensing, one of its main condensing mechanisms, condenses the cube tuples that are aggregated from the same single base relation tuple into one physical tuple. BST condensing is in fact a special kind of prefix-sharing. Intra-cuboid prefix-sharing can reduce the data cube's size further by eliminating the prefix redundancy among cube tuples within a cuboid. Combining these two prefix-sharing techniques yields a new data cube structure, PrefixCube. PrefixCube first clusters the cube tuples of a BST-condensed cube cuboid by cuboid and then eliminates intra-cuboid prefix redundancy. This reduces the size of the BST-condensed cube and, because a much smaller cube file is written, the total cube computation time as well.

Identifying the common prefix between tuples requires many tuple comparisons, which works against reducing the cube's computation time. Two optimizations are therefore proposed. One eliminates tuple comparisons in single-grouping-attribute cuboids such as Cuboid(A); the other computes PrefixCube in batch mode so as to eliminate comparisons among tuples generated in the same batch.

In real OLAP applications, roll-up and drill-down queries based on dimension hierarchies are very common and important. However, the presence of dimension hierarchies greatly increases the complexity of data cube computation. By extending the computation algorithms and organization methods of PrefixCube, new algorithms for computing a hierarchical PrefixCube are proposed, and the organization of the hierarchical PrefixCube is also implemented.
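
The intra-cuboid prefix-sharing idea mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify PrefixCube's physical layout, so the code below is only an assumption-laden illustration using plain front coding: the tuples of one cuboid are sorted, and each tuple stores only the number of leading grouping values it shares with its predecessor plus the remaining values. The names prefix_encode and prefix_decode and the sample cuboid are hypothetical and do not come from the thesis.

```python
from typing import List, Tuple

# A cube tuple of one cuboid: (grouping-attribute values, aggregate value).
CubeTuple = Tuple[Tuple[str, ...], int]
# An encoded tuple: (shared prefix length, remaining values, aggregate value).
Encoded = Tuple[int, Tuple[str, ...], int]


def prefix_encode(cuboid: List[CubeTuple]) -> List[Encoded]:
    """Front-code the sorted tuples of a single cuboid (illustrative only)."""
    encoded: List[Encoded] = []
    prev: Tuple[str, ...] = ()
    for dims, agg in sorted(cuboid):
        # Count how many leading values are shared with the previous tuple.
        shared = 0
        for a, b in zip(prev, dims):
            if a != b:
                break
            shared += 1
        encoded.append((shared, dims[shared:], agg))
        prev = dims
    return encoded


def prefix_decode(encoded: List[Encoded]) -> List[CubeTuple]:
    """Rebuild the full tuples from their front-coded form."""
    decoded: List[CubeTuple] = []
    prev: Tuple[str, ...] = ()
    for shared, suffix, agg in encoded:
        dims = prev[:shared] + suffix
        decoded.append((dims, agg))
        prev = dims
    return decoded


# Hypothetical Cuboid(A, B, C) with four cube tuples.
cuboid = [
    (("a1", "b1", "c1"), 10),
    (("a1", "b1", "c2"), 20),
    (("a1", "b2", "c1"), 5),
    (("a2", "b1", "c1"), 7),
]

encoded = prefix_encode(cuboid)
stored = sum(len(suffix) for _, suffix, _ in encoded)
total = sum(len(dims) for dims, _ in cuboid)
print(encoded)
print(f"grouping values stored: {stored} of {total}")
```

In this sketch every tuple is compared with its predecessor to find the shared prefix, which is the comparison cost the abstract's two optimizations aim to reduce. For a single-attribute cuboid such as Cuboid(A), the grouping values are distinct, so the shared prefix length is always zero and the comparison could be skipped.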

Keywords/Search Tags:

online analytical processing, data cube, PrefixCube, base single tuple, prefix-sharing

Related items

1	Structural Index For PrefixCube
2	The Online Mining Of Data Cube Gradient
3	Novel techniques for data warehousing and online analytical processing in emerging applications
4	Research On The Storage Technique Of Data Cube Based-on Dimension Hierarchy
5	Research On Fast Data Cube Computation Method Based On Spark Platform
6	OLAP Algorithm Research Based On Dimension Hierarchy For Data Cube
7	Research On The Efficient Materialization And Fast Query Of Condensed Data Cube
8	Online Analytical Processing And Applications
9	Research On The Technology Of Label Cube
10	Design And Implementation Of Online Marketing Data Analysis Platform Based On The Materialized Data Cube