
Computation And Storage Of PrefixCube

Posted on: 2005-03-27
Degree: Master
Type: Thesis
Country: China
Candidate: Q Fang
Full Text: PDF
GTID: 2168360152469129
Subject: Computer software and theory
Abstract/Summary:
To answer complex multidimensional queries promptly in Online Analytical Processing (OLAP) applications, a data cube usually has to be pre-computed and physically stored. However, the huge size of a data cube causes a series of problems for its computation and storage. Large amounts of disk space are needed to store cube tuples, and the I/O cost of writing cube result tuples dominates the overall cost of cube computation. To solve these problems at the root, efficient data cube computation methods and cube storage structures are needed.

The condensed data cube has been proposed as an effective approach to reducing a data cube's size. BST (base single tuple) condensing, one of the main condensing mechanisms of the condensed cube, stores all cube tuples aggregated from the same single base relation tuple as one physical tuple. BST condensing is in fact a special kind of prefix-sharing. The intra-cuboid prefix-sharing technique can further reduce the cube's size by eliminating the prefix redundancies that exist among cube tuples within a cuboid. Combining these two prefix-sharing techniques yields a new data cube structure, PrefixCube. PrefixCube first clusters the cube tuples of a BST-condensed cube cuboid by cuboid and then eliminates intra-cuboid prefix redundancies. This reduces the size of the BST-condensed cube and, because a much smaller cube file is written, the overall computation time as well.

Identifying the common prefix between tuples requires many tuple comparisons, which works against reducing the cube's computation time. Two optimizations were therefore proposed. One eliminates tuple comparisons in single-grouping-attribute cuboids such as Cuboid(A); the other computes PrefixCube in batch mode so as to eliminate comparisons among tuples generated in the same batch.

In real OLAP applications, roll-up and drill-down queries based on dimension hierarchies are very common and important. However, the presence of dimension hierarchies greatly increases the complexity of data cube computation. By extending the computation algorithms and organization methods of PrefixCube, new algorithms for computing hierarchical PrefixCube were proposed, and the organization of hierarchical PrefixCube was also implemented.
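
The idea of intra-cuboid prefix-sharing can be illustrated with a small sketch. The abstract does not describe the thesis's actual storage layout, so the code below is only an assumption-laden illustration: it sorts the tuples of one cuboid and stores each tuple as the number of leading attribute values it shares with the previous tuple, plus the remaining suffix and the aggregate. The names `encode_cuboid`, `decode_cuboid`, and `common_prefix_len` are hypothetical and do not come from the thesis.

```python
from typing import List, Tuple

# One cuboid tuple: (grouping-attribute values, aggregate value).
Row = Tuple[Tuple[str, ...], int]
# Encoded form: (length of prefix shared with previous tuple, suffix, aggregate).
Encoded = Tuple[int, Tuple[str, ...], int]


def common_prefix_len(a: Tuple[str, ...], b: Tuple[str, ...]) -> int:
    """Count the leading attribute values that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def encode_cuboid(rows: List[Row]) -> List[Encoded]:
    """Sort the cuboid's tuples and drop the prefix each shares with its predecessor."""
    encoded: List[Encoded] = []
    prev: Tuple[str, ...] = ()
    for key, agg in sorted(rows):
        shared = common_prefix_len(prev, key)
        encoded.append((shared, key[shared:], agg))
        prev = key
    return encoded


def decode_cuboid(encoded: List[Encoded]) -> List[Row]:
    """Rebuild the full tuples from the prefix-shared form."""
    rows: List[Row] = []
    prev: Tuple[str, ...] = ()
    for shared, suffix, agg in encoded:
        key = prev[:shared] + suffix
        rows.append((key, agg))
        prev = key
    return rows


# Hypothetical tuples of a cuboid over attributes (A, B, C).
cuboid_abc: List[Row] = [
    (("a1", "b1", "c1"), 10),
    (("a1", "b1", "c2"), 20),
    (("a1", "b2", "c1"), 5),
    (("a2", "b1", "c1"), 7),
]
encoded_abc = encode_cuboid(cuboid_abc)
```

In this sketch each tuple is compared only with its predecessor in sorted order, which is one simple way to find shared prefixes; the comparison-saving optimizations mentioned in the abstract are not modeled here.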
Keywords/Search Tags:online analytical processing, data cube, PrefixCube, base single tuple, prefix-sharing