
An Optimization Method for Computing High-dimensional Data Cubes Based on Shell Fragments

Posted on: 2016-05-09, Degree: Master, Type: Thesis
Country: China, Candidate: W Jiang, Full Text: PDF
GTID: 2348330479954703, Subject: Computer technology
Abstract/Summary:
In the era of big data, enterprise informatization generates a large volume of data every day, and staff need to query useful information from this data at any moment. The data are generally multi-dimensional and hierarchical. To suit these characteristics, we use a multi-dimensional hierarchical data cube model to pre-compute the data and speed up subsequent queries.

Researchers have proposed many methods for computing data cubes, including methods for high-dimensional data cubes. An analysis of these methods shows that they can still be improved in several respects. For example, BUC does not adapt to incremental updates and is sensitive to the order of the dimensions. With the shell-fragment approach, the amount of computation grows when the cardinality of a dimension is very large. In addition, users typically focus on only a few important dimensions, so much of the computation is superfluous.

To solve these problems, this thesis proposes a method called IMC, based on shell fragments and dimension hierarchy encoding. First, according to the needs of the enterprise, we choose a few dimensions related to the queries to form sub-cube shell fragments and keep only the inverted index lists of the other dimensions, which reduces the number of materialized cuboids and the storage space. Second, for dimensions with hierarchies, we use a dimension hierarchy tree instead of a dimension hierarchy table, which saves storage space and supports incremental updates. Finally, to construct the inverted indexes of the different hierarchy levels, we apply a bitwise AND of the dimension hierarchy encoding and a hierarchy mask to derive the inverted index of each higher level. The experiments show that IMC effectively improves computation speed, reduces storage space, and adapts to incremental updates. The improved method is also useful for enterprise data processing and decision support.
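
The hierarchy-mask step described above can be illustrated with a minimal Python sketch. It is not the IMC implementation from the thesis: the 12-bit layout (4 bits each for country, province, and city), the LEVEL_MASKS table, and the function names encode, build_leaf_index, roll_up, and intersect are all assumptions made for illustration. The sketch builds a leaf-level inverted index of tuple IDs, derives higher-level indexes by AND-ing each leaf code with a level mask, and intersects tuple-ID lists the way a shell-fragment query combines dimensions.

```python
from collections import defaultdict

# Hypothetical encoding of one hierarchical dimension (an assumption, not
# taken from the thesis): bits 8-11 = country, bits 4-7 = province,
# bits 0-3 = city.
LEVEL_MASKS = {
    "country": 0xF00,
    "province": 0xFF0,
    "city": 0xFFF,
}


def encode(country, province, city):
    """Pack the three level IDs into a single integer code."""
    return (country << 8) | (province << 4) | city


def build_leaf_index(codes):
    """Map each leaf-level code to the list of tuple IDs that carry it."""
    index = defaultdict(list)
    for tid, code in enumerate(codes):
        index[code].append(tid)
    return dict(index)


def roll_up(leaf_index, mask):
    """Derive a higher-level inverted index via bitwise AND with a mask."""
    upper = defaultdict(list)
    for code, tids in leaf_index.items():
        upper[code & mask].extend(tids)
    return {code: sorted(tids) for code, tids in upper.items()}


def intersect(tids_a, tids_b):
    """Combine two dimensions by intersecting their tuple-ID lists."""
    return sorted(set(tids_a) & set(tids_b))


if __name__ == "__main__":
    # One code per fact-table row; the row position is the tuple ID.
    codes = [encode(1, 1, 1), encode(1, 1, 2), encode(1, 2, 1), encode(2, 1, 1)]
    leaf = build_leaf_index(codes)
    print(roll_up(leaf, LEVEL_MASKS["province"]))
    print(roll_up(leaf, LEVEL_MASKS["country"]))
```

Because a higher-level index is derived from the leaf-level one by masking, no separate hierarchy table has to be joined at query time. Under these assumptions, a new leaf code that reuses existing prefixes only adds entries to the indexes and leaves existing codes unchanged.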
Keywords/Search Tags: High-dimensional data cube, sub-cube shell fragment, dimension hierarchy encoding, hierarchy mask, inverted index