
Research And Implementation Of Building Data Cube Based On Mapreduce

Posted on: 2016-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: D Li
Full Text: PDF
GTID: 2298330467492617
Subject: Computer Science and Technology
Abstract/Summary:
With the development of modern networks and the Internet, more and more data is accumulating online, and traditional data warehouse storage and query technology has run into a bottleneck. Common data cube models, such as the quotient cube and the frag-shell cube, each have corresponding algorithms for building cubes, but these approaches still have serious deficiencies when dealing with massive data. The traditional quotient cube building algorithm has a significant drawback: it has to handle a large amount of temporary data, and its parallel processing capability on massive data is low. The traditional frag-shells algorithm, for its part, relies too heavily on the dispersion of the data, so it performs poorly when confronted with large amounts of highly dispersed data. The development of cloud computing and big data technology has brought several effective ways to address these issues. For the issues mentioned above, this paper includes the following studies:

1) To address the deficiencies of the traditional quotient cube building algorithm, this paper proposes the QCCM (Quotient Cube Construction with MapReduce) algorithm, which uses the method of calculating the cluster cube. By improving the parallelism of the traditional algorithm, it avoids a large number of temporary tables and so improves efficiency. Simulation shows that the improved algorithm is more efficient than the original one.

2) To address the deficiencies of the traditional frag-shells algorithm, this paper analyzes why the traditional algorithm is inefficient on highly dispersed data, proposes an improved frag-shells algorithm that is independent of the degree of data dispersion, and compares the cost of the algorithm before and after the improvement. Simulation shows that the improved algorithm remains efficient when the data is highly dispersed.

Finally, this paper analyzes and discusses prospects for further research in several directions.
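As background for the terms above, the following is a minimal sketch of how a full data cube can be built in the MapReduce style, simulated in memory in plain Python. It is not the QCCM algorithm or the improved frag-shells algorithm, whose details the abstract does not give. The function names, the `*` placeholder for an aggregated dimension, and the use of SUM as the measure are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

ALL = "*"  # assumed placeholder meaning "this dimension is aggregated away"


def map_record(dims, measure):
    """Map step: emit one (cell key, measure) pair for every cuboid."""
    n = len(dims)
    for k in range(n + 1):
        for kept in combinations(range(n), k):
            key = tuple(dims[i] if i in kept else ALL for i in range(n))
            yield key, measure


def shuffle(pairs):
    """Shuffle step: group emitted values by cell key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce step: aggregate all measures of one cell (SUM here)."""
    return key, sum(values)


def build_cube(rows):
    """Run map, shuffle, and reduce over (dims, measure) rows."""
    pairs = (pair for dims, m in rows for pair in map_record(dims, m))
    return dict(reduce_group(k, v) for k, v in shuffle(pairs).items())


rows = [(("a", "x"), 1), (("a", "y"), 2), (("b", "x"), 4)]
cube = build_cube(rows)
```

Each input row emits 2^n pairs for n dimensions, so this naive scheme produces a large volume of intermediate data, which illustrates why reducing temporary data and improving parallelism matter for massive inputs.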
Keywords/Search Tags: cloud computing, data cube, quotient cube, frag-shells, MapReduce