Research On Approximate Query Processing Technology Based On Multidimensional Analysis Of Big Data

Posted on: 2018-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: J X Xie
Full Text: PDF
GTID: 2358330536488532
Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology, the era of big data has arrived. Organizations in both industry and academia increasingly extract valuable information from massive amounts of data to support business decisions. Multi-dimensional analysis examines massive data across multiple dimensions and levels and provides strong decision support for enterprises. However, multi-dimensional analysis usually has to process very large data sets that cannot be loaded entirely into memory, so even an ordinary aggregate query can take a long time to execute. In many business analyses it is enough to grasp the general trend, and complete, exact results are not required. Approximate query processing (AQP) techniques are therefore well suited to such scenarios. This thesis studies approximate query processing. To address the low query efficiency of multi-dimensional analysis on big data, we study sampling techniques and propose an approximate query algorithm based on clustering stratified sampling (CSSAQP). We then design an approximate query processing engine on top of the Hadoop platform and the Hive system. The engine implements random sampling, stratified sampling, and the sampling algorithm proposed in this thesis. Finally, experiments demonstrate the soundness and effectiveness of CSSAQP.
Keywords/Search Tags: big data, multi-dimensional analysis, AQP, sampling, approximate query processing engine
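
The abstract names stratified sampling as one of the techniques the engine implements. As a rough illustration only, the sketch below estimates an aggregate SUM from a per-stratum uniform sample in plain Python. It is not the CSSAQP algorithm (the abstract does not describe its clustering step), and it does not use Hadoop or Hive. The function name `stratified_sum_estimate`, its parameters, and the proportional allocation rule are all assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_sum_estimate(rows, key, value, fraction, seed=0):
    """Estimate SUM(value) from a uniform sample drawn inside each stratum.

    Hypothetical helper for illustration; not taken from the thesis.
    rows     : list of dicts
    key      : field that defines the stratum (for example a dimension column)
    value    : numeric field to aggregate
    fraction : sampling fraction in (0, 1], applied to every stratum
    """
    rng = random.Random(seed)

    # Group the measure values by stratum.
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row[value])

    estimate = 0.0
    for values in strata.values():
        # Proportional allocation, but keep at least one row per stratum
        # so that small groups are never dropped from the estimate.
        n = max(1, round(len(values) * fraction))
        sample = rng.sample(values, n)
        # Scale the sample mean up by the stratum size.
        estimate += len(values) * (sum(sample) / n)
    return estimate


if __name__ == "__main__":
    data = [{"region": r, "sales": random.randint(1, 100)}
            for r in ("east", "west", "north") for _ in range(1000)]
    exact = sum(row["sales"] for row in data)
    approx = stratified_sum_estimate(data, "region", "sales", fraction=0.1)
    print(f"exact={exact} approx={approx:.1f}")
```

Sampling within each stratum, rather than over the whole table, guarantees that every group contributes to the estimate, which is the usual motivation for preferring stratified over purely random sampling when group sizes are skewed.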