Research On Distributed Query Of Quotient Cube Based On Spark

Posted on: 2019-05-30  Degree: Master  Type: Thesis
Country: China  Candidate: Z F Zhang  Full Text: PDF
GTID: 2438330563457677  Subject: Computer technology
Abstract/Summary:
The quotient cube is a data cube compression technique. Because all tuples in an equivalence class share the same aggregate value, the quotient cube achieves compression by computing and storing only the upper and lower bounds of each equivalence class. In the big data setting, however, a traditional single-node database cannot meet the demands of querying, analyzing, and managing historical data. This thesis therefore proposes a model called the equivalent interval, built on the upper and lower bounds of the quotient cube's equivalence classes. Exploiting the flexibility of the interval structure, whose endpoints can change, it also defines three relationships between a tuple and an equivalent interval (contain, distinct, and extend), which are used to match and update intervals. Building on the equivalent interval, a distributed OLAP quotient cube query model is defined on the Spark distributed computing framework, and a quotient cube OLAP query algorithm for distributed environments is proposed that uses the characteristics of the distributed system to achieve better performance. After verifying query performance in the distributed environment, the thesis further presents a quotient cube cache model based on the query algorithm and the equivalent interval, and, on that basis, a distributed dynamic OLAP caching algorithm. Its advantage is that the quotient cube need not be recomputed and materialized each time the historical data is updated; instead, queries drive the generation of the quotient cube. Finally, the performance of the algorithm is analyzed under different conditions by varying the total number of queries, the number of layers, and the skew of the data.
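The interval idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the three relationships, so the readings below are assumptions made only for illustration: "contain" means the cell lies between the lower and upper bound in the generalization order, "extend" means the cell is outside the interval but comparable with the upper bound (so an endpoint could in principle be moved to cover it), and "distinct" means neither. The names `EquivalentInterval`, `generalizes`, and `relation`, and the use of a single lower bound per interval, are hypothetical simplifications and not the thesis's definitions.

```python
from dataclasses import dataclass
from typing import Tuple

ALL = "*"  # wildcard: the dimension is aggregated away
Cell = Tuple[str, ...]


def generalizes(a: Cell, b: Cell) -> bool:
    """Return True if cell a is equal to, or more general than, cell b."""
    return len(a) == len(b) and all(x == ALL or x == y for x, y in zip(a, b))


@dataclass
class EquivalentInterval:
    lower: Cell   # most general cell of the class (assumed unique here)
    upper: Cell   # most specific cell of the class
    value: float  # aggregate value shared by every cell in the interval

    def relation(self, cell: Cell) -> str:
        """Classify a cell against this interval (assumed semantics)."""
        if generalizes(self.lower, cell) and generalizes(cell, self.upper):
            return "contain"
        if generalizes(cell, self.upper) or generalizes(self.upper, cell):
            return "extend"
        return "distinct"


# One interval stands in for every cell between its two bounds.
interval = EquivalentInterval(
    lower=("a", ALL, ALL), upper=("a", "b", "c"), value=10.0
)
for cell in [("a", "b", ALL), (ALL, "b", ALL), ("a", "x", ALL)]:
    print(cell, interval.relation(cell))
```

Under these assumptions, a query for a contained cell can be answered from the stored `value` without touching the base data, which is the sense in which storing only the bounds compresses the cube.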
Keywords/Search Tags: Big data, Distributed File System, Spark, Quotient Cube, Equivalent Interval