
Research On Distributed OLAP Semantic Caching Algorithm

Posted on: 2018-08-18
Degree: Master
Type: Thesis
Country: China
Candidate: H B Cui
GTID: 2358330518460482
Subject: Computer technology
Abstract/Summary:
A data cube is a model for the data in a data warehouse. Deleting the non-closed cells from its tuples and then layering the result yields a layered closed cube. Spark is a fast, general-purpose, memory-based framework for parallel big data computing. This paper designs and implements two efficient distributed OLAP query algorithms based on Spark and layered closed cubes: SLCCQuery and its optimized variant, SLCC_LayeredQuery. Experiments on data sets with different parameters demonstrate the efficiency of the proposed distributed OLAP query algorithms on Spark and the high efficiency of the optimized variant.

To further improve the efficiency of distributed OLAP queries in the Spark environment, this paper designs a new distributed OLAP semantic caching algorithm. Rather than storing individual data tuples, the algorithm represents the tuples in the collection to be queried by storing the upper and lower bounds of their equivalence classes. In addition, the cache items together with the drill-down relationship form an algebraic lattice structure. When semantic relations are used to prune a query, the search scope within the cache is narrowed further. Experiments demonstrate the validity and relative efficiency of the distributed OLAP semantic caching algorithm.

The primary contents are as follows:
(1) By removing all non-closed tuples and layering the result, a data cube is converted into a layered closed cube. On this basis, two effective distributed OLAP query algorithms are designed and implemented on Spark: SLCCQuery and its optimized variant, SLCC_LayeredQuery.
(2) Based on the cache design requirements of the distributed OLAP query algorithms, this paper proposes a new query caching technique, semantic OLAP caching, which differs from approaches such as page caching and tuple caching.
(3) Based on the semantic OLAP caching model and Spark, this paper designs two distributed OLAP caching algorithms for the Spark environment, combined with different cache replacement strategies. Experiments demonstrate the validity and relative efficiency of the proposed distributed OLAP semantic caching algorithm.
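
The abstract describes cache items that store the upper and lower bounds of an equivalence class instead of individual tuples. The following is a minimal single-machine Python sketch of that idea, not the thesis's Spark implementation. It assumes that cells are tuples of dimension values with "*" as the aggregated ("all") value, that every cell in a class shares one aggregate, and that a query cell belongs to a class when it lies between some lower bound and the upper bound. The names `CacheItem`, `SemanticCache`, and `generalizes`, and the sample data, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

ALL = "*"                      # assumed marker for an aggregated dimension
Cell = Tuple[str, ...]         # one value per dimension


def generalizes(a: Cell, b: Cell) -> bool:
    """Return True if cell a is equal to or more general than cell b."""
    return len(a) == len(b) and all(x == ALL or x == y for x, y in zip(a, b))


@dataclass(frozen=True)
class CacheItem:
    """One equivalence class, stored only by its bounds and its aggregate."""
    upper: Cell                # most specific cell of the class
    lowers: Tuple[Cell, ...]   # most general cells of the class
    measure: float             # aggregate shared by every cell in the class

    def covers(self, query: Cell) -> bool:
        # The query must sit at or below the upper bound
        # and at or above at least one lower bound.
        return generalizes(query, self.upper) and any(
            generalizes(low, query) for low in self.lowers
        )


class SemanticCache:
    """Linear-scan cache; lattice-based pruning is omitted in this sketch."""

    def __init__(self) -> None:
        self._items: List[CacheItem] = []

    def put(self, item: CacheItem) -> None:
        self._items.append(item)

    def get(self, query: Cell) -> Optional[float]:
        for item in self._items:
            if item.covers(query):
                return item.measure
        return None            # cache miss: the caller would query the cube


# Hypothetical example with dimensions (store, product, year).
cache = SemanticCache()
cache.put(CacheItem(
    upper=("S1", "P1", "2017"),
    lowers=(("S1", ALL, ALL), (ALL, "P1", ALL)),
    measure=100.0,
))
hit = cache.get(("S1", "P1", ALL))    # lies between a lower bound and the upper bound
miss = cache.get((ALL, ALL, "2017"))  # more general than every lower bound
```

Here `hit` is 100.0 and `miss` is `None`. Storing only the bounds lets a single cache item answer every cell in its class, which is the space saving the abstract attributes to semantic caching. A fuller version would organize the items by the drill-down relationship so that a lookup can skip parts of the cache instead of scanning it linearly.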
Keywords/Search Tags: Layered Closed Cube, Spark, OLAP query, Semantic caching