
Research And Implementation Of Distributed Energy Data Storage System Based On Join Ordering Optimization

Posted on: 2016-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: P R Lin
GTID: 2308330479494827
Subject: Software engineering
Abstract/Summary:
Join optimization has long been an active topic in relational databases, in both single-node and distributed environments. To answer a given join query, a database must choose an execution plan that minimizes cost and maximizes performance. In a distributed environment, however, applications differ in data model, deployment model, and concurrency model, so it is difficult for distributed database products to offer a general-purpose approach to join optimization.

In the distributed energy data storage system of SCUT, the upper-layer analysis system requires frequent cross-database join operations. We found that different join orderings lead to large differences in performance, because the middleware currently in use does not support join ordering optimization. This paper therefore studies optimization approaches for join operations in distributed environments, designs a solution suited to the current system, and implements it in Presto.

The main contributions of this paper are as follows:

Firstly, this paper studies the optimization of join operations, in particular join ordering optimization. By applying a pruning algorithm to the solution space and by optimizing the query cost model and the metadata query method, we design a join ordering solution suited to the current requirements.

Secondly, we study the implementation of the distributed database middleware Presto and integrate join ordering optimization into it.

Thirdly, based on an analysis of the requirements of the Energy Data Analysis System and of the advantages and disadvantages of cloud servers, this paper proposes a three-layer storage architecture. This architecture not only improves the availability of the original architecture, but also centralizes the shards onto virtual servers and provides parallel query support to the upper-layer applications.

Finally, we carry out a series of functional and performance test cases, which show that the new storage system is able to support the upper-layer application and that the optimization improves the performance of distributed join operations in our project.
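
To illustrate why join ordering matters and how pruning shrinks the search space, the following is a minimal sketch and not the algorithm or cost model of the thesis. It assumes a simple cost model (the sum of estimated intermediate result sizes), left-deep plans only, and hypothetical table names, row counts, and selectivities. A partial plan is abandoned as soon as its cost reaches that of the best complete plan found so far.

```python
# Hypothetical statistics: table -> row count, and join-predicate selectivities.
ROWS = {"meter": 1_000_000, "building": 1_000, "region": 10}
SELECTIVITY = {("meter", "building"): 0.001, ("building", "region"): 0.1}


def join_cardinality(tables, rows, selectivity):
    """Estimated row count after joining every table in `tables`."""
    card = 1.0
    for t in sorted(tables):  # sorted for a deterministic multiplication order
        card *= rows[t]
    for (a, b), s in selectivity.items():
        if a in tables and b in tables:  # predicate applies once both sides are present
            card *= s
    return card


def plan_cost(order, rows, selectivity):
    """Cost of a left-deep plan: sum of the sizes of all intermediate results."""
    cost = 0.0
    for i in range(2, len(order) + 1):
        cost += join_cardinality(frozenset(order[:i]), rows, selectivity)
    return cost


def best_left_deep_order(rows, selectivity):
    """Depth-first search over join orders with branch-and-bound pruning."""
    best_cost = float("inf")
    best_order = None

    def extend(order, joined, cost):
        nonlocal best_cost, best_order
        if cost >= best_cost:
            return  # prune: costs only grow, so this branch cannot win
        if len(order) == len(rows):
            best_cost, best_order = cost, tuple(order)
            return
        for t in sorted(rows):
            if t in joined:
                continue
            new_joined = joined | {t}
            # The first table is only scanned; each later step adds a join result.
            step = join_cardinality(new_joined, rows, selectivity) if order else 0.0
            extend(order + [t], new_joined, cost + step)

    extend([], frozenset(), 0.0)
    return best_order, best_cost


if __name__ == "__main__":
    order, cost = best_left_deep_order(ROWS, SELECTIVITY)
    print("best order:", order, "cost:", cost)
    print("naive order cost:", plan_cost(("meter", "region", "building"), ROWS, SELECTIVITY))
```

Under these assumed statistics, joining the two small tables first keeps the intermediate result small, whereas starting with the large table and a table it shares no predicate with forces a cross product. A real optimizer would obtain row counts and selectivities from metadata queries rather than hard-coded values.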
Keywords/Search Tags: distributed storage system, join ordering optimization, Presto, three-layer storage system