
The Design And Implementation Of Distributed Data Warehouse Routing Middleware

Posted on: 2021-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: D L Zhao
Full Text: PDF
GTID: 2428330647950881
Subject: Engineering
Abstract/Summary:
As a professional data storage service provider, Transwarp Technology supplies distributed big data storage technology to securities firms, banks, government agencies, and other organizations. Its one-stop big data platform, Transwarp Data Hub (TDH), covers software installation, operation, and maintenance, and provides many extension functions. However, as customer data volumes grow and usage scenarios become more complex, a single data warehouse cluster increasingly struggles to meet customer requirements. To reduce the cost of data migration, some customers have set up big data storage clusters in several locations. Because the data is then scattered across clusters, searching it has become an urgent problem.
Starting from the problems customers encounter in production, this thesis analyzes the needs of different users in detail, then designs and implements a data warehouse routing middleware, Inceptor-Gateway (Gateway), based on Transwarp's distributed data computing engine (Inceptor). The Gateway system sits between the client and the data warehouse computing engine in the big data cluster. It consists mainly of a client receiving layer, a routing layer, and a sending layer. After users connect to Gateway through the client, they can establish connections to multiple data warehouses through Gateway at the same time, which removes the restriction that only one data warehouse can be connected at a time. By configuring forwarding rules in advance, users can use Gateway for intelligent SQL request forwarding, multi-node load balancing, and other functions. With multiple data warehouses, how the data is distributed is transparent to the client: when searching for data, users only need to focus on writing the SQL statements, not on where the data is physically stored.
To further reduce the cost of use, a monitoring platform (DBAService) that works alongside Gateway was also built on Spring Boot. Through the monitoring platform, users can follow the system's operating status and forwarding details in real time. When a SQL query fails, users can quickly locate the faulty node through the monitoring platform and troubleshoot the cluster promptly. At present, Gateway is used in the distributed storage clusters of public security agencies, the postal service, and many banking institutions, where it plays an important role in building and operating clusters in scenarios such as multiple data warehouses. The routing middleware Gateway greatly lowers the barrier to use in multi-data-warehouse environments. It also helps build a better data warehouse architecture and improves data security.
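The routing idea described above, forwarding rules configured in advance plus load balancing across nodes, can be illustrated with a minimal sketch. It is not the thesis implementation: the class names `Warehouse` and `Router`, the rule format (a regular expression matched against the table name), the round-robin policy, and the node addresses are all assumptions made for illustration, and the table extraction is deliberately naive compared with a real SQL parser.

```python
import itertools
import re
from dataclasses import dataclass, field


@dataclass
class Warehouse:
    """A data warehouse reachable through several equivalent nodes (hypothetical model)."""
    name: str
    nodes: list
    _cycle: object = field(init=False, repr=False)

    def __post_init__(self):
        # Round-robin iterator over the nodes; one assumed load-balancing policy.
        self._cycle = itertools.cycle(self.nodes)

    def next_node(self):
        return next(self._cycle)


class Router:
    """Routing layer: picks a warehouse by forwarding rule, then a node by round robin."""

    def __init__(self, default):
        self.default = default  # warehouse used when no rule matches
        self.rules = []         # list of (compiled table pattern, warehouse)

    def add_rule(self, table_pattern, warehouse):
        self.rules.append((re.compile(table_pattern, re.IGNORECASE), warehouse))

    def route(self, sql):
        # Naive extraction: the first identifier after FROM / INTO / UPDATE.
        match = re.search(r"\b(?:from|into|update)\s+([\w.]+)", sql, re.IGNORECASE)
        table = match.group(1) if match else ""
        for pattern, warehouse in self.rules:
            if pattern.fullmatch(table):
                return warehouse.name, warehouse.next_node()
        return self.default.name, self.default.next_node()


# Usage: the client writes plain SQL; the router decides where it goes.
beijing = Warehouse("beijing", ["bj-node-1:10000", "bj-node-2:10000"])
shanghai = Warehouse("shanghai", ["sh-node-1:10000"])
router = Router(default=beijing)
router.add_rule(r"trade\..*", shanghai)  # all tables in schema "trade" live in shanghai

print(router.route("SELECT * FROM trade.orders"))
print(router.route("SELECT * FROM crm.users"))
```

In this sketch the client never names a warehouse or node: the rule table carries the storage location, which is the sense in which the data distribution stays transparent to the user.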
Keywords/Search Tags: Distributed Data Warehouse, Middleware, SQL routing, Thrift, Spring Boot