
Research On SQL Parallel Query Optimization Based On Docker

Posted on: 2020-02-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Wang
Full Text: PDF
GTID: 2428330602955783
Subject: Computer system architecture
Abstract/Summary:
We have entered the era of big data and cloud computing, in which data is generated everywhere in daily life. The rapid growth of the Internet and the Internet of Things has made collecting and processing large-scale data a major challenge. Early SQL queries were constrained by single-node computation: they could not perform highly consistent query operations, and a sound isolation mechanism was difficult to implement, so traditional SQL queries can no longer meet the growing demand for data query processing. Container technology, meanwhile, has emerged in recent years and has quickly won favor with developers because containers are lightweight, support read-write separation and are highly portable. Optimizing container-based SQL parallel queries is therefore worth studying, so that the growing volume of data can be processed more efficiently.

Parallel query refers to allowing multiple queries to coexist and execute simultaneously on the same computer. The advantage of SQL parallel query is that it can make full use of the computing resources of multiple nodes, which improves the efficiency of the database. Docker is an advanced LXC-based container engine open sourced by the PaaS provider DotCloud. Its source code is hosted on GitHub, written in Go and released under the Apache 2.0 license. Since its release in 2013, Docker has been adopted and improved at a remarkable pace by researchers and technology enthusiasts worldwide. As a lightweight virtualization technology, containers deliver performance close to that of a physical machine, can be scaled on demand, can reduce network I/O, and can separate computing from storage in the database, all of which improve SQL query performance.

This paper optimizes SQL parallel query on Docker containers from three aspects to achieve efficient use of computing resources.

Firstly, the basic process of SQL parallel query is studied and analyzed, and join query optimization algorithms for distributed environments are discussed. For SQL parallel query optimization, a traditional SQL statement is divided into multiple logically related subquery statements. The subqueries are executed in parallel according to an optimal order, and their results are then logically combined to obtain the same result as the original query. SQL parallel queries can make effective use of the system's computing resources and prevent the extremes in which some nodes are overloaded while others are under-loaded.

Secondly, the Docker containers in the SQL parallel query environment are optimized. Through image optimization, a private image repository and optimized Docker continuous integration, containers can start and stop quickly while occupying as few physical resources as possible. Docker supports self-built images and private image repositories, which can be tuned to actual needs so that computing resources are fully and efficiently used. Reducing image size and building an enterprise-level private image repository effectively reduce network transfer time and enable rapid deployment. Docker also supports a range of cluster orchestration and monitoring tools that make Docker clusters easier to operate. Building databases with Docker takes advantage of Docker's isolation and its rapid distribution of updates.

Finally, this paper studies the advantages of combining Docker container technology with SQL parallel query, and uses Docker to build a SQL parallel query processing system with a unified image across distributed nodes. The last part of the paper presents the overall design analysis and the research on key technologies, and tests and analyzes the efficiency of the system.

The innovation of this work is the combination of SQL parallel queries with the process-level virtualization mechanism of Docker containers. Docker's "slimmed-down" virtualization consumes fewer computing and storage resources than ordinary physical machines and virtual machines, and is better suited to building, deploying and developing computing clusters. It allows the work to focus on SQL parallel query and on optimizing deployment. Operation is safe and reliable, and the approach supports horizontal scale-out. In terms of fault tolerance, once downtime in the distributed Docker cluster affects the query results, the Docker container cluster management scheme can be used to roll back the cluster as a whole. Containerizing all non-system applications helps standby compute container nodes start quickly. Adding a container cluster scheduling optimization step can better differentiate among physical nodes and, building on the system's "slimming", increase the utilization of system resources. This topic therefore has significant research and application value.
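
The split, execute and merge flow described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a hypothetical `orders` table, uses SQLite and a thread pool as stand-ins for database nodes running in Docker containers, and partitions the work by `id` range, which is only one possible way of dividing a statement into subqueries. All function names are illustrative.

```python
import os
import sqlite3
import tempfile
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_demo_db(path, rows=1000):
    """Create a hypothetical `orders` table filled with deterministic demo data."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)"
    )
    regions = ["north", "south", "east", "west"]
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(i, regions[i % 4], i % 17) for i in range(1, rows + 1)],
    )
    conn.commit()
    conn.close()


def run_subquery(path, lo, hi):
    """Run one subquery over the id range [lo, hi) on its own connection."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute(
            "SELECT region, SUM(amount) FROM orders "
            "WHERE id >= ? AND id < ? GROUP BY region",
            (lo, hi),
        ).fetchall()
    finally:
        conn.close()


def parallel_sum_by_region(path, max_id, workers=4):
    """Split the aggregate into range subqueries, run them concurrently, merge."""
    step = -(-max_id // workers)  # ceiling division
    ranges = [(lo, lo + step) for lo in range(1, max_id + 1, step)]
    totals = defaultdict(int)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(lambda r: run_subquery(path, *r), ranges):
            # Merge step: partial sums per region add up to the global sum.
            for region, subtotal in partial:
                totals[region] += subtotal
    return dict(totals)


def serial_sum_by_region(path):
    """Run the original, unsplit query for comparison."""
    conn = sqlite3.connect(path)
    try:
        return dict(
            conn.execute(
                "SELECT region, SUM(amount) FROM orders GROUP BY region"
            ).fetchall()
        )
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        build_demo_db(db_path)
        # The merged subquery results should match the original query.
        print(parallel_sum_by_region(db_path, 1000) == serial_sum_by_region(db_path))
```

Merging works here because `SUM` is decomposable: the sum over the whole table equals the sum of the per-range sums. Queries with joins or non-decomposable aggregates would need a different split and combine step.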
Keywords/Search Tags: Docker container, SQL, Parallel query, Virtualization