
General Cloud-native Big Data Architecture With Kubernetes

Posted on: 2022-12-14    Degree: Master    Type: Thesis
Country: China    Candidate: S Du    Full Text: PDF
GTID: 2518306773997719    Subject: Automation Technology
Abstract/Summary:
With the continuous expansion of data volume, the performance of traditional data processing technology can no longer meet demand. To address the challenges brought by data growth, a variety of data processing technologies have been developed. These have improved data processing capability to a certain extent, but they have also introduced technical silos, scattered data, complex architectures, and difficult maintenance, resulting in ever-rising data costs. With the vigorous development of cloud computing, "everything goes to the cloud" has become the norm of the era: not only can ordinary systems use the cloud to improve their overall capability, but data processing technology can also rely on the cloud to improve performance and reduce cost. Some new data processing technologies take cloud native as their cornerstone, combine various technical means, integrate transaction processing with analytical computation, and provide a unified access interface; however, because they are still at an early stage of development and are constrained by factors such as infrastructure and technological maturity, they cannot completely replace traditional data processing technologies in the short term. Combining traditional data processing technology with the cloud can therefore improve performance and reduce cost, and it is one of the main ways to address data inflation, lower system complexity, and save on data cost, yet there is little related research on this topic. Integrating data processing technology with the cloud usually requires technology-specific transformation that cannot be generalized, and the limited performance of cloud storage, the inflexibility of traditional storage, and the high overall complexity make it difficult to exploit the advantages of cloud technology.

To solve these problems, this paper proposes a general cloud-native big data architecture based on Kubernetes:

1. Based on mature cloud container orchestration technology, a modular, loosely coupled cloud-native operating environment is designed, which improves the efficiency of running data processing technologies on the cloud.
2. Based on cloud storage, data acceleration middleware, the Container Storage Interface (CSI), and related technologies, a cloud storage acceleration strategy is designed that realizes the separation of storage and computation. This gives the data system on the cloud strong scalability and flexibility, increases the achievable data storage scale, and reduces data cost.
3. By introducing a new parallel computing strategy, the defects of traditional massively parallel computing are avoided, overall query performance is improved, and TPS response is more stable.
4. By introducing mature logging, monitoring, and tracing technologies, a standard observability strategy for large-scale cloud resources is designed, which improves system robustness and reduces maintenance complexity.

Finally, the proposed architecture is verified on ClickHouse, an analytical data processing technology. A large number of experiments covering query performance, TPS response, and other aspects show that, compared with the original architecture, query performance improves by 18%-60% and data cost is reduced by 50%-90%. The proposed architecture is superior to the original in data scale, consistency, reliability, and system scalability, and is easier to operate and maintain.
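The second design point above, cloud storage acceleration with separation of storage and computation, can be sketched concretely. The fragment below uses the official Kubernetes Python client to register a StorageClass backed by a caching CSI driver over cloud object storage, plus a shared volume claim that an analytical engine such as ClickHouse could mount. It is a minimal illustration only: the provisioner name, bucket, cache size, and namespace are assumptions made for the sketch, not components named in the thesis.

    # Sketch of storage-computation separation on Kubernetes (Python client).
    # Assumed names: provisioner "csi.cache-accel.example.com", bucket
    # "analytics-data", namespace "big-data" -- all illustrative.
    from kubernetes import client, config

    config.load_kube_config()  # use config.load_incluster_config() inside a pod

    # StorageClass backed by a hypothetical data-acceleration CSI driver that
    # fronts cloud object storage with a local cache layer.
    accelerated_sc = client.V1StorageClass(
        metadata=client.V1ObjectMeta(name="accelerated-object-store"),
        provisioner="csi.cache-accel.example.com",
        parameters={"bucket": "analytics-data", "cacheSize": "100Gi"},
        reclaim_policy="Retain",
        volume_binding_mode="WaitForFirstConsumer",
    )
    client.StorageV1Api().create_storage_class(body=accelerated_sc)

    # Shared claim that compute pods (e.g. ClickHouse replicas) can mount;
    # scaling compute does not require copying or re-sharding the data.
    pvc = client.V1PersistentVolumeClaim(
        metadata=client.V1ObjectMeta(name="clickhouse-data"),
        spec=client.V1PersistentVolumeClaimSpec(
            access_modes=["ReadWriteMany"],
            storage_class_name="accelerated-object-store",
            resources=client.V1ResourceRequirements(requests={"storage": "500Gi"}),
        ),
    )
    client.CoreV1Api().create_namespaced_persistent_volume_claim(
        namespace="big-data", body=pvc
    )

Because the data lives behind the claim rather than on node-local disks, compute replicas can be added or removed without moving data, which is the scalability property the abstract attributes to the separation of storage and computation.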
Keywords/Search Tags: Big Data, Cloud Computing, Integration of Big Data and Cloud, Separation of Storage and Computation, Massively Parallel Computing, Observability