Design And Implementation Of Big Data Application Scheduling System

Posted on:2020-10-20

Degree:Master

Type:Thesis

Country:China

Candidate:M G He

Full Text:PDF

GTID:2428330578457159

Subject:Software engineering

Abstract/Summary:

With the rapid development of the Internet, people have entered the era of big data and the mobile Internet. A big data application exploits the value of data: through data analysis, it extracts useful information from massive data sets to provide users with decision support [6]. Processing this data efficiently becomes the key problem, and the scheduling of tasks within data processing has a significant effect on overall performance and resource utilization. This thesis presents a big data application scheduling system built on the open source Azkaban scheduling system to meet the actual needs of X company. First, the thesis analyzes the company's main task scheduling requirements from its actual business needs, surveys several major open source scheduling systems currently available, and proposes a suitable implementation scheme to determine which type of scheduling system to select. Second, based on the requirements analysis and the technical survey, design and development on top of the open source Azkaban scheduling system is selected. The system adopts a microservice architecture that divides it into three parts: the web management part, the scheduler part, and the executor part. The system is deployed as a cluster to achieve high availability of scheduling. The web management part adopts the SSM (Spring, Spring MVC, MyBatis) architecture to implement the web interface of the management side. At the same time, the graphical editing interface (IDE) not only implements the core workflow scheduling logic of the system but also supports drag-and-drop editing of complex DAG workflows. In the executor part, Azkaban's plug-in mechanism is used to support job plug-ins for different scenarios, so that workflows of different task types can be scheduled. In addition, the containerization technology Docker is used to isolate workflow execution environments and avoid interference between workflows in different environments. Finally, the system is optimized based on internal use within the company to make it an efficient big data application scheduling system. At present, the system described in this thesis is running normally in the production environment. According to its actual online operation, the scheduling system meets the company's daily business needs and supports highly concurrent scheduling of tens of thousands of tasks. The system will continue to be optimized and iteratively upgraded to become one of the company's core big data application products.
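
The abstract describes two mechanisms that a short example can make concrete: ordering the jobs of a DAG workflow so that each job runs only after its dependencies, and dispatching each job to a plug-in chosen by its task type. The following Python sketch illustrates both under stated assumptions. It is not the thesis's implementation and does not use Azkaban's API: the registry `JOB_PLUGINS`, the decorator `register_plugin`, the job types `shell` and `spark`, and the sample job names are all hypothetical, and the ordering uses Kahn's algorithm as one common choice. Container isolation with Docker is not modeled here.

```python
from collections import deque
from typing import Callable, Dict, List

# Hypothetical registry: job type -> handler. A stand-in for job plug-ins,
# not Azkaban's actual plug-in interface.
JOB_PLUGINS: Dict[str, Callable[[str], str]] = {}


def register_plugin(job_type: str):
    """Register a handler function for one job type (illustrative only)."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        JOB_PLUGINS[job_type] = fn
        return fn
    return decorator


@register_plugin("shell")
def run_shell(name: str) -> str:
    # A real plug-in would launch a process; here we only record the call.
    return f"shell:{name}"


@register_plugin("spark")
def run_spark(name: str) -> str:
    return f"spark:{name}"


def topological_order(deps: Dict[str, List[str]]) -> List[str]:
    """Return jobs in an order that respects dependencies (Kahn's algorithm).

    `deps` maps each job to the jobs it depends on. Assumes every job,
    including those without dependencies, appears as a key.
    """
    indegree = {job: len(parents) for job, parents in deps.items()}
    children: Dict[str, List[str]] = {job: [] for job in deps}
    for job, parents in deps.items():
        for parent in parents:
            children[parent].append(job)

    # Jobs with no unmet dependencies are ready to run.
    ready = deque(sorted(job for job, d in indegree.items() if d == 0))
    order: List[str] = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for child in children[job]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    if len(order) != len(deps):
        raise ValueError("workflow contains a cycle")
    return order


def run_workflow(deps: Dict[str, List[str]], job_types: Dict[str, str]) -> List[str]:
    """Run each job through the plug-in registered for its type, in DAG order."""
    return [JOB_PLUGINS[job_types[job]](job) for job in topological_order(deps)]
```

In this sketch, supporting a new task type means registering one more handler, which mirrors the role the abstract assigns to job plug-ins. A production scheduler would also run independent jobs concurrently and handle failures and retries, none of which is shown.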

Keywords/Search Tags:

big data application, dispatch system, Azkaban

