Design And Implementation Of Highly Available Distributed Task Scheduling And Execution System

Posted on:2020-11-09

Degree:Master

Type:Thesis

Country:China

Candidate:K Wang

Full Text:PDF

GTID:2428330602950402

Subject:Engineering

Abstract/Summary:

This thesis is based on a system for big data processing of an Internet product, developed during the author's postgraduate internship. The product has ten million users, and the number is still increasing steadily. To distinguish between users and to interact with target users, both to maintain loyalty and to stimulate new users' interest, the data of the full set of existing users must be processed so that the target users and their related information can be filtered out. Because all user-related data of the product is stored in relational databases, the traditional approach is a multi-threaded program deployed on a single machine, which suffers from low execution efficiency and poor reusability. An alternative is to migrate the target data to a non-relational database and then process it with mature big data processing tools. However, given the complexity of the current physical storage model, it is very difficult to build a migration model that preserves data integrity. To address these problems, this thesis combines research on distributed technology with the actual business requirements to implement a highly available distributed task scheduling and execution system based on Zookeeper. The system consists of a unified gateway module that interacts with the external environment, a task scheduling and distribution module that splits and distributes data processing tasks, a task execution module that performs the data processing tasks, a high availability guarantee module that keeps the system highly available, and a log module. The system accepts various types of data processing tasks whose target data is stored in relational databases, and it can handle tasks of different sizes because the task scheduling and distribution module is separate from the scalable task execution modules. Considering the importance of the task scheduling and distribution module and the need to run multiple tasks, a high availability guarantee module was designed and implemented. The task scheduling and distribution module is deployed on two machines, one as the working node and the other as the standby node. When the working node fails, the standby node replaces it automatically, which keeps the system available. Complete functional and performance tests show that the system meets expectations. For tasks with large amounts of data, its execution efficiency is much higher than that of the traditional multi-threaded single-machine program. In theory, the task processing capacity of the whole system can be increased by adding task execution nodes. Finally, the system offers good business independence, scalability, and high availability.
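
The splitting, distribution, and standby takeover described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the names `SubTask`, `split_task`, `assign_round_robin`, and `pick_active_scheduler` are hypothetical, splitting by contiguous ID range and round-robin assignment are assumed strategies, and node liveness is passed in as a plain set rather than being observed through Zookeeper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubTask:
    """One slice of a data processing task (hypothetical structure)."""
    task_id: str
    start_id: int  # first record ID in the slice, inclusive
    end_id: int    # one past the last record ID, exclusive


def split_task(task_id, first_id, last_id, chunk_size):
    """Split the ID range [first_id, last_id] into contiguous sub-tasks.

    Assumption: records are addressed by an integer key, so a task can
    be cut into ranges of at most chunk_size records.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    subtasks = []
    start = first_id
    while start <= last_id:
        end = min(start + chunk_size, last_id + 1)
        subtasks.append(SubTask(task_id, start, end))
        start = end
    return subtasks


def assign_round_robin(subtasks, executors):
    """Distribute sub-tasks over execution nodes in round-robin order."""
    if not executors:
        raise ValueError("at least one execution node is required")
    plan = {name: [] for name in executors}
    for index, subtask in enumerate(subtasks):
        plan[executors[index % len(executors)]].append(subtask)
    return plan


def pick_active_scheduler(schedulers, alive):
    """Return the first live scheduler in priority order, or None.

    Assumption: schedulers is ordered [working node, standby node] and
    alive is the set of nodes currently known to be reachable.
    """
    for name in schedulers:
        if name in alive:
            return name
    return None
```

In this sketch, adding a name to `executors` spreads the same sub-tasks over more nodes, which mirrors the claim that capacity grows with the number of execution nodes. A real deployment would need liveness detection and task state recovery, which the sketch leaves out.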

Keywords/Search Tags:

Zookeeper, Distributed Systems, Big Data, Task Execution, Task Assignment

Related items

1	Researches On Probing-based Task Assignment
2	Research On Task Distribution Algorithms In Mobile Edge Computing
3	Knowledge-Enabled Task Execution And Service Application For Networked Robotic Systems
4	The Design And Realization Of Task Scheduling Algorithm In Distributed Real-time Systems
5	Based Research And Implementation Of Intelligent Task Allocator Overhead Analysis
6	Constrained Task Assignment and Scheduling On Networks of Arbitrary Topology
7	Design And Implementation Of Distributed Task Scheduling System
8	Development of a task assignment tool to customize job descriptions and close person-job fit gaps
9	Research On Distributed Deep Learning Task Assignment Algorithm Based On Blockchain
10	Research On The Method Of Hunting Task Assignment Based On Energy Balance