
Design And Implementation Of Distributed Scheduling And Execution System For Video Website Operation Tasks

Posted on: 2022-10-15
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Ma
Full Text: PDF
GTID: 2518306602489954
Subject: Master of Engineering
Abstract/Summary:
The topic of this thesis comes from an online video website operation system project developed during the author's graduate internship. The project serves hundreds of millions of users and a library of tens of millions of programs. Its main function is to formulate video broadcast control rules for different regional policies and playback devices, and to push personalized marketing content to different user groups. With the rapid development of the online video field, the number of tasks generated by operational services such as video broadcast control and marketing has increased dramatically. Operation tasks differ in granularity, their execution times vary widely, and tasks of different types and priorities require different machine resources, all of which increases the complexity of task scheduling and execution. Users also expect more from video websites, so a video website must complete its operation tasks in an appropriate time without affecting normal use.

The company's existing operation task scheduling and execution system has the following problems. First, a node obtains the right to schedule and execute a task by locking a row of a database table. This distribution method may let a single node obtain nearly half of all tasks, which slows task processing and leaves the cluster load unbalanced. Second, some important tasks cannot be executed on time, which degrades the user experience. There are two main reasons: the threads of the thread pool are held by other types of tasks, leaving few thread resources for important tasks; and when many tasks are due at the same time, important tasks are not given a higher execution priority. Finally, the original system supports only one method of metadata injection, programmatic injection, and does not adapt to the Spring framework, which reduces the work efficiency of developers.

Given these problems, the functions and architecture of the original task scheduling and execution system can no longer meet the needs of operational tasks. The thesis therefore designs and implements a ZooKeeper-based distributed scheduling and execution system for operation tasks, driven by actual business requirements. The specific work includes:

(1) Requirements analysis and technical overview. Based on actual business scenarios, the functional and non-functional requirements of the system are defined through UML modeling, and the theories and technologies related to distributed scheduling are surveyed and analyzed to complete the technology selection. The timing wheel is chosen as the system's timer, ZooKeeper as its distributed coordination service, and MySQL as its data store.

(2) System design and implementation. On the basis of the requirements analysis, the network architecture and logical architecture of the system are designed, and the core process of the system is explained. Through functional decomposition, the distributed scheduling and execution system is divided into five modules: metadata management, resource management and monitoring, scheduler, executor, and high availability. The design and implementation of each module are analyzed in detail through class diagrams, sequence diagrams, and flowcharts. First, the system reduces the dependence of the scheduling process on the database through the timing wheel and task pre-allocation. Second, it supports a resource-aware scheduling strategy to balance the load of the cluster nodes. Third, it ensures that important tasks are executed first by introducing thread pool isolation and priority queues. In addition, ZooKeeper is used as a resource synchronization tool, which makes the system stateless and allows it to scale horizontally. Finally, the system supports a variety of metadata injection methods and adapts to the Spring framework to improve the work efficiency of developers.

(3) System testing with functional and non-functional test cases. The functional tests select the core function of each module for testing and explanation. The non-functional tests compare the node load and task execution delay of the original system and the new system, and the results are analyzed and explained.

Testing shows that the system described in this thesis meets the functional and non-functional requirements of the business scenario, that the load of the cluster nodes is more balanced than in the original system, and that task delay is reduced by about 33% on average compared with the original system. In summary, the system effectively supports load balancing across cluster nodes, meets the execution requirements of operation tasks, and improves the work efficiency of developers, which is of great significance for improving the service quality of video websites.
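
The abstract names the timing wheel as the system's timer but does not describe its implementation. As a rough illustration only, the following Java sketch shows a single-level hashed timing wheel. The class name `TimingWheel`, its methods, and the tick-driven interface are assumptions made for this example and are not taken from the thesis.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Illustrative single-level hashed timing wheel (hypothetical, not the thesis's code). */
final class TimingWheel {
    /** A scheduled task plus the number of full wheel turns it still has to wait. */
    private static final class Entry {
        final Runnable task;
        long remainingRounds;

        Entry(Runnable task, long remainingRounds) {
            this.task = task;
            this.remainingRounds = remainingRounds;
        }
    }

    private final List<List<Entry>> slots;
    private int cursor = 0; // index of the slot for the current tick

    TimingWheel(int slotCount) {
        if (slotCount <= 0) {
            throw new IllegalArgumentException("slotCount must be positive");
        }
        slots = new ArrayList<>(slotCount);
        for (int i = 0; i < slotCount; i++) {
            slots.add(new ArrayList<>());
        }
    }

    /** Schedules a task to become due after delayTicks calls to advance(). */
    void schedule(Runnable task, long delayTicks) {
        if (delayTicks < 1) {
            throw new IllegalArgumentException("delayTicks must be >= 1");
        }
        int n = slots.size();
        int slot = (int) ((cursor + delayTicks) % n);
        long rounds = (delayTicks - 1) / n; // full turns to skip before firing
        slots.get(slot).add(new Entry(task, rounds));
    }

    /** Moves the wheel forward by one tick and returns the tasks that are now due. */
    List<Runnable> advance() {
        cursor = (cursor + 1) % slots.size();
        List<Runnable> due = new ArrayList<>();
        Iterator<Entry> it = slots.get(cursor).iterator();
        while (it.hasNext()) {
            Entry entry = it.next();
            if (entry.remainingRounds == 0) {
                due.add(entry.task);
                it.remove();
            } else {
                entry.remainingRounds--;
            }
        }
        return due;
    }
}
```

In this sketch a driver would call `advance()` once per tick (for example, once per second) and hand the returned tasks to an executor. Scheduling and advancing touch only one slot, so neither operation needs a database lookup per tick, which is in the spirit of the reduced database dependence described above.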
Keywords/Search Tags:Task Scheduling, Distributed, Operation Task, Timing Wheel