Research On Distributed Learning Resource Scheduling Based On Reinforcement Learning

Posted on:2023-03-02

Degree:Master

Type:Thesis

Country:China

Candidate:C Liu

GTID:2568306914960339

Subject:Computer technology

Abstract/Summary:

As deep learning continues to develop and expand, fields such as speech recognition and recommendation systems have been widely studied and applied. At the same time, with the rapid development of technology in modern society, the scale of collected data and information has grown explosively. To better meet the actual needs of users, the deep learning models in use are becoming increasingly complex. Although the performance of these models keeps improving, a new problem has arisen: the time and computing resources required to train them also increase as the amount of data grows. Distributed learning systems are also widely used in the Internet field. To improve the task throughput of a distributed machine learning system, its resources must be fully utilized to meet the changing needs of distributed learning tasks.

This thesis studies three aspects of distributed systems: resource management, data replica management, and computing resource management.

For resource management, this thesis uses the ARIMA model and the GRU model jointly to forecast traffic and estimate the subsequent volume of distributed learning tasks, and then uses the Exponentially Weighted Moving Average (EWMA) model to predict the computing resources that the future task volume will require. The experimental results show that this approach relieves the computing pressure on each computing node and thereby increases the throughput of the distributed system.

For multi-replica management, this thesis rates how hot or cold each piece of data is according to its access volume. Data with a large access volume is hot data; when its access volume exceeds the threshold, it is replicated so that the additional copies balance the load. For cold data with little access, the number of copies of data that falls below the lower limit is reduced, so that the data server has more space to store hot data. In addition, a heuristic algorithm selects the data servers, which guarantees storage for the data copies. The method responds reasonably to changes in data access volume, and the data servers selected by the heuristic algorithm meet both users' access requirements and the storage requirements of the data copies.

For resource scheduling, this thesis adopts a reinforcement learning method to manage limited computing resources so that they are allocated reasonably among different parameter servers. Experiments show that the reinforcement learning algorithm used can, to a certain extent, optimize the resource scheduling of distributed learning clusters.
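The EWMA prediction and the hot/cold replica adjustment described above can be illustrated with a minimal Python sketch. The abstract does not give the smoothing factor, the thresholds, or the replica limits, so `alpha`, `hot_threshold`, `cold_threshold`, `min_replicas`, `max_replicas`, and both function names are assumptions chosen for illustration and are not taken from the thesis.

```python
from typing import Dict, Iterable


def ewma_forecast(history: Iterable[float], alpha: float = 0.5) -> float:
    """One-step-ahead EWMA estimate: s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    `alpha` is an assumed smoothing factor; the abstract does not state one.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    values = list(history)
    if not values:
        raise ValueError("history must not be empty")
    smoothed = values[0]  # seed the recursion with the first observation
    for x in values[1:]:
        smoothed = alpha * x + (1.0 - alpha) * smoothed
    return smoothed


def adjust_replicas(
    access_counts: Dict[str, int],
    replicas: Dict[str, int],
    hot_threshold: int,
    cold_threshold: int,
    min_replicas: int = 1,
    max_replicas: int = 5,
) -> Dict[str, int]:
    """Add one copy for hot data and drop one copy for cold data.

    All thresholds and limits are hypothetical parameters for this sketch.
    """
    updated = {}
    for key, count in access_counts.items():
        current = replicas.get(key, min_replicas)
        if count > hot_threshold:
            # Hot data: add a copy to spread the access load.
            current = min(current + 1, max_replicas)
        elif count < cold_threshold:
            # Cold data: remove a copy to free space for hot data.
            current = max(current - 1, min_replicas)
        updated[key] = current
    return updated


if __name__ == "__main__":
    # Hypothetical per-interval resource demand (e.g. CPU cores in use).
    print(ewma_forecast([10, 20, 30], alpha=0.5))  # prints 22.5
    print(adjust_replicas({"a": 100, "b": 1, "c": 50},
                          {"a": 2, "b": 2, "c": 2},
                          hot_threshold=80, cold_threshold=10))
    # prints {'a': 3, 'b': 1, 'c': 2}
```

The sketch changes the replica count by one copy per evaluation round, which is only one possible policy. The thesis may adjust the number of copies differently, and its heuristic for choosing the target data server is not reproduced here.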

Keywords/Search Tags:

resource management, duplicate management, reinforcement learning, resource scheduling

Related items

1	Efficient Resource Scheduling Based On Deep Reinforcement Learning
2	Application Of Deep Reinforcement Learning In Network Resource Management Problems
3	Spectrum Planning In Cellular Network And Radio Resource Management In D2D System Based On Reinforcement Learning
4	Research On Resource Scheduling Techniques Of Medium And Low Orbit Satellite Network System Based On Reinforcement Learning
5	Research On Cloud Resource Scheduling Method Based On Deep Reinforcement Learning
6	Large-Scale Stream Processing Task Resource Scheduling Method Based On Deep Reinforcement Learning
7	Research On Resource Scheduling And Power Distribution Strategies In Mobile Communications System And The Implementation
8	Research On IoT Wireless Resource Management Method Based On Reinforcement Learning
9	Research On Resource Management Using Deep Reinforcement Learning In B5G Communication Networks
10	CPU/IO Resource Scheduling Research In Cloud Environment Based On Reinforcement Learning