
Resource Optimization Algorithms Based On Deep Reinforcement Learning For Mobile Edge Computing

Posted on: 2022-10-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Wang
Full Text: PDF
GTID: 2518306563477144
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid global deployment of 5G networks, industries across all sectors are proposing new computation-intensive and delay-sensitive applications that run on User Equipment (UE). Mobile Cloud Computing (MCC) offloads computation-heavy tasks to centralized cloud servers over the public network, which enhances the computing capability of UE and reduces its energy consumption; however, the distance between the cloud server and the UE leads to high transmission delay. Mobile Edge Computing (MEC) moves computing and storage resources to the edge of the mobile network, enabling UE to run applications with high computing resource requirements while still meeting strict latency requirements. In the traditional MEC scenario, in order to make full use of the resources of the UE and of MEC servers attached to fixed base stations, most scholars propose the joint optimization of communication, storage, and computing resources. However, when the communication facilities or the UE are highly mobile, the mobile network topology changes rapidly, and traditional optimization methods have difficulty solving the resulting multi-dimensional computing and communication resource allocation problem. For the resource allocation problem in mobile edge computing scenarios, especially in the transportation field, many scholars currently use a centralized Deep Q Network (DQN) over a discrete action space for decision-making and control. In contrast to the above research, this thesis focuses on continuous action spaces. To address the problems faced in these scenarios, the main research work is as follows:

Firstly, for the resource allocation problem in a multi-user mobile edge computing scenario served by a single Unmanned Aerial Vehicle (UAV), a joint optimization scheme for user scheduling, UAV mobility, and computation offloading decisions is proposed, which reduces the processing delay of the UE. A network communication and computation model is established to minimize the maximum total processing delay over all time slots. Considering the continuity of the offloading decision variables, a Deep Deterministic Policy Gradient (DDPG) algorithm based on a continuous action space is adopted to jointly optimize user scheduling, UAV mobility, and computing task allocation. Extensive experiments demonstrate the convergence of the proposed algorithm and compare its performance under different neural network hyperparameter settings. Compared with DQN, Actor-Critic, and random baselines, the proposed algorithm achieves the lowest total task processing delay under different task sizes, UE computing capabilities, and bandwidth conditions.

Secondly, to address the difficulty of centrally collecting environment information for resource allocation under the rapidly changing channel conditions of the vehicular network, a multi-agent distributed deep reinforcement learning algorithm based on a continuous action space, namely MADDPG, is adopted to optimize shared spectrum resources, improving the total capacity of all V2I links and the transmission rate of all V2V links. Each vehicle acts as an independent agent, and the agents interact with the vehicular network environment to obtain a shared reward. Through centralized training of the Critic networks and distributed execution of the decisions output by the Actor networks, the agents learn to cooperate with each other. Numerical results show the convergence and robustness of the proposed algorithm and the performance of each agent. Compared with MADQN, DDPG, and a random baseline, the proposed algorithm achieves higher total V2I link capacity and a higher V2V link payload transmission success probability. Compared with the random baseline, the V2V links optimized by the proposed algorithm complete payload transmission faster through cooperation.
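To illustrate the continuous-action approach used in the first contribution, the following is a minimal DDPG sketch in PyTorch. It is not the thesis implementation: the `Actor`/`Critic` architectures, layer sizes, state and action dimensions, and update hyperparameters are assumptions made for illustration, and the UAV-MEC environment (user scheduling, UAV trajectory, offloading and delay model) is omitted entirely.

```python
# Minimal DDPG sketch for continuous-action offloading decisions (PyTorch).
# All dimensions and hyperparameters below are illustrative, not the thesis settings.
import copy
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Maps the environment state (e.g. UE positions, queued task sizes)
    to a continuous action vector (e.g. UAV movement, offloading ratios)."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, action_dim), nn.Tanh(),  # actions scaled to [-1, 1]
        )
    def forward(self, state):
        return self.net(state)

class Critic(nn.Module):
    """Estimates Q(s, a) for a state-action pair."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )
    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def soft_update(target, source, tau=0.005):
    """Polyak averaging of the target network parameters."""
    for t, s in zip(target.parameters(), source.parameters()):
        t.data.mul_(1 - tau).add_(tau * s.data)

def ddpg_update(batch, actor, critic, actor_t, critic_t,
                actor_opt, critic_opt, gamma=0.99):
    state, action, reward, next_state = batch  # tensors; reward shaped (batch, 1)
    # Critic update: TD target computed with the target actor/critic.
    with torch.no_grad():
        target_q = reward + gamma * critic_t(next_state, actor_t(next_state))
    critic_loss = nn.functional.mse_loss(critic(state, action), target_q)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    # Actor update: ascend the critic's estimate of the current policy.
    actor_loss = -critic(state, actor(state)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
    soft_update(actor_t, actor); soft_update(critic_t, critic)

# Example setup (dimensions are placeholders):
state_dim, action_dim = 16, 4
actor, critic = Actor(state_dim, action_dim), Critic(state_dim, action_dim)
actor_t, critic_t = copy.deepcopy(actor), copy.deepcopy(critic)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
```

In practice the update would be driven by an experience replay buffer and exploration noise added to the actor's output; those pieces are standard DDPG components and are left out of the sketch.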
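The centralized-training, distributed-execution structure of MADDPG used in the second contribution can likewise be sketched as below. The number of agents, the observation and action dimensions, and the `mlp` helper are assumptions for illustration; the V2X spectrum-sharing environment, reward, and full update rule are omitted.

```python
# Minimal MADDPG structure sketch: decentralized actors, centralized critics (PyTorch).
# Agent counts and dimensions are illustrative, not taken from the thesis.
import torch
import torch.nn as nn

N_AGENTS, OBS_DIM, ACT_DIM = 4, 10, 2  # e.g. one agent per V2V link

def mlp(in_dim, out_dim, out_act=None):
    layers = [nn.Linear(in_dim, 128), nn.ReLU(),
              nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, out_dim)]
    if out_act is not None:
        layers.append(out_act)
    return nn.Sequential(*layers)

# Decentralized actors: each agent maps only its own local observation
# (e.g. its channel and interference measurements) to a continuous action
# such as transmit power and spectrum selection variables.
actors = [mlp(OBS_DIM, ACT_DIM, nn.Tanh()) for _ in range(N_AGENTS)]

# Centralized critics: during training, each critic conditions on the joint
# observations and joint actions of all agents.
critics = [mlp(N_AGENTS * (OBS_DIM + ACT_DIM), 1) for _ in range(N_AGENTS)]

def act(observations):
    """Distributed execution: each agent decides from its own observation."""
    return [actor(obs) for actor, obs in zip(actors, observations)]

def critic_input(observations, actions):
    """Centralized training: concatenate all agents' observations and actions."""
    return torch.cat(observations + actions, dim=-1)

# Single forward pass with dummy local observations.
obs = [torch.randn(1, OBS_DIM) for _ in range(N_AGENTS)]
acts = act(obs)
q_values = [critic(critic_input(obs, acts)) for critic in critics]
```

The point of this structure is that only the actors are needed at execution time, so each vehicle can act on its local observation without exchanging full channel state, which matches the distributed decision-making described in the abstract for fast-varying vehicular channels.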
Keywords/Search Tags:Mobile Edge Computing, Deep Reinforcement Learning, Computation Offloading, Resource Allocation