With the rapid development of 5G wireless technology and the accelerated arrival of the Internet of Things (IoT) era, the Internet of Everything and ultra-reliable low-latency communication have become common expectations. To overcome the limitations of mobile devices in executing computation-intensive workloads, Mobile Edge Computing (MEC) has emerged as a key technology for next-generation networks: it enables computation-constrained and energy-constrained IoT devices to run computation-intensive and latency-critical applications. In recent years, the computation offloading and resource allocation problems of MEC systems have attracted extensive attention from both academia and industry. However, solutions based on traditional optimization theory usually require complex iterations, yield only approximately optimal solutions, and rely on statistical information about the environment that is difficult to obtain in practical MEC systems. To meet these challenges, many researchers have turned to the Markov Decision Process (MDP) to model the dynamic control of MEC systems and have applied reinforcement learning (RL) or deep reinforcement learning (DRL) methods to solve the corresponding problems. Most existing studies, however, focus on a single-MEC-server scenario, and computation offloading and resource allocation with multiple MEC servers remain little studied. Based on this, the main research work of this paper is summarized as follows.

First, for the multi-MEC-server, multi-user scenario, a joint user association and power allocation scheme is proposed to minimize power consumption and queuing delay. A communication and computation offloading model is first established that jointly considers random task arrivals, time-varying wireless channels, and the queue buffers of the MIDs and MEC servers. The optimization objective is to minimize the long-term average service cost (power consumption and queuing delay). To this end, a dynamic computation offloading and resource allocation algorithm based on hybrid-decision deep reinforcement learning is proposed. It combines Deep Deterministic Policy Gradient (DDPG) and Dueling Double Deep Q-Network (D3QN) within an improved Actor-Critic architecture: the Actor part of DDPG handles continuous power allocation, while the Critic part of DDPG is combined with D3QN to handle the discrete association between MIDs and MEC servers, thereby solving the hybrid decision problem in multi-MEC-server, multi-user scenarios. Simulation results show that the proposed algorithm is more stable and converges faster than baseline algorithms such as DQN, and that its average system service cost is significantly reduced under different task arrival rates.
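To make the hybrid discrete-continuous decision concrete, the sketch below shows one plausible way to couple a DDPG-style actor (continuous transmit power) with a dueling Q-head (discrete MEC-server association). All class names, layer sizes, and the way the chosen power is fed into the Q-head are illustrative assumptions; the thesis abstract does not specify these details.

```python
# Hypothetical sketch of hybrid action selection: a DDPG-style actor outputs
# a continuous transmit power, and a dueling Q-head (the D3QN part; the
# double-Q trick applies at training time and is not shown) scores the
# discrete MEC-server association. Dimensions and names are assumptions.
import torch
import torch.nn as nn

class HybridPolicy(nn.Module):
    def __init__(self, state_dim, num_servers, p_max):
        super().__init__()
        self.p_max = p_max
        # DDPG actor: state -> continuous power in (0, p_max)
        self.actor = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 1), nn.Sigmoid(),
        )
        # Dueling Q-head: (state, chosen power) -> Q-value per MEC server
        self.feature = nn.Sequential(nn.Linear(state_dim + 1, 128), nn.ReLU())
        self.value = nn.Linear(128, 1)                 # state value V(s)
        self.advantage = nn.Linear(128, num_servers)   # advantages A(s, a)

    def forward(self, state):
        power = self.p_max * self.actor(state)                  # continuous part
        h = self.feature(torch.cat([state, power], dim=-1))
        v, adv = self.value(h), self.advantage(h)
        q = v + adv - adv.mean(dim=-1, keepdim=True)             # dueling aggregation
        server = torch.argmax(q, dim=-1)                         # discrete part
        return power, server, q

# Example: one device observing a 10-dimensional state, choosing among
# 4 MEC servers under a 1 W power budget (all values are placeholders).
policy = HybridPolicy(state_dim=10, num_servers=4, p_max=1.0)
p, s, q = policy(torch.randn(1, 10))
print(p.item(), s.item())
```

In such a design the discrete association is selected greedily from the Q-values while exploration noise is added to the continuous power output, which is one common way to handle a mixed action space.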
Second, for the cloud-edge collaboration scenario with multiple MEC servers and multiple users, a task offloading and resource allocation scheme is proposed that jointly optimizes local computing resources, task partition factors, and user association to minimize service delay and energy consumption. A network and computation offloading model is first established that jointly considers cloud-edge collaboration, random channel-state switching, MEC-server computing resource allocation, decoding error probability, and energy harvesting. The optimization objective is to minimize the long-term average service cost (service delay and energy consumption). To this end, a dynamic computation offloading and resource allocation algorithm based on hybrid-decision deep reinforcement learning is proposed. The algorithm again combines deep deterministic policy gradients and dueling double deep Q-networks within an improved Actor-Critic architecture, and adopts a centralized-training, decentralized-execution framework to realize collaborative computation offloading among MIDs. The Actor part of DDPG handles continuous task partitioning and local computing resource allocation, while the Critic part of DDPG is combined with D3QN to decide the discrete association between each MID and an MEC or cloud server, thereby solving the hybrid decision problem in the multi-MEC-server, multi-user cloud-edge collaboration scenario. Simulation results show that the proposed algorithm is more stable and converges faster than baseline algorithms such as DDPG-DQN, that the average system service cost is significantly reduced under different task arrival rates, and that cloud-edge collaborative processing outperforms other benchmark schemes under different system parameters.
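As a rough illustration of the objective shared by both schemes, the long-term average service cost could take the form of a weighted sum of per-slot delay and energy terms; the weights w1 and w2 and the expectation over random task arrivals and channel states are assumptions, since the abstract does not give the exact formulation.

```latex
% Assumed form of the long-term average service cost, where D_t denotes the
% per-slot service (or queuing) delay and E_t the per-slot power/energy cost.
\bar{C} \;=\; \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T}
              \mathbb{E}\!\left[\, w_1 D_t + w_2 E_t \,\right]
```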