With the development of fifth-generation (5G) communication technology, a large number of intelligent devices are accessing the network, which puts heavy response pressure on the traditional cloud computing service model. In addition, delay-sensitive applications, represented by autonomous driving, impose stringent requirements on the computing power and network response delay of application devices, and limited battery capacity restricts device standby time, which seriously affects the user experience. Mobile Edge Computing (MEC) can effectively address these problems. As one of the core technologies of MEC, computation task offloading allows User Equipment (UE) to offload computing tasks to MEC servers over communication links, thereby extending the computing power of the UE. In addition, by filtering and preprocessing the request data sent by the user, the amount of data transmitted for computing tasks can be effectively reduced, lowering the energy consumption of the UE and extending its standby time.

A large body of research on MEC task offloading already exists. However, most studied scenarios are too static and lack simulation of dynamic factors such as user mobility and the time-varying nature of the network, so existing results cannot meet the practical requirements of computation offloading. To this end, exploiting the powerful representation ability of deep reinforcement learning and its capacity to handle dynamic environments, this paper simulates a single-mobile-user multi-task offloading scenario and a multi-user single-MEC-server offloading scenario. Considering the time-varying characteristics of the network, the mobility of the user, and the adjustable CPU frequency of the local device, and without relying on prior experience, this paper proposes a method to make real-time task offloading decisions for mobile users.

First, a single-mobile-user multi-task offloading scenario is constructed. To simulate the MEC environment realistically, the dynamics of the network and user mobility are considered jointly, and the case in which the CPU frequency of the local device is adjustable is also covered. Based on prioritized sampling theory, the DDQN algorithm is improved to obtain a new deep reinforcement learning algorithm, Prioritized Experience Replay Double Deep Q-learning (PER_DDQN). To apply the algorithm to dynamic scenarios, the MEC computation offloading problem is first formulated as a Markov decision process, and a local computing model and an edge computing model are designed. According to the changing connections and positions between the user and the base stations, the user state space and action space are redefined, and the weighted sum of delay and energy consumption is used as the evaluation metric. A binary online task offloading algorithm, Binary Double Deep Q-learning (BI_DDQN), is also proposed, and its effectiveness is demonstrated by comparison with baseline algorithms.

The study is then extended to a multi-user, single-MEC-server task offloading scenario. The research goal is to optimize the task offloading decisions and resource allocation strategy so as to minimize the total offloading cost of all users. When multiple users request offloading resources at the same time, the computing tasks of different users arrive at the edge server simultaneously. To resolve the resulting resource contention among users, a queuing mechanism is designed that allocates computing resources according to the state of the waiting queue, processes user computing tasks in order, and computes the total task processing delay accurately. On this basis, combined with the DDQN algorithm, a multi-user task offloading and resource allocation scheme, Offloading Double Deep Q-learning (OFFDDQN), is proposed. Simulation results show that the proposed algorithms have advantages in saving system cost and improving the task execution success rate.
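The weighted delay-energy evaluation metric and the binary (local vs. edge) decision described above can be sketched as follows. This is a minimal illustrative reconstruction: the DVFS energy model, the constant `kappa`, the weight `w`, and all parameter values are assumptions, not the paper's concrete models.

```python
# Sketch of a weighted delay/energy cost for comparing local vs. edge
# execution. kappa, w, and the channel/CPU parameters are illustrative
# assumptions; the paper's local and edge computing models may differ.

def local_cost(cycles, f_local, w=0.5, kappa=1e-27):
    """Local execution: delay = C / f, energy = kappa * f^2 * C (common DVFS model)."""
    delay = cycles / f_local
    energy = kappa * (f_local ** 2) * cycles
    return w * delay + (1 - w) * energy

def edge_cost(data_bits, cycles, rate, f_edge, p_tx, w=0.5):
    """Edge execution: uplink transmission delay/energy plus server compute delay."""
    t_tx = data_bits / rate          # time to upload the task input data
    t_exec = cycles / f_edge         # execution time on the MEC server
    e_tx = p_tx * t_tx               # UE energy spent transmitting
    return w * (t_tx + t_exec) + (1 - w) * e_tx

def offload_decision(data_bits, cycles, f_local, rate, f_edge, p_tx, w=0.5):
    """Binary decision: 1 = offload to the MEC server, 0 = compute locally."""
    return int(edge_cost(data_bits, cycles, rate, f_edge, p_tx, w)
               < local_cost(cycles, f_local, w))
```

With a fast server and a reasonable uplink rate, offloading wins because both the delay term and the UE energy term shrink; with a poor channel, the transmission terms dominate and local execution is preferred.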
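The prioritized-sampling idea behind PER_DDQN can be sketched as proportional prioritization: transitions are replayed with probability proportional to (|TD error| + eps)^alpha. The buffer structure, `alpha`, `eps`, and capacity below are a minimal illustrative assumption, not the paper's implementation.

```python
import random

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized replay: P(i) ∝ (|td_error_i| + eps)^alpha."""

    def __init__(self, capacity=10000, alpha=0.6, eps=1e-6):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.data, self.priorities = [], []

    def add(self, transition, td_error):
        if len(self.data) >= self.capacity:       # evict the oldest when full
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append((abs(td_error) + self.eps) ** self.alpha)

    def sample(self, batch_size):
        """Sample indices with probability proportional to stored priorities."""
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idx = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, indices, td_errors):
        """Refresh priorities after the learner recomputes TD errors."""
        for i, e in zip(indices, td_errors):
            self.priorities[i] = (abs(e) + self.eps) ** self.alpha
```

In a full PER_DDQN agent, the sampled batch would be fed to a Double DQN update (online network selects actions, target network evaluates them), and the resulting TD errors would be written back via `update_priorities`.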
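The queuing mechanism for simultaneously arriving user tasks can be sketched as sequential first-in-first-out service on the edge server: each task's total processing delay is its waiting time in the queue plus its own execution time. The single-server FIFO service model below is an assumption for illustration; the paper's resource allocation rule may be more elaborate.

```python
from collections import deque

def total_processing_delays(tasks_cycles, f_edge):
    """All tasks arrive at the edge server at the same instant and wait in a
    FIFO queue. Each task's total delay = accumulated waiting time of the
    tasks ahead of it + its own execution time (cycles / f_edge).
    Returns the per-task total processing delays, in arrival order."""
    queue = deque(tasks_cycles)
    delays, elapsed = [], 0.0
    while queue:
        elapsed += queue.popleft() / f_edge   # finish time of this task
        delays.append(elapsed)                # waiting + execution
    return delays
```

For example, tasks requiring 1, 2, and 1 Gcycles on a 1 GHz server finish after 1.0, 3.0, and 4.0 seconds respectively: each later task accumulates the service time of all tasks queued ahead of it, which is exactly the contention effect the queuing mechanism is designed to account for.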