In recent years, with the development of smart mobile devices, more and more applications run on mobile devices. The limited resources and processing power of mobile devices cause some applications to fail to meet users' quality-of-service requirements. One way to overcome this problem is to offload computationally intensive tasks from mobile devices to cloud servers deployed at the edge of the network, an approach known as mobile edge computing (MEC). However, the computing resources on a mobile edge cloud server are limited, and different offloading strategies and resource allocation methods can significantly affect users' quality of service. Therefore, how to effectively formulate a resource allocation strategy is a focus of research in the MEC field. Based on the multi-cell, multi-user scenario in an MEC system, this thesis conducts an in-depth study of offloading and resource allocation. Two algorithms are proposed: joint offloading and resource allocation based on multi-agent reinforcement learning with a variable learning rate (JORA-MV), and collaborative resource allocation based on deep reinforcement learning (CRA-DRL). The main work of this thesis is as follows:

(1) A joint offloading and resource allocation algorithm based on multi-agent reinforcement learning with a variable learning rate is proposed. The algorithm jointly considers the users' offloading decisions and the allocation of wireless and computing resources. The costs of local computation and of offloading are computed and compared as the basis for the offloading decision. Wireless resources are allocated in the form of subcarriers, and the users' utility function is defined to include the data transfer rate and overhead. Through reasonable allocation of computing resources and subcarriers, the probability that a user's task is offloaded is increased and the sum of the utility functions is maximized. A multi-agent reinforcement learning model with a variable learning rate is established: each user, as an agent, learns its strategy by interacting with the environment to obtain rewards. The learning rate is adapted by comparing the agent's current and expected rewards. When the agent's current reward is less than its expected reward, the learning rate is increased so that the agent adapts quickly to changes in the other agents' strategies; when the current reward exceeds the expected reward, the learning rate is reduced to give the other agents time to adapt to the change of strategy. Simulation results show that the proposed JORA-MV algorithm converges well and, compared with other algorithms, achieves lower delay and energy consumption and a higher offloading rate.

(2) A collaborative resource allocation algorithm based on deep reinforcement learning is proposed. The algorithm not only considers resource allocation for mobile terminals, but also improves the effective utilization of resources through collaboration among multi-cell service providers (SPs) and the sharing of computing resources. The algorithm consists of two modules: resource allocation and resource sharing. The resource allocation module covers the users' base station selection and the SPs' subcarrier allocation. First, the users' base station connection matrix is obtained according to the maximum signal-to-noise-ratio principle; then the SPs' subcarrier connection matrix is obtained using deep reinforcement learning to maximize the total system capacity. The resource sharing module first calculates each SP's average delay; SPs below the delay threshold borrow resources from other SPs. Deep reinforcement learning is then used to minimize the sum of the system delay and the cost of resource sharing. Simulation results show that the CRA-DRL algorithm effectively improves the utilization of computing resources while satisfying all users' delay requirements.
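The variable-learning-rate rule in (1) can be sketched as follows. This is a minimal illustrative sketch, not the thesis's actual implementation: the class name, the two learning-rate constants, and the use of a per-action running average as the "expected reward" baseline are all assumptions made here for concreteness.

```python
class VariableRateAgent:
    """Sketch of a single agent that learns fast when its current reward
    falls below its expected reward, and slowly otherwise (illustrative
    names and constants)."""

    def __init__(self, n_actions, lr_fast=0.4, lr_slow=0.1):
        self.q = [0.0] * n_actions        # action-value estimates
        self.avg_r = [0.0] * n_actions    # running average reward = "expected reward"
        self.counts = [0] * n_actions
        self.lr_fast, self.lr_slow = lr_fast, lr_slow

    def update(self, action, reward):
        # Update the expected-reward baseline (incremental running average).
        self.counts[action] += 1
        self.avg_r[action] += (reward - self.avg_r[action]) / self.counts[action]
        # Below expectation -> increase the learning rate to track the other
        # agents' strategy changes; above -> slow down so they can adapt.
        lr = self.lr_fast if reward < self.avg_r[action] else self.lr_slow
        self.q[action] += lr * (reward - self.q[action])
        return lr
```

The "win slowly, lose fast" asymmetry is what stabilizes learning when many agents adapt simultaneously: an agent doing better than expected changes its strategy gently, which keeps the environment quasi-stationary for the others.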
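The base station selection step in (2) can be sketched as follows: each user connects to the base station offering the highest SNR, which yields a 0/1 connection matrix. The function name and the example SNR values are illustrative assumptions; in the thesis the SNR would come from the channel model.

```python
def max_snr_connection(snr):
    """snr[u][b]: SNR of user u to base station b (illustrative sketch).
    Returns a 0/1 connection matrix of the same shape, with a single 1
    per row at the user's best base station."""
    conn = [[0] * len(row) for row in snr]
    for u, row in enumerate(snr):
        best = max(range(len(row)), key=lambda b: row[b])  # argmax over base stations
        conn[u][best] = 1
    return conn

# Example: user 0 hears BS 1 best, user 1 hears BS 0 best.
# max_snr_connection([[3.2, 5.1], [4.0, 1.7]]) -> [[0, 1], [1, 0]]
```

This connection matrix then fixes which SP serves each user, and the subsequent subcarrier allocation is solved by the deep reinforcement learning module.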