With the rapid development of wireless communication technology and the growing popularity of smart devices, the performance demands on wireless communication networks have been constantly increasing. In recent years, the optimization of wireless communication resources has become a focus of communication research. Reasonable allocation and optimization of wireless communication resources make efficient use of limited communication resources and thereby improve the performance of wireless communication systems. However, the optimization of wireless communication networks often leads to non-convex optimization problems, which are difficult to solve. Since deep reinforcement learning (DRL) is powerful at solving non-convex optimization problems, researchers have increasingly applied it in the field of wireless communication. This thesis focuses on the Non-Orthogonal Multiple Access (NOMA) cellular network with multiple small cells, the NOMA communication system assisted by an Intelligent Reflecting Surface (IRS), and the multiple-IRS-assisted millimeter-wave Multiple-Input Multiple-Output (MIMO) system, and proposes three DRL-based resource optimization algorithms. The main research results are as follows:

(1) For the power allocation problem in the multi-cell NOMA cellular network, this thesis proposes a power allocation algorithm based on the Deep Deterministic Policy Gradient (DDPG). First, we construct a power allocation optimization model for the multi-cell NOMA cellular network. Then, the channel state information and the transmit power of the previous time slot are set as the state of the DDPG agent, the transmit power is set as the action, and the average sum rate is set as the reward. Finally, under the maximum transmit power constraint, the DDPG algorithm is used to optimize the power allocation of the system. Simulation results show that, compared with traditional benchmark algorithms, the proposed power allocation algorithm optimizes the sum
rate more effectively and achieves faster and more stable convergence.

(2) For the energy efficiency optimization problem of the IRS-assisted NOMA downlink communication system, this thesis proposes a DRL-based optimization algorithm. First, we establish a model of the IRS-assisted NOMA downlink communication system. Then, the channel state information, transmit power, receive power, beamforming matrix, and IRS phase shift matrix of the communication system are set as the state of the DRL agent; the beamforming matrix and IRS phase shift matrix are set as the action; and the energy efficiency of the communication system is set as the reward. Finally, under the maximum transmit power and phase shift constraints, the DRL algorithm jointly optimizes the beamforming matrix and the IRS phase shift matrix to maximize the system energy efficiency. Simulation results show that, compared with traditional convex optimization algorithms, the proposed algorithm achieves better energy efficiency.

(3) To address the communication blind spot problem in the millimeter-wave (mmWave) MIMO system, this thesis proposes a power allocation algorithm for the multiple-IRS-assisted mmWave MIMO communication system. First, we construct a model of the multiple-IRS-assisted mmWave MIMO communication system. Then, the channel state information, beamforming matrix, and IRS phase shift matrix of the communication system are set as the state of the DRL agent; the beamforming matrix and IRS phase shift matrix are set as the action; and the sum rate of the communication system is set as the reward. Finally, under the maximum transmit power, maximum phase shift, and minimum signal-to-interference-plus-noise ratio constraints, DRL is used to jointly optimize the beamforming matrix and the IRS phase shift matrix to maximize the system sum rate. Simulation results show that the proposed algorithm can
effectively deal with the communication blind spot problem and improve the performance of the mmWave MIMO system.
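As a concrete illustration of the MDP formulation in result (1), the toy environment below wires up the state (current channel state information plus the previous slot's transmit powers), the action (transmit powers clipped to the maximum-power constraint), and the reward (sum rate under inter-cell interference). This is a minimal sketch under assumed parameters (cell count, Rayleigh fading, noise power); it is not the thesis's actual simulation setup.

```python
import numpy as np

# Illustrative sketch (not the thesis implementation): a toy multi-cell
# environment exposing the DDPG interface described in result (1).
# state  = [channel gains of current slot, transmit powers of previous slot]
# action = transmit powers, clipped to the maximum-power constraint
# reward = sum rate (bits/s/Hz) under inter-cell interference

class PowerAllocationEnv:
    def __init__(self, n_cells=3, p_max=1.0, noise=1e-3, seed=0):
        self.n_cells = n_cells
        self.p_max = p_max          # maximum transmit power constraint
        self.noise = noise          # receiver noise power
        self.rng = np.random.default_rng(seed)
        self.prev_power = np.full(n_cells, p_max / 2)
        self.gains = self._sample_channels()

    def _sample_channels(self):
        # |h|^2 for each (transmitter, user) pair; Rayleigh fading magnitudes
        g = self.rng.rayleigh(scale=1.0, size=(self.n_cells, self.n_cells))
        return g ** 2

    def state(self):
        # DDPG state: current CSI plus the previous slot's transmit powers
        return np.concatenate([self.gains.ravel(), self.prev_power])

    def step(self, action):
        power = np.clip(action, 0.0, self.p_max)    # enforce power constraint
        signal = np.diag(self.gains) * power        # desired-link power
        interference = self.gains @ power - signal  # cross-cell interference
        sinr = signal / (interference + self.noise)
        reward = float(np.sum(np.log2(1.0 + sinr))) # sum rate as the reward
        self.prev_power = power
        self.gains = self._sample_channels()        # next slot's channels
        return self.state(), reward
```

A DDPG agent would observe `env.state()`, output a power vector, and receive the sum-rate reward from `env.step(...)`; the clipping step is where the maximum-power constraint of result (1) is enforced.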
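Results (2) and (3) both require the agent's action to respect two constraints: the IRS phase shift coefficients must stay unit-modulus, and the beamforming matrix must satisfy the maximum transmit power budget. One common way to handle this, sketched below under assumed dimensions and names (the function `decode_action` and its layout of the flat action vector are illustrative, not taken from the thesis), is to map the network's unconstrained outputs onto the feasible set.

```python
import numpy as np

# Illustrative sketch (assumed, not from the thesis): mapping a raw DRL
# action vector onto the constrained variables of results (2) and (3) --
# a unit-modulus IRS phase shift matrix Theta and a transmit-power-limited
# beamforming matrix W.

def decode_action(raw, n_tx, n_users, n_irs, p_max):
    """Split one flat action vector into (beamformer W, phase matrix Theta)."""
    n_w = 2 * n_tx * n_users                      # real + imag parts of W
    w_part, theta_part = raw[:n_w], raw[n_w:n_w + n_irs]

    # Beamforming matrix: assemble complex entries, then scale so that the
    # total transmit power ||W||_F^2 never exceeds p_max.
    w = w_part[:n_tx * n_users] + 1j * w_part[n_tx * n_users:]
    W = w.reshape(n_tx, n_users)
    power = np.linalg.norm(W, "fro") ** 2
    if power > p_max:
        W = W * np.sqrt(p_max / power)

    # IRS phase shifts: squash raw outputs into (0, 2*pi) via a sigmoid,
    # then place the unit-modulus coefficients e^{j*theta} on a diagonal.
    theta = 2 * np.pi / (1.0 + np.exp(-theta_part))
    Theta = np.diag(np.exp(1j * theta))
    return W, Theta
```

After this mapping, every action the agent emits is feasible by construction, so the power and phase shift constraints never have to be handled inside the reward.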