| Industry experts predict that total data capacity demand will increase by 1000% over the next 10 years. The fifth-generation mobile network (5G) requires significant improvements in data rate, connectivity, quality of service, and energy efficiency. The Ultra Dense Network (UDN) greatly improves total system capacity through the dense deployment of micro base stations (BSs), making it one of the hot topics in 5G research. However, while the dense deployment of BSs in a UDN increases system capacity, it also raises new challenges for system configuration. First, the growing number of BSs makes the channel state information between users and BSs more complicated, which increases the difficulty of network control. In addition, a large number of BSs brings huge energy consumption and greenhouse gas emissions. Therefore, from the perspective of green communication, a new resource allocation scheme must be introduced into the UDN to reduce the negative impacts of densely deployed BSs. As an emerging technology, reinforcement learning has great potential for self-exploration, self-decision, and self-optimization, and is regarded as a powerful tool for solving dynamic resource allocation problems. With the aim of building an energy-efficient UDN that provides ubiquitous high-bit-rate services, introducing reinforcement learning into wireless communication is of great research value. On this basis, the main innovations and contributions of this paper are as follows: 1) A system model of a self-powered UDN is proposed. Energy harvesting is introduced into the UDN, and a self-powered UDN scheme is proposed. In addition, several key parameters of the self-powered UDN are simulated and analyzed, which provides a theoretical basis for setting the system parameters. 2) For the single-cell scenario, a power allocation scheme based on the Deep Q-Network (DQN) is proposed. Considering the complexity of the real-time power allocation problem in a UDN, this paper proposes a feasible power allocation scheme. Simulations show that the scheme can dynamically adjust the transmit power of the BSs according to their loads and battery power, thereby optimizing energy efficiency. 3) For the multi-cell scenario, in order to improve data efficiency and system performance, this paper proposes a distributed power allocation scheme based on Proximal Policy Optimization. Simulations show that the scheme can effectively use the information of multiple cells to achieve joint power control across cells. Moreover, both convergence speed and system performance are improved compared with the DQN scheme proposed above. |