Deep Reinforcement Learning for Resource Allocation in Energy Harvesting Communication Systems

Posted on: 2021-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Li
Full Text: PDF
GTID: 2428330605450089
Subject: Signal and Information Processing
Abstract/Summary:
In recent years, energy harvesting (EH) communication systems have been widely deployed. Such systems obtain energy from the surrounding environment, and equipping communication systems with energy harvesting capability is an effective way to address their energy supply problem. This thesis studies energy management and power allocation in energy harvesting communications. We adopt a time-slotted system in which each time slot consists of two stages: energy harvesting and data transmission. We then formulate a joint optimization problem over the continuous energy harvesting time and the transmit power to maximize the long-term throughput of a point-to-point energy harvesting link.

In general, solving this joint optimization problem requires causal and past information about the system, which is hard to obtain. We therefore model the problem as a Markov decision process (MDP). Because reinforcement learning is an effective way to solve MDPs, we propose to solve the problem with deep reinforcement learning (DRL). We first apply the Q-learning algorithm. However, as the discrete set used by Q-learning grows, learning becomes slower. To handle a continuous state space, we apply the deep Q-network (DQN) algorithm to find the optimal policy. Compared with the Q-learning scheme, the DQN scheme works on a continuous state space and achieves better simulation performance.

Although the DQN scheme handles a continuous state space, it applies only to a discrete action space. We therefore adopt the deep deterministic policy gradient (DDPG) algorithm to solve the joint optimization problem. The joint problem, however, leads to a high-dimensional action space. To address this, we present a DRL framework that combines DDPG with convex programming. We decompose the original optimization problem into two layers of subproblems: DDPG solves the upper-layer subproblem, which has a low-dimensional action space, and the lower-layer subproblem is convex and can be solved with existing convex optimization toolboxes. Numerical simulation results show that, compared with existing energy management and power allocation policies for energy harvesting communications, the proposed schemes achieve higher long-term throughput.
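To make the harvest-then-transmit slot structure and the Q-learning baseline more concrete, here is a minimal Python sketch. It is not the thesis's implementation: the battery capacity, the action grids, the channel and harvesting models, the reward form `(1 - tau) * log2(1 + gain * power)` with unit noise power, and all function names (`step`, `discretize`, `train`) are illustrative assumptions. It covers only tabular Q-learning on a discretized state, not the DQN or DDPG schemes.

```python
import math
import random

# All values below are illustrative assumptions, not taken from the thesis.
B_MAX = 4                        # battery capacity (energy units)
TAUS = [0.25, 0.5, 0.75]         # fraction of each slot spent harvesting
POWERS = [0.0, 1.0, 2.0, 4.0]    # candidate transmit powers
GAINS = [0.5, 1.0, 2.0]          # channel gains (noise power assumed to be 1)
HARVEST_RATES = [0.0, 2.0, 4.0]  # energy harvested per unit time
ACTIONS = [(tau, p) for tau in TAUS for p in POWERS]


def step(battery, gain_idx, action, rng):
    """Simulate one slot: harvest for tau, then transmit for 1 - tau."""
    tau, power = action
    harvested = tau * rng.choice(HARVEST_RATES)
    available = min(B_MAX, battery + harvested)
    tx_time = 1.0 - tau
    # The transmitter cannot spend more energy than it has stored.
    power = min(power, available / tx_time)
    reward = tx_time * math.log2(1.0 + GAINS[gain_idx] * power)
    next_battery = max(0.0, available - power * tx_time)
    next_gain_idx = rng.randrange(len(GAINS))  # i.i.d. channel (assumption)
    return next_battery, next_gain_idx, reward


def discretize(battery):
    """Map the continuous battery level to an integer table index."""
    return min(B_MAX, int(round(battery)))


def train(episodes=200, slots=50, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {}

    def q_row(state):
        return q.setdefault(state, [0.0] * len(ACTIONS))

    for _ in range(episodes):
        battery, gain_idx = 0.0, rng.randrange(len(GAINS))
        for _ in range(slots):
            row = q_row((discretize(battery), gain_idx))
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=row.__getitem__)
            battery, gain_idx, reward = step(battery, gain_idx, ACTIONS[a], rng)
            next_row = q_row((discretize(battery), gain_idx))
            # Standard Q-learning update toward the bootstrapped target.
            row[a] += alpha * (reward + gamma * max(next_row) - row[a])
    return q


if __name__ == "__main__":
    q_table = train()
    state = (discretize(2.0), 1)
    row = q_table.get(state, [0.0] * len(ACTIONS))
    best = max(range(len(ACTIONS)), key=row.__getitem__)
    print("greedy (tau, power) for state", state, "is", ACTIONS[best])
```

The table has one row per (battery level, channel gain) pair and one column per (harvesting time, power) pair, so refining any of the grids multiplies its size. That growth is the scaling issue the abstract cites as the motivation for moving to DQN and then DDPG.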
Keywords/Search Tags:EH communications, reinforcement learning, deep reinforcement learning, power allocation, energy management