
The Research On D2D Power Control Based On Deep Learning

Posted on: 2022-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Shi
Full Text: PDF
GTID: 2518306524984089
Subject: Communication and Information System
Abstract/Summary:
5G technology is transforming the communication field. Device-to-device (D2D) technology is a feasible way to improve the energy efficiency and system throughput of cellular networks. In a D2D network, D2D pairs coexist under full frequency reuse, which results in complex interference. Power control is used in D2D systems to manage this interference and optimize system capacity. Most traditional D2D power control algorithms rely on real-time channel information, and the time required for complex matrix operations and channel estimation makes them difficult to deploy in real communication systems. In recent years, advances in computer hardware and computing power have given neural networks renewed vitality. The strong policy-fitting ability of deep learning and reinforcement learning motivates us to apply them to communications. In this thesis, we introduce deep learning and deep reinforcement learning into the power allocation problem of D2D systems. The research contents are summarized as follows.

First, by studying the temporal correlation of the communication channel, we model the system as a Markov chain, so that the entire problem becomes equivalent to a Markov chain processing problem. This formulation of the D2D power allocation problem conforms to the basic assumptions of sequence processing in deep learning and of reinforcement learning. Unlike traditional power allocation algorithms, we use outdated channel information instead of real-time channel information as the algorithm input.

Second, we propose LFE-RNN, a supervised D2D power allocation method based on deep learning. We design linear filters of various sizes, termed LFEs, to extract different local interference patterns from outdated channel information. This feature extraction enables the network to learn the interference patterns around D2D links precisely and thus to provide more effective power allocation strategies. In addition, we propose to predict the real-time interference pattern from the outputs of the LFEs and then make the power decision. The prediction and decision can be modelled as a Markov decision process (MDP) and solved with a recurrent neural network. An input reduction process is also designed to reduce the input size from O(N^2) to O(1), which shortens the running time and reduces system overhead. Extensive simulation results show that the proposed algorithm achieves encouraging performance compared with the state-of-the-art power allocation algorithm.

Third, we consider a situation that can arise in the above algorithm: links with good communication quality transmit continuously, while links with poor communication quality cannot transmit at all. To ensure fairness among links, we define a proportional fair weighting method that gives each link the same probability of transmitting. Furthermore, we study how the processing of the LFE-RNN algorithm changes under proportional fairness and verify that the algorithm remains effective.

Finally, because it is supervised, the proposed LFE-RNN cannot exceed the performance of the suboptimal traditional power allocation algorithm it is trained on. We aim to improve performance further and approach the sum-rate upper bound given by exhaustive search. We therefore propose an unsupervised power allocation algorithm named MARL-PC (Multi-Agent Reinforcement Learning Power Control), based on multi-agent reinforcement learning. The classic reinforcement learning algorithm Deep Deterministic Policy Gradient (DDPG) can solve decision-making problems in a continuous action space. Building on it, we treat the links as separate agents. Unlike DDPG, we add the decision information of other links to the state information of each agent and set the reward of each network to the global sum rate. Simulations show that, with this algorithm, agents can learn to cooperate with other links to obtain the optimal sum rate.
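The temporal correlation that motivates the Markov chain model can be illustrated with a first-order Gauss-Markov fading process. This is a common textbook model and only an assumption here; the abstract does not state which channel model the thesis uses. The names `complex_gaussian`, `evolve_channel`, and the correlation coefficient `rho` are hypothetical.

```python
import math
import random

def complex_gaussian(rng):
    """Draw one sample from CN(0, 1), a unit-variance circular complex Gaussian."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def evolve_channel(h_prev, rho, rng):
    """One step of an assumed first-order Gauss-Markov channel.

    h_t = rho * h_{t-1} + sqrt(1 - rho^2) * e_t, with e_t ~ CN(0, 1).
    The next state depends only on the previous one, which is the
    Markov property the abstract relies on.
    """
    scale = math.sqrt(1.0 - rho * rho)
    return [rho * h + scale * complex_gaussian(rng) for h in h_prev]

rng = random.Random(0)
h_old = [complex_gaussian(rng) for _ in range(4)]  # outdated coefficients
h_now = evolve_channel(h_old, rho=0.9, rng=rng)    # current coefficients
```

In this sketch, a power allocation algorithm fed with outdated channel information would observe `h_old` while `h_now` actually determines the interference. The closer `rho` is to 1, the more the outdated observation says about the current channel.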
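The sum-rate objective and the exhaustive-search upper bound mentioned in the abstract can be sketched as follows. The sketch assumes the usual Shannon-rate formulation with a shared band and a small set of discrete power levels; the function names, the gain matrix, and the noise power are illustrative values, not taken from the thesis.

```python
import itertools
import math

def sum_rate(gains, powers, noise=1.0, weights=None):
    """Weighted sum rate (bits/s/Hz) of N D2D pairs sharing one band.

    gains[i][j] is the channel power gain from transmitter j to receiver i,
    so gains[i][i] is the desired link and the other entries are interference.
    """
    n = len(powers)
    if weights is None:
        weights = [1.0] * n
    total = 0.0
    for i in range(n):
        signal = gains[i][i] * powers[i]
        interference = sum(gains[i][j] * powers[j] for j in range(n) if j != i)
        sinr = signal / (noise + interference)
        total += weights[i] * math.log2(1.0 + sinr)
    return total

def exhaustive_search(gains, levels, noise=1.0):
    """Try every combination of discrete power levels and keep the best one."""
    n = len(gains)
    best = max(itertools.product(levels, repeat=n),
               key=lambda p: sum_rate(gains, p, noise))
    return list(best), sum_rate(gains, best, noise)

# Hypothetical two-pair network with strong cross interference.
gains = [[1.0, 0.9],
         [0.9, 0.5]]
powers, rate = exhaustive_search(gains, levels=[0.0, 1.0], noise=0.1)
```

With these example values, the search switches the weaker link off, because the interference it causes costs more rate than the link contributes. The number of candidates grows exponentially with the number of links, which is why exhaustive search serves only as a benchmark.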
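The abstract does not define its proportional fair weighting, so the following is only a sketch of the standard proportional-fair rule, in which each link is weighted by the inverse of its exponentially averaged rate. The function name `update_pf_weights` and the parameters `beta` and `eps` are hypothetical.

```python
def update_pf_weights(avg_rates, inst_rates, beta=0.1, eps=1e-6):
    """Update averaged rates and return proportional-fair weights.

    avg_rates:  running average rate of each link
    inst_rates: rate each link achieved in the current slot
    beta:       smoothing factor of the exponential moving average (assumed)
    eps:        floor that avoids division by zero for starved links
    """
    new_avg = [(1.0 - beta) * a + beta * r
               for a, r in zip(avg_rates, inst_rates)]
    weights = [1.0 / max(a, eps) for a in new_avg]
    return new_avg, weights

# Link 0 transmitted in this slot and link 1 was silent.
avg, weights = update_pf_weights([1.0, 1.0], [2.0, 0.0], beta=0.5)
```

A link that has been silent accumulates a low average rate and therefore a large weight, so a weighted sum-rate objective will favor it in later slots. This is the mechanism by which links with poor channels still get a chance to transmit.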
Keywords/Search Tags: Deep learning, D2D networks, power allocation, recurrent neural network, outdated interference information, proportional fairness, reinforcement learning