
Research On Joint Resource Optimization Method Of Underwater Acoustic Cooperative Communication Network Based On Reinforcement Learning

Posted on: 2023-12-25
Degree: Master
Type: Thesis
Country: China
Candidate: L Li
Full Text: PDF
GTID: 2558306848466614
Subject: Engineering
Abstract/Summary:
With the gradual depletion of terrestrial resources, attention has increasingly turned to the ocean, which is rich in mineral and biological resources. Underwater acoustic communication is a key enabling technology for ocean exploration, development, and utilization, and underwater acoustic communication networks provide data acquisition and transmission services for a variety of underwater equipment. However, such networks suffer from long propagation delay, absorption attenuation, limited energy, and the difficulty of obtaining perfect channel state information in a complex and changeable underwater environment. Developing underwater acoustic cooperative communication networks has become a trend for ensuring the quality of information transmission. It is therefore pivotal to allocate the limited resources of these cooperative networks rationally under different communication environments and to develop efficient, intelligent resource allocation algorithms that improve network capacity and quality of service. This is the main content of this study.

Firstly, for a time-varying communication environment with static energy, a joint relay selection and power allocation algorithm based on Q-learning is proposed to maximize cumulative capacity. Combining the characteristics of the algorithm with those of the underwater environment, a state based on structure-action pairs is proposed to describe environmental information effectively and reliably. This state not only provides a decision basis for the agent but also avoids the dimension explosion caused by using channel state information directly as the state. Simulation results demonstrate the effectiveness of the reinforcement learning algorithm in solving the joint resource allocation problem and verify that the proposed algorithm maximizes the cumulative capacity of the network.

Secondly, for a time-varying and unknown environment with dynamic energy harvesting, a DQN-based joint resource optimization algorithm is proposed. An effective state representation is constructed to better reveal the interaction between the learner and the environment and thus provide effective learning information. Considering the dynamic and unpredictable energy harvesting environment, a reward function is proposed that accounts for the balanced utilization of harvested energy; it can guide nodes to adjust their power strategies adaptively to balance instantaneous capacity against long-term QoS. Simulation results verify the superior performance of the proposed algorithm in improving cumulative network capacity and reducing outages.

Finally, to address the inability of the DQN-based algorithm to perform continuous power allocation, a stratification-based deep reinforcement learning (DRL) framework is proposed to achieve efficient joint decision-making over discrete relay selection and continuous power allocation. Based on a divide-and-conquer idea, the framework intelligently tracks complex and highly dynamic state information to solve the coupled optimization problem and realize continuous decision making. As a result, the proposed algorithm explores a larger solution space and provides richer, more flexible power strategies, thereby improving system performance. By introducing a reconstructed, finite-dimensional state built from outdated information, the algorithm reduces computational load and provides more effective learning information. To realize jointly optimal decisions, a "same-data-source training, asynchronous execution" mechanism is proposed, in which both decision levels are trained on the same set of data so as to output a matched joint optimization strategy. With the framework supporting continuous decision making, the aforementioned reward mechanism induces the decision nodes to adjust their power strategies more efficiently and accurately to combat the highly dynamic environment. Simulation results verify that the proposed algorithm improves the learning efficiency and the quality of service of the network.
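To make the first contribution concrete, the joint relay selection and power allocation idea can be sketched as tabular Q-learning over a combined (relay, power-level) action set, with the previous action index standing in for the structure-action-pair state. Everything below is an illustrative toy (the capacity model, power levels, and hyperparameters are assumptions, not the thesis's simulation setup):

```python
import math
import random

# Toy setting: 3 candidate relays, 3 discrete power levels (stand-in values).
N_RELAYS = 3
POWER_LEVELS = [0.5, 1.0, 2.0]  # transmit power choices (W), illustrative
ACTIONS = [(r, p) for r in range(N_RELAYS) for p in range(len(POWER_LEVELS))]

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration

def capacity(relay, p_idx, rng):
    """Toy link capacity: log2(1 + SNR) with a random per-relay gain."""
    gain = rng.uniform(0.2, 1.0) * (relay + 1) / N_RELAYS
    return math.log2(1.0 + gain * POWER_LEVELS[p_idx])

def train(episodes=2000, steps=20, seed=0):
    rng = random.Random(seed)
    # State = index of the previous (relay, power) action, echoing the
    # structure-action-pair idea: it avoids using raw CSI as the state,
    # so the Q-table stays small (|A| x |A|) instead of exploding.
    q = [[0.0] * len(ACTIONS) for _ in range(len(ACTIONS))]
    state = 0
    for _ in range(episodes):
        for _ in range(steps):
            if rng.random() < EPSILON:          # epsilon-greedy exploration
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            relay, p_idx = ACTIONS[a]
            r = capacity(relay, p_idx, rng)     # reward = achieved capacity
            next_state = a                      # next state is the action taken
            q[state][a] += ALPHA * (r + GAMMA * max(q[next_state]) - q[state][a])
            state = next_state
    return q

q = train()
best = max(range(len(ACTIONS)), key=lambda i: q[0][i])
print("greedy action from state 0:", ACTIONS[best])
```

The greedy policy read off the learned table jointly fixes a relay and a power level in one step, which is the shape of the decision the abstract describes.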
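The dynamic-balance reward of the second contribution can be read, in spirit, as trading instantaneous capacity against a QoS shortfall and against spending more power than the stored-plus-harvested energy budget allows. The terms and the weight `lam` below are purely illustrative assumptions, not the thesis's actual reward function:

```python
def reward(capacity_t, battery, harvested, p_used, qos_min, lam=0.5):
    """Illustrative shaping: reward capacity, penalize a QoS shortfall and
    drawing more power than the battery plus fresh harvest can supply."""
    qos_penalty = max(0.0, qos_min - capacity_t)                # long-term QoS term
    energy_penalty = max(0.0, p_used - (battery + harvested))   # energy-balance term
    return capacity_t - lam * qos_penalty - lam * energy_penalty

# High capacity within the energy budget: no penalties apply.
print(reward(2.0, battery=1.0, harvested=0.5, p_used=1.0, qos_min=1.0))
```

A node maximizing such a reward is nudged to spend energy roughly as fast as it arrives, which is one way a reward can "guide nodes to adjust their power strategies adaptively" as the abstract puts it.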
Keywords/Search Tags: Underwater acoustic networks, Resource allocation, Energy harvesting, Deep reinforcement learning, Stratification learning