In recent years, with the spread of wireless communication technology and wireless terminal devices, demand for higher communication speed has kept growing. Cooperative communication can improve the spectrum utilization of wireless communication systems without adding devices while ensuring reliable transmission, and it has become a research hotspot in the field of Wireless Sensor Networks (WSNs). In a cooperative WSN communication system, node power allocation and relay selection are two core issues. Traditional relay selection algorithms often use instantaneous channel state information as the selection criterion and do not consider how the current system state and the working state of the relay nodes affect the quality of cooperative communication, so they cannot be applied to dynamic WSNs. Unreasonable allocation of node power not only wastes system power but also reduces system capacity and reliability, resulting in poor system performance. To address these issues, the main contributions of this dissertation are as follows:

(1) This dissertation proposes two relay selection schemes based on Deep Reinforcement Learning (DRL) for the relay selection problem in cooperative wireless sensor networks: one uses the value-based Double Deep Q-Network (DDQN) algorithm, and the other uses the policy-gradient-based Proximal Policy Optimization (PPO) algorithm. The cooperative communication process in a WSN is first viewed as a state transition process, and the relay selection problem is modeled as a Markov decision process. Neural networks are then trained based on outage probability and mutual information, and the DDQN and PPO algorithms are applied to adaptively select the best relay from multiple candidate relays when the instantaneous channel state information is unknown. Random relay selection, the Q-learning (QL) algorithm, the DDQN algorithm, and the PPO algorithm are compared under the same conditions. The results show that the two proposed DRL-based relay selection schemes converge to the optimal relay selection policy in less time and in significantly fewer iterations. This is because, with the DDQN and PPO algorithms, the source node only needs to select the best relay for cooperative transmission according to the learned policy, which saves time and reduces computational complexity and cost. The schemes also achieve higher system capacity, lower energy consumption, and lower outage probability, which makes them promising for energy-constrained WSNs.

(2) For cooperative wireless sensor networks with limited total system power, this dissertation investigates the joint problem of node power allocation and relay selection. Specifically, convex optimization methods are used to determine the optimal transmission power of the source node and the selected relay node, with the aim of maximizing the end-to-end signal-to-noise ratio of the system. The cooperative relay process is then defined as a Markov decision process, and a relay selection scheme based on the PPO algorithm is proposed to adaptively select the optimal relay node. The results demonstrate that, under the same conditions, the proposed power optimization scheme outperforms equal power allocation. Furthermore, compared with a QL-based relay selection scheme, the PPO-based relay selection scheme proposed in this dissertation improves communication efficiency and reduces the number of iterations required for convergence, resulting in higher system capacity and lower outage probability.
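To make the comparison between optimized and equal power allocation in contribution (2) concrete, the following is a minimal illustrative sketch, not the formulation used in the dissertation. It assumes a two-hop decode-and-forward link with no direct path, so that the end-to-end SNR is limited by the weaker hop, and a total power budget shared by the source and one relay. The function names, the channel gains, and the noise power are all hypothetical. Under these assumptions the optimum balances the two hops and has a simple closed form.

```python
def optimal_power_split(total_power, gain_sr, gain_rd):
    """Split total_power between source and relay.

    Assumption (not from the source text): the end-to-end SNR is
    min(p_s * gain_sr, p_r * gain_rd) / noise, as in a two-hop
    decode-and-forward link. That quantity is maximized when both
    hops have equal SNR, i.e. p_s * gain_sr == p_r * gain_rd.
    """
    p_source = total_power * gain_rd / (gain_sr + gain_rd)
    p_relay = total_power - p_source
    return p_source, p_relay


def end_to_end_snr(p_source, p_relay, gain_sr, gain_rd, noise_power=1.0):
    """End-to-end SNR of the assumed two-hop link (weaker hop dominates)."""
    return min(p_source * gain_sr, p_relay * gain_rd) / noise_power


# Hypothetical channel power gains: source->relay and relay->destination.
total_power, gain_sr, gain_rd = 2.0, 1.0, 3.0

p_s, p_r = optimal_power_split(total_power, gain_sr, gain_rd)
snr_optimal = end_to_end_snr(p_s, p_r, gain_sr, gain_rd)
snr_equal = end_to_end_snr(total_power / 2, total_power / 2, gain_sr, gain_rd)

print(f"optimized split: p_s={p_s:.2f}, p_r={p_r:.2f}, SNR={snr_optimal:.2f}")
print(f"equal split:     SNR={snr_equal:.2f}")
```

In this toy setting the optimized split gives more power to the hop with the weaker channel, so its end-to-end SNR is never below that of the equal split. A relay selection policy, whether learned by QL, DDQN, or PPO, would then choose among candidate relays whose gains differ.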