Reinforcement Learning-Based Relay Selection Strategy For Internet Of Things Communication

Posted on: 2021-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: K Yang
Full Text: PDF
GTID: 2428330614463804
Subject: Electronic and communication engineering
Abstract/Summary:
As one of the three major application scenarios of the fifth-generation mobile communication system, the Internet of Things (IoT) has penetrated all aspects of people's daily lives. Because IoT sensor nodes are simple, numerous, and power-limited, they are not suited to long-distance transmission, and on their own they cannot form a fully covered IoT communication system that achieves interconnection. This thesis therefore applies cooperative relay networks to the IoT communication system and proposes using reinforcement learning to select one or more nodes with good channel conditions from the many candidate relay nodes to participate in cooperative transmission. While guaranteeing system performance, this saves system overhead, prolongs the service life of relay equipment, and avoids the power waste and synchronization cost of having many relay nodes transmit simultaneously. It thereby meets the requirements of low power consumption, highly reliable transmission, and a larger effective communication coverage area. Compared with traditional relay selection techniques, the complexity of the reinforcement learning algorithm does not grow with the number of relay nodes, and for different communication standards it is often only necessary to design a different reward, without extensive theoretical derivation. The design is simple and general.

The main work of this thesis is as follows:

1. For the cooperative relay transmission scenario in the IoT, two Q-learning-based strategies, one for single-relay selection and one for multi-relay selection, are proposed for the amplify-and-forward and decode-and-forward protocols respectively. First, the elements of reinforcement learning (action set, state set, state transition function, action selection policy, and so on) are defined, and the received signal-to-noise ratio (SNR) at the destination is used as the immediate reward. The Q-learning algorithm, a temporal-difference method, is then used to find the best relay selection strategy. Simulation results show that, for optimal single-relay selection, the system throughput obtained by the Q-learning algorithm is significantly better than that of the random relay selection algorithm, and the advantage becomes more pronounced as the number of relay nodes increases. For multi-relay selection, under the condition that the received SNR at the destination exceeds 10 dB, the Q-learning algorithm reduces the number of required relay nodes by more than two on average compared with the random relay selection algorithm.

2. For the cooperative transmission scenario in the IoT, a scheme combining distributed beamforming and relay selection is proposed, and an optimization problem is formulated that maximizes the SNR at the receiving end. First, the Q-learning algorithm is used to select multiple relays for cooperative transmission. To reduce the computational complexity of the Q-learning algorithm, this thesis uses the quasi-Newton method to quickly solve for the upper bound of the maximum SNR at the receiving end and uses this value as the immediate reward, which guides the Q-learning algorithm toward the best selection strategy. Second, once the selected relays are determined, the problem becomes a single-objective non-convex optimization problem. It is converted into a convex optimization problem by semidefinite relaxation and then solved by combining the bisection method with the interior-point method to obtain the optimal weights and the maximum SNR at the receiving end. Simulation results show that the Q-learning-based multi-relay selection algorithm achieves close to the best performance and is significantly better than the random multi-relay selection scheme.
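The single-relay selection loop described in item 1 can be illustrated with a minimal sketch. It is not the thesis's implementation; the abstract does not give the state definition, channel model, or hyperparameters, so the following are all assumptions made for illustration: the state is the index of the most recently selected relay, the action is the relay to select next, each hop has an exponentially distributed instantaneous SNR (Rayleigh fading), the reward is the standard two-hop amplify-and-forward end-to-end SNR g1*g2/(g1+g2+1), and the function names, mean SNR values, and learning parameters are hypothetical.

```python
import random


def af_snr(g_sr, g_rd):
    """End-to-end SNR of a two-hop amplify-and-forward link (linear scale)."""
    return g_sr * g_rd / (g_sr + g_rd + 1.0)


def sample_reward(relay, mean_snr, rng):
    """Immediate reward: destination SNR when `relay` forwards the signal.

    Assumption: both hops fade independently with exponentially distributed
    SNR whose mean is mean_snr[relay].
    """
    g_sr = rng.expovariate(1.0 / mean_snr[relay])  # source -> relay hop
    g_rd = rng.expovariate(1.0 / mean_snr[relay])  # relay -> destination hop
    return af_snr(g_sr, g_rd)


def train_q_table(mean_snr, steps=20000, alpha=0.05, gamma=0.5,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy action selection policy.

    Assumption: state = last selected relay, action = next relay,
    and the next state is simply the chosen action.
    """
    rng = random.Random(seed)
    n = len(mean_snr)
    q = [[0.0] * n for _ in range(n)]
    state = rng.randrange(n)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n)                          # explore
        else:
            action = max(range(n), key=lambda a: q[state][a])  # exploit
        reward = sample_reward(action, mean_snr, rng)
        next_state = action
        # Temporal-difference update toward reward + discounted best value.
        td_target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
        state = next_state
    return q


def average_snr(mean_snr, policy, trials=5000, seed=1):
    """Average destination SNR obtained by following `policy(state, rng)`."""
    rng = random.Random(seed)
    state, total = 0, 0.0
    for _ in range(trials):
        relay = policy(state, rng)
        total += sample_reward(relay, mean_snr, rng)
        state = relay
    return total / trials


if __name__ == "__main__":
    mean_snr = [1.0, 3.0, 30.0, 2.0]  # hypothetical per-relay mean hop SNRs
    q_table = train_q_table(mean_snr)
    n = len(mean_snr)
    learned = average_snr(
        mean_snr, lambda s, rng: max(range(n), key=lambda a: q_table[s][a]))
    baseline = average_snr(mean_snr, lambda s, rng: rng.randrange(n))
    print("Q-learning policy average SNR:", learned)
    print("Random selection average SNR:", baseline)
```

Running the script compares the greedy policy read from the learned Q-table with uniform random relay selection, which is the baseline the abstract names. Changing the protocol or the performance target would, under these assumptions, only require replacing `sample_reward`, which mirrors the abstract's point that only the reward needs to be redesigned.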
Keywords/Search Tags:IoT, reinforcement learning, relay selection, distributed beamforming