Research On TD3 Vertical Handoff Algorithm For Heterogeneous Wireless Networks

Posted on: 2022-11-07 | Degree: Master | Type: Thesis
Country: China | Candidate: S Liu | Full Text: PDF
GTID: 2518306761960159 | Subject: Automation Technology
Abstract/Summary:
With the rapid development of wireless communication technology and the diversification of communication services, new wireless access networks continue to emerge. Traditional wireless networks can handle simple application scenarios, whereas the new generation of wireless networks combines cellular networks, wireless personal area networks, wireless local area networks, and wireless metropolitan area networks, and is suited to the complex application scenarios created by the development of 5G technology. In future wireless networks, traditional handoff decision algorithms can no longer meet users' requirements for service timeliness before and after a handoff when mobile users move between heterogeneous networks. How to ensure that mobile users hand off to the best network, at the right time, and with high service quality before and after the handoff has become an active research topic. Vertical handoff technology addresses these problems and is important for supporting service continuity and guaranteeing quality of service. This thesis studies vertical handoff algorithms based on deep reinforcement learning. The details are as follows:

1. In heterogeneous wireless networks, deciding when a user should hand off, and to which candidate network, is the core of vertical handoff technology. To enable the user to hand off to an appropriate candidate network, this thesis proposes a vertical handoff algorithm based on the twin delayed deep deterministic policy gradient (TD3), which formulates the vertical handoff decision as the problem of finding an optimal solution in reinforcement learning. Four QoS parameters (delay, jitter, packet loss rate, and bit error rate) are considered according to the actual requirements of different services. The entropy weight method (EW) is used to calculate the weights of the corresponding network parameters, and separate reward functions are established for real-time and non-real-time services to calculate the corresponding reward values. So that the neural networks are fully trained, an experience pool is established as a database of training samples. In the TD3 network structure, the parameters of the critic's current network are trained by gradient descent, and the parameters of the actor's current network are updated by gradient ascent. After repeated iterative updates, the vertical handoff decision is obtained according to the user's demand. The TD3 algorithm outperforms other classical algorithms in throughput and handoff delay, selects an appropriate wireless network for users with different service types, and at the same time reduces the handoff failure probability and the new call blocking probability.

2. While a terminal is moving, a fixed decision interval leads to frequent handoffs and wasted network resources. To ensure that the terminal accesses the right candidate network at an accurate handoff time, this thesis proposes an algorithm based on Soft Actor-Critic (SAC), a maximum-entropy deep reinforcement learning method. The mobility characteristics of the terminal are introduced into the construction of the vertical handoff algorithm, and the vertical handoff process is divided into two subprocesses: network selection and handoff time selection. In the handoff time selection subprocess, the SAC algorithm adaptively adjusts the decision interval, which improves the accuracy of handoff decisions, reduces the number of handoffs while the terminal is moving, and reduces network resource consumption. In the network selection subprocess, the TD3 algorithm is used. Simulation results show that the SAC-TD3 algorithm reduces the handoff failure probability and the new call blocking probability, and reduces the waste of network resources.
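The entropy weight step mentioned in part 1 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the min-max normalization for cost-type criteria, the function name `entropy_weights`, and the sample QoS values are all assumptions made for illustration. The abstract only states that entropy weights are computed for delay, jitter, packet loss rate, and bit error rate.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method for cost-type criteria (lower is better).

    matrix[i][j] is the value of criterion j for candidate network i.
    Requires at least two candidates. Returns one weight per criterion;
    the weights sum to 1.
    """
    m, n = len(matrix), len(matrix[0])
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        lo, hi = min(column), max(column)
        span = hi - lo
        # Min-max normalization, inverted so smaller raw values score higher
        # (an assumed normalization; other variants exist).
        scores = [(hi - x) / span if span > 0 else 1.0 for x in column]
        total = sum(scores)
        shares = [s / total for s in scores]
        # Shannon entropy scaled to [0, 1]; 0 * ln(0) is taken as 0.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(m)
        # Clamp to guard against tiny negative values from rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    d_sum = sum(divergences)
    if d_sum == 0:
        # No criterion discriminates between candidates: use equal weights.
        return [1.0 / n] * n
    return [d / d_sum for d in divergences]


# Hypothetical measurements: delay (ms), jitter (ms), packet loss (%), bit error rate.
candidates = [
    [20.0, 5.0, 0.5, 1e-5],
    [50.0, 10.0, 1.0, 1e-6],
    [80.0, 8.0, 2.0, 1e-4],
]
weights = entropy_weights(candidates)
print(weights)
```

A criterion whose values differ strongly across candidate networks has low entropy and so receives a larger weight, while a criterion that is nearly identical across networks contributes almost nothing. The resulting weights would then feed the service-specific reward functions, whose exact form the abstract does not give.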
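For readers unfamiliar with TD3, the sketch below shows how the critic's training target is typically formed in the standard algorithm: target policy smoothing, clipped double-Q learning, and delayed actor updates. It is a generic illustration under stated assumptions, not the network or hyperparameters used in the thesis. The toy target functions, the scalar state and action, and all default values are hypothetical, and the mapping from the actor's output to a candidate network is not described in the abstract and is not shown.

```python
import random


def td3_target(reward, next_state, done, actor_target, critic1_target,
               critic2_target, gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Compute the TD3 critic target y for one transition (scalar action)."""
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    next_action = actor_target(next_state) + noise
    next_action = max(action_low, min(action_high, next_action))
    # Clipped double-Q learning: take the smaller of the two target critics.
    q_next = min(critic1_target(next_state, next_action),
                 critic2_target(next_state, next_action))
    # No bootstrapping from terminal states.
    return reward + gamma * (0.0 if done else q_next)


def should_update_actor(step, policy_delay=2):
    """Delayed policy updates: refresh the actor every `policy_delay` critic steps."""
    return step % policy_delay == 0


# Toy stand-ins for the target networks (hypothetical, for illustration only).
actor_target = lambda s: 0.5 * s
critic1_target = lambda s, a: s + a
critic2_target = lambda s, a: s + a + 1.0

y_terminal = td3_target(1.0, 0.4, True, actor_target, critic1_target, critic2_target)
y = td3_target(1.0, 0.4, False, actor_target, critic1_target, critic2_target,
               noise_std=0.0)
print(y_terminal, y)
```

Both critics are regressed toward `y` by gradient descent on the squared error, while the actor is updated less often by gradient ascent on the first critic's value estimate, which matches the descent and ascent roles described in part 1.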
Keywords/Search Tags: Heterogeneous wireless networks, Vertical handoff, Reinforcement learning, Twin delayed deep deterministic policy gradient