
Optimization Of Deep Reinforcement Learning Algorithm For Wireless Video Transmission Based On Large Deviation

Posted on: 2024-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: S A Song
Full Text: PDF
GTID: 2568307058980859
Subject: Applied Statistics
Abstract/Summary:
With continued economic and social development and ongoing breakthroughs in communication networks, residents' quality of life has improved greatly. However, the resulting heavy energy consumption has long been a problem, and energy capture technology emerged in response. In recent years, energy capture has become an effective way to address the energy problem of wireless networks: it collects renewable energy from the natural environment and uses it to power electrical equipment. In addition, advances in wireless energy transmission have made remote wireless energy supply possible. However, because the state of a wireless network is disordered and complex, it is difficult for this technology to guarantee the quality of network service. It is therefore of great significance to improve the utilization of captured energy and the efficiency of energy transmission while overcoming the randomness of the wireless network state. Building on energy capture technology, this thesis introduces reinforcement learning and deep learning methods and applies the large deviation principle from limit theory to study in depth the Markov decision problem that arises when energy capture technology is applied to scalable video transmission. The specific research work is as follows:
(1) Following previous research results, the thesis first models the video transmission process as a Markov decision process and implements wireless transmission of scalable video using deep reinforcement learning. It then introduces the dropout method to avoid the exploration-exploitation dilemma, yielding an adaptive deep reinforcement learning optimization algorithm.
(2) By introducing the large deviation principle from limit theory, the thesis distinguishes between the possible causes of continuous negative rewards during training. Continuous negative rewards are caused either by an unreasonable configuration of layers and nodes in the deep network or by training being stopped before it is complete. Previous studies usually assume that incomplete training is the cause and wait for the model to converge, which makes it possible to misjudge the cause of continuous negative rewards. In this thesis, the deep reinforcement learning algorithm for wireless video transmission based on the large deviation principle applies Cramér's theorem to find the optimal bound on the number of occurrences of negative rewards, so as to minimize the probability of misjudgment.
(3) The scalable video wireless transmission environment is simulated through the game 2048, and the performance of the adaptive deep reinforcement learning algorithm in the game is evaluated. Finally, compared with traditional reinforcement learning methods for wireless video transmission, the adaptive deep reinforcement learning algorithm described in this thesis converges faster and obtains higher rewards, which demonstrates the superiority of this method for optimizing scalable video wireless transmission.
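The abstract does not give the thesis's actual construction, so the following is only a minimal sketch of how a Cramér-type bound can turn "too many negative rewards" into a concrete count threshold. It assumes (this is not stated in the source) that negative-reward indicators over a window of n steps are i.i.d. Bernoulli(p), where p is a baseline negative-reward probability, and that delta is a tolerated misjudgment probability. The function names `bernoulli_rate` and `negative_reward_threshold` are hypothetical.

```python
import math


def bernoulli_rate(a: float, p: float) -> float:
    """Cramér rate function I(a) for i.i.d. Bernoulli(p) indicators.

    For Bernoulli variables this equals the KL divergence between
    Bernoulli(a) and Bernoulli(p). Assumption: 0 < p < 1.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0.0 <= a <= 1.0:
        return math.inf

    def term(x: float, y: float) -> float:
        # Convention 0 * log(0 / y) = 0.
        return 0.0 if x == 0.0 else x * math.log(x / y)

    return term(a, p) + term(1.0 - a, 1.0 - p)


def negative_reward_threshold(n: int, p: float, delta: float):
    """Smallest count k > n * p whose Chernoff-Cramér bound is at most delta.

    Uses P(S_n >= k) <= exp(-n * I(k / n)) for k / n > p, where S_n is the
    number of negative rewards in n steps. Returns None if no count in
    (n * p, n] achieves the bound.
    """
    for k in range(math.floor(n * p) + 1, n + 1):
        if math.exp(-n * bernoulli_rate(k / n, p)) <= delta:
            return k
    return None


if __name__ == "__main__":
    # Hypothetical numbers: 100 training steps, baseline rate 0.3, tolerance 1%.
    print(negative_reward_threshold(n=100, p=0.3, delta=0.01))
```

Under these assumptions, seeing at least the returned number of negative rewards in the window would be unlikely if training were merely incomplete at the baseline rate, which is the kind of evidence that could point toward a poorly configured network instead.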
Keywords/Search Tags: Energy capture, Wireless video transmission, Reinforcement learning, Deep learning, Principle of large deviation