Adaptive Bitrate Algorithm Based On Deep Reinforcement Learning

Posted on: 2022-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Zou
Full Text: PDF
GTID: 2518306764963029
Subject: Computer Software and Application of Computer
Abstract/Summary:
Video is one of the main information carriers today. With the rapid growth of video traffic, video streaming has emerged because traditional text transmission technology cannot meet the demand. In a classic streaming system, the server divides the video into chunks of equal duration, then encodes and stores them at different bitrates. The client then requests video chunks of an appropriate bitrate from the server as needed. The adaptive bitrate algorithm is the key technology for ensuring high-quality video streaming. It aims to improve the user's quality of experience by dynamically selecting bitrates based on the information available. Adaptive streaming has been studied extensively, but existing approaches remain insufficient because bandwidth estimation methods generalize poorly, parameters are difficult to set, and so on. Moreover, there is still a large gap between the performance of existing algorithms and the theoretical bound.

To address these challenges, this thesis designs two bandwidth estimation schemes, whose results form the basis for the decisions of the subsequent adaptive bitrate algorithm. The GRU-based bandwidth estimation scheme uses a GRU network to predict future network bandwidth from bandwidth information collected in the past. The bandwidth estimation scheme based on probability and statistics assumes that the network bandwidth is a piecewise stationary Gaussian process. Building on an online change point detection algorithm, it derives the mean and variance of the future network bandwidth using Bayes' theorem and the statistical properties of the Gaussian distribution.

Given the bandwidth estimation results, an adaptive bitrate algorithm based on deep reinforcement learning is proposed, in which the state consists of information such as the bandwidth estimation results, the current playback buffer occupancy, and the size of the next video chunk; the reward is the user's quality of experience; and the action is the selected bitrate. Then, with a similar architecture, the algorithm obtains the bitrate adaptation policy through the neural network. The experimental results show that the proposed adaptive bitrate algorithms achieve better performance than the state-of-the-art algorithms.

The dual-policy idea is used to further improve the performance of the adaptive bitrate algorithm. According to the mean and variance of the network bandwidth, network conditions are classified as strong or weak. An aggressive policy is then trained for strong networks and a conservative policy for weak networks, each through online learning. The experimental results show that the dual-policy approach effectively improves the performance of the adaptive bitrate algorithm. Compared with the state-of-the-art algorithms, the average quality of experience of the dual-policy adaptive bitrate algorithm is improved by 2.52% to 21.98%.
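
The dual-policy dispatch described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the bitrate ladder, the thresholds, the coefficient-of-variation rule, and the two hand-written policies (stand-ins for the trained neural networks) are all assumptions made for illustration. The sketch only shows the control flow: summarize recent bandwidth by its mean and variance, classify the network as strong or weak, and route the decision to an aggressive or a conservative policy.

```python
from statistics import mean, pvariance

# Hypothetical bitrate ladder in kbps (not taken from the thesis).
BITRATES_KBPS = [300, 750, 1200, 1850, 2850, 4300]


def summarize_bandwidth(samples_mbps):
    """Return (mean, variance) of bandwidth samples from the current segment."""
    if not samples_mbps:
        raise ValueError("need at least one bandwidth sample")
    return mean(samples_mbps), pvariance(samples_mbps)


def classify_network(mu, var, mean_threshold=3.0, cv_threshold=0.5):
    """Label the network 'strong' or 'weak'.

    Assumed rule: strong means a high mean and a low coefficient of
    variation. Both thresholds are illustrative, not from the thesis.
    """
    cv = (var ** 0.5) / mu if mu > 0 else float("inf")
    return "strong" if mu >= mean_threshold and cv <= cv_threshold else "weak"


def highest_bitrate_within(budget_kbps):
    """Pick the highest ladder entry not exceeding the budget (or the lowest)."""
    feasible = [b for b in BITRATES_KBPS if b <= budget_kbps]
    return max(feasible) if feasible else BITRATES_KBPS[0]


def aggressive_policy(mu, var):
    """Stand-in for the trained aggressive policy: spend the mean bandwidth."""
    return highest_bitrate_within(mu * 1000)


def conservative_policy(mu, var):
    """Stand-in for the trained conservative policy: leave one std of margin."""
    return highest_bitrate_within((mu - var ** 0.5) * 1000)


POLICIES = {"strong": aggressive_policy, "weak": conservative_policy}


def choose_bitrate(samples_mbps):
    """Classify the network, then delegate to the matching policy."""
    mu, var = summarize_bandwidth(samples_mbps)
    label = classify_network(mu, var)
    return label, POLICIES[label](mu, var)


if __name__ == "__main__":
    print(choose_bitrate([5.0, 5.2, 4.8, 5.1]))  # stable, high bandwidth
    print(choose_bitrate([0.8, 2.5, 0.4, 1.9]))  # low, volatile bandwidth
```

In the thesis the two policies are learned through online training rather than hand-written; only the classify-then-dispatch structure is carried over here.
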
Keywords/Search Tags: Adaptive Bitrate Algorithm, Deep Reinforcement Learning, Bandwidth Estimation, Dual-policy