Research Of Adaptive Bitrate Algorithm Based On Deep Reinforcement Learning

Posted on: 2023-05-21

Degree: Master

Type: Thesis

Country: China

Candidate: L Yi

GTID: 2568306785964529

Subject: Computer Science and Technology

Abstract/Summary:

Recent years have seen rapid growth in HTTP-based video streaming, and viewers’ demand for video quality keeps rising. Video players use adaptive bitrate (ABR) algorithms to improve the quality of experience (QoE) for users. Existing ABR algorithms suffer from frequent rebuffering, video freezing, low image quality, and inaccurate prediction of network throughput. To address these problems, this thesis applies deep learning and reinforcement learning (RL) methods to improve the efficiency of ABR algorithms. The main work is as follows:

(1) Training neural networks with existing RL methods leads to large fluctuations in the reward value and to slow, difficult convergence. To address this, an ABR algorithm based on deep reinforcement learning (NABR) is proposed. First, NABR limits the size of the update between the old and new policies to avoid the convergence difficulties caused by overly large policy changes. Second, NABR uses a baseline function to reduce the variance of the policy gradient. At the same time, the trust region method is used to find the optimal ABR policy. Finally, NABR adds an entropy loss to the policy network to encourage the agent to explore randomly and thereby increase the cumulative reward. The experimental results show that, compared with existing methods, NABR converges faster, is more robust, and further improves the user’s QoE. In addition, the effects of different neural network structures on the performance of the RL-based ABR algorithm are analyzed through experiments.

(2) Existing RL methods require a large number of training samples and cannot converge quickly, so the learned ABR algorithm generalizes poorly and cannot adapt to different network bandwidths. Computing the policy gradient also produces a large gradient variance, which causes convergence difficulties and other problems. To address this, a meta-learning-based ABR algorithm (LABR) is proposed. LABR uses meta-learning to train the RL policy network and learns an optimal loss function from a small number of samples, so it needs only a small number of task samples to converge quickly and to train more efficiently and stably, thereby improving the generalization of the ABR algorithm and further improving the QoE. Finally, the effectiveness of the LABR algorithm is verified by experiments.

(3) Existing ABR algorithms use fixed QoE parameters: RL generates the ABR model by training against a fixed reward function, so improving one metric degrades another. For example, increasing the weight coefficient of video quality increases the video freeze time, while increasing the weight coefficient of the freeze time reduces the video quality. To address this, an ABR algorithm based on constrained Bayesian optimization (BABR) is proposed. BABR uses the constrained Bayesian method to optimize the weights of the QoE terms, improving video quality while reducing freezing time so that video quality, freezing time, and the other parameters reach the best combination. The experimental results show that, compared with existing methods, BABR achieves a better balance among the weights of the QoE metrics and ultimately a higher QoE.

(4) The deployment and application of the NABR, LABR, and BABR algorithms in adaptive streaming media systems are studied. Each of the three algorithms is deployed in a video player, and the ABR algorithm requests video stored on a Linux server over HTTP to verify the validity of the algorithm. The experiments are evaluated in 4G and Wi-Fi network environments, respectively. The experimental results show that the QoE metric of the NABR algorithm is improved by 3.8%–9.4%.
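The ingredients listed for contribution (1) (a bounded change between the old and new policies, a baseline to reduce gradient variance, and an entropy term to encourage exploration) can be illustrated with a PPO-style clipped surrogate loss. The abstract does not give NABR's actual objective, so the following is only a minimal sketch under that assumption; the function name `clipped_policy_loss` and the parameters `clip_eps` and `entropy_coef` are hypothetical and chosen purely for illustration.

```python
import math


def clipped_policy_loss(new_probs, old_probs, actions, returns, values,
                        clip_eps=0.2, entropy_coef=0.01):
    """Illustrative PPO-style loss over a batch of bitrate decisions.

    new_probs / old_probs: per-step probability lists over the bitrate ladder,
                           from the current and the previous policy.
    actions: index of the bitrate chosen at each step.
    returns: observed (discounted) return at each step.
    values:  baseline estimate at each step (e.g. from a critic network).
    All names and default values are assumptions, not taken from the thesis.
    """
    total = 0.0
    for p_new, p_old, a, g, v in zip(new_probs, old_probs, actions,
                                     returns, values):
        # Subtracting the baseline reduces the variance of the gradient.
        advantage = g - v
        # Probability ratio between the new and the old policy.
        ratio = p_new[a] / p_old[a]
        # Clipping bounds how far one update can move the policy.
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        surrogate = min(ratio * advantage, clipped * advantage)
        # Entropy bonus rewards keeping the action distribution spread out.
        entropy = -sum(p * math.log(p) for p in p_new if p > 0.0)
        # Negate because optimizers minimize a loss.
        total += -(surrogate + entropy_coef * entropy)
    return total / len(actions)
```

When the new and old policies coincide, the ratio is 1 and the loss reduces to the negated advantage minus the entropy bonus. When the new policy raises the probability of a well-rewarded bitrate beyond `1 + clip_eps` times its old value, the clipped term caps the gain, which is the mechanism that keeps successive policies close.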

Keywords/Search Tags:

adaptive bitrate algorithms, quality of experience, reinforcement learning, meta-learning, constrained Bayesian

Related items

1	Research On Adaptive Bitrate Video Streaming Technology Based On Broad Reinforcement Learning
2	Adaptive Bitrate Method For Streaming Media Based On Deep Reinforcement Learning
3	Research On Reinforcement Learning Adaptive Bitrate Algorithm Based On Video User Trajectory Preference
4	Research On Visual Sensitivity Aware Adaptive Bitrate Algorithm Based On Reinforcement Learning
5	Sample Efficiency Improvement Method Of Deep Reinforcement Learning And Its Application In Video Bitrate Control
6	Adaptive Bitrate Algorithm Based On Deep Reinforcement Learning
7	Mobile Video Transmission Resource Optimization Based On User Preference Perception
8	Research On Platform Independent Adaptive Streaming Media Transmission Based On Reinforcement Learning
9	Adaptive Streaming Media QoS/QoE Management Based On Reinforcement Learning
10	Algorithm Research On Knowledge Reuse And Generalization Ability Of Meta-learning