Streaming video traffic accounts for the largest share of global Internet traffic. To achieve smooth playback under varying network conditions, the client video player uses an adaptive bitrate (ABR) algorithm to dynamically select the bitrate of each video chunk. The goal of this algorithm is to adapt the video bitrate to the underlying network conditions so as to maximize the user's quality of experience (QoE). In recent years, ABR algorithms based on reinforcement learning (RL) have been proposed and have become mainstream. However, the reward functions used by existing methods are often hand-crafted without an empirical basis, or are insufficiently accurate, so these methods may deliver a viewing experience that does not match users' expectations. This paper proposes an ABR algorithm based on user trajectory preferences, which aims to optimize video users' QoE directly from user data. The main work and contributions are as follows:

1. Existing QoE collection methods require users to score their viewing experience after watching a video. However, users who lack background knowledge of adaptive video streaming find it difficult to give an accurate quantitative score for a given viewing session, which introduces errors. To address this shortcoming, this paper proposes user trajectory preference and collects the corresponding data: after watching two different playbacks, a user simply chooses the better one, without giving a quantitative score.

2. A user QoE prediction model based on a multi-layer perceptron and user trajectory preferences is proposed, and its structure and training method are described. After training on user trajectory preference data, the model's predictions are closer to users' real QoE than both the reward functions used by existing RL-based ABR algorithms and the state-of-the-art learning-based QoE prediction method.

3. By using the aforementioned QoE prediction model for deep RL
training as the reward signal, an ABR algorithm based on user trajectory preferences is proposed. This approach avoids the blindness of hand-crafted reward modeling in RL, so that the ABR algorithm is trained toward satisfying user needs.

Experimental results show that, compared with the latest QoE function, the output of the proposed QoE prediction model correlates more strongly with the QoE data in existing datasets. The model's accuracy in predicting user preferences is also about 13.6% higher on average, and it performs well across different RL algorithms. Compared with other RL-based ABR algorithms, the user trajectory preference ABR algorithm improves average user QoE by about 16.4%.
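The pairwise-preference training idea in contribution 2 can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature layout, network sizes, and function names are all assumptions. A small MLP maps a playback trajectory's features to a scalar QoE score, and each labelled comparison contributes a Bradley-Terry style cross-entropy loss that pushes the preferred trajectory's score above the other's.

```python
import numpy as np

# Sketch of preference-based QoE model training (illustrative names,
# not the paper's implementation). A small MLP scores a trajectory;
# pairs are compared so the user-preferred one should score higher.

rng = np.random.default_rng(0)

# Assumed trajectory summary: per-chunk features such as
# (bitrate, rebuffering time, bitrate-switch magnitude), flattened.
N_FEATURES = 12   # e.g. 4 chunks x 3 features (assumed layout)
HIDDEN = 16

W1 = rng.normal(0.0, 0.1, (N_FEATURES, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.1, (HIDDEN, 1))
b2 = np.zeros(1)

def qoe_score(x):
    """MLP mapping a trajectory feature vector to a scalar QoE score."""
    h = np.maximum(0.0, x @ W1 + b1)      # ReLU hidden layer
    return float((h @ W2 + b2)[0])

def preference_loss(x_a, x_b, a_preferred):
    """Cross-entropy on P(A preferred) = sigmoid(score_A - score_B)."""
    p_a = 1.0 / (1.0 + np.exp(-(qoe_score(x_a) - qoe_score(x_b))))
    return -np.log(p_a) if a_preferred else -np.log(1.0 - p_a)

# One labelled comparison: the user preferred trajectory A.
traj_a = rng.normal(size=N_FEATURES)
traj_b = rng.normal(size=N_FEATURES)
loss = preference_loss(traj_a, traj_b, a_preferred=True)
print(round(loss, 4))
```

Minimizing this loss over many comparisons fits the scalar score to the preference data without ever asking users for a numeric rating, which is exactly the property contribution 1 motivates.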
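Contribution 3 swaps a fixed-form reward for the learned model inside the RL training loop. The sketch below shows the substitution point only; the coefficients, feature choice, and function names are assumptions, and a fixed linear scorer stands in for the trained model.

```python
import numpy as np

# Sketch of using a learned QoE model as the RL reward (hypothetical
# names). A common hand-crafted ABR reward has the fixed form
#   r = bitrate - alpha * rebuffer - beta * |bitrate switch|;
# here each chunk is instead scored by a trained QoE model.

def handcrafted_reward(bitrate, rebuffer, last_bitrate,
                       alpha=4.3, beta=1.0):
    """Fixed-form QoE reward of the kind used by RL-based ABR methods."""
    return bitrate - alpha * rebuffer - beta * abs(bitrate - last_bitrate)

def learned_reward(qoe_model, bitrate, rebuffer, last_bitrate):
    """Reward from a trained QoE model; qoe_model is any callable
    mapping per-chunk features to a scalar score."""
    features = np.array([bitrate, rebuffer, abs(bitrate - last_bitrate)])
    return qoe_model(features)

# Stand-in for the trained model: a fixed linear scorer that happens
# to reproduce the hand-crafted weights, so both rewards coincide.
toy_model = lambda f: float(f @ np.array([1.0, -4.3, -1.0]))

# One simulated chunk: 2.5 Mbps chosen, 0.1 s rebuffer, previous 1.2 Mbps.
r_fixed = handcrafted_reward(2.5, 0.1, 1.2)
r_learned = learned_reward(toy_model, 2.5, 0.1, 1.2)
print(r_fixed, r_learned)
```

The design point is that `learned_reward` is a drop-in replacement at the environment-step boundary, so the same policy-gradient or actor-critic training code can be reused unchanged with either reward.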