Projection And Viewport Prediction Based 360-degree Video Coding

Posted on: 2022-10-05
Degree: Master
Type: Thesis
Country: China
Candidate: L Feng
Full Text: PDF
GTID: 2518306605466804
Subject: Communication and Information System
Abstract/Summary:
With the rapid development of multimedia and computer technology, video has become the form of data that accounts for the largest share of network traffic. Traditional video applications can no longer meet users' demands for diverse content and a high quality of experience, which has led to the emergence of 360-degree video. Compared with traditional video, 360-degree panoramic video offers an immersive visual experience and increases viewers' interest in exploring the content. At the same time, the data volume of panoramic video is 3 to 4 times that of ordinary video, which poses a major challenge for both transmission and storage, and bandwidth resources are in serious shortage. To transmit more data with less bandwidth, research on data compression for 360-degree panoramic video is therefore necessary.

To compress the panoramic video data that is to be encoded and transmitted, this thesis studies and analyzes the encoding process of 360-degree video applications. It proposes an improved algorithm and an improved model that address the deficiencies of the projection conversion module and the viewport prediction module in that process. Performance analysis and objective simulation comparison experiments verify the effectiveness and feasibility of the improved projection conversion algorithm and viewport prediction model.

Firstly, this thesis conducts a systematic study of the encoding process of 360-degree video applications and clarifies the research significance of data compression for panoramic video. On this basis, taking into account the spatial and temporal characteristics of viewers' behavior when watching panoramic video, the projection conversion method for 360-degree video applications and the neural network-based viewport prediction model are studied in depth, and the two modules are used to compress the panoramic video that is to be transmitted.

Secondly, building on existing research, this thesis proposes an improved algorithm for projection conversion and an improved model for viewport prediction:
1) Traditional equirectangular projection conversion suffers from serious distortion at the two poles of the spherical video and in the areas near them. To address this, an improved equirectangular projection conversion method based on a fixed viewing angle is proposed. This method both saves coding time and improves coding efficiency.
2) Traditional neural network-based viewport prediction models use only the historical gaze trajectories of users and ignore the spatial characteristics of the video frames. To address this, an advanced neural network model that combines the spatial and temporal features of 360-degree video is proposed. This model predicts more accurately where users will focus in future video frames.
Together, these two improvements greatly reduce the amount of video data to be encoded and transmitted, provide users with more adaptive video quality, and enhance the viewing experience.

Finally, the proposed algorithm and model are each subjected to mathematical analysis and simulation verification, and their performance is evaluated in detail. The experimental results show that the improved equirectangular projection conversion algorithm achieves better coding compression performance and requires less coding time than the traditional equirectangular projection conversion algorithm. Compared with the viewport prediction model based only on the historical gaze trajectories of users, the proposed neural network-based model that combines the spatial and temporal characteristics of 360-degree video improves prediction accuracy. In summary, the method and model proposed in this thesis can effectively compress 360-degree video data while meeting users' quality requirements for viewing panoramic video, which is of positive significance for modern bandwidth-constrained networks.
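The pole distortion of standard equirectangular projection mentioned in the abstract can be illustrated with a short sketch. This is not the improved method proposed in the thesis, which the abstract does not specify; it only shows the standard mapping from a projected pixel to a point on the sphere and the resulting horizontal oversampling of each row, which grows as 1/cos(latitude). The function names (`erp_to_sphere`, `horizontal_stretch`, `row_weights`) and the pixel-centre convention are assumptions made for illustration.

```python
import math

def erp_to_sphere(u, v, width, height):
    """Map the centre of ERP pixel (u, v) to (longitude, latitude) in radians.

    Assumed convention: u grows to the right, v grows downward,
    longitude lies in [-pi, pi) and latitude in (-pi/2, pi/2).
    """
    lon = ((u + 0.5) / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - (v + 0.5) / height) * math.pi
    return lon, lat

def horizontal_stretch(v, height):
    """Horizontal oversampling of ERP row v relative to the equator.

    Every row has the same pixel count, but the circle of latitude it
    covers is shorter by cos(latitude), so the stretch is 1 / cos(latitude).
    """
    _, lat = erp_to_sphere(0, v, 1, height)
    return 1.0 / math.cos(lat)

def row_weights(height):
    """Per-row cos(latitude) weights, normalized to sum to 1.

    Rows near the poles get small weights because they represent
    little area on the sphere.
    """
    w = [math.cos(erp_to_sphere(0, v, 1, height)[1]) for v in range(height)]
    total = sum(w)
    return [x / total for x in w]

if __name__ == "__main__":
    height = 180
    for v in (0, 45, 89):
        print(v, round(horizontal_stretch(v, height), 3))
```

For a 180-row frame, the row nearest the pole is stretched more than a hundredfold while the row nearest the equator is essentially unstretched, which is why bits spent on polar rows are largely redundant.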
Keywords/Search Tags:360-degree video, Encoding compression, Projection conversion, Viewport prediction, Neural network