In recent years, with the spread of 5G and the outbreak of the epidemic, more and more offline activities have moved online, real-time audio and video products have become irreplaceable, and people's demand for high-quality video keeps growing. Thanks to the rapid development of GPUs and the massive data available on the Internet, deep learning has achieved strong results in many fields. How to apply deep learning to frame rate up-conversion and video coding has therefore become a key issue in the field. Against this background, this paper studies WebRTC frame rate up-conversion and intra-frame prediction based on deep learning. The main work is divided into four parts.

First, the overall framework of WebRTC is introduced, frame rate up-conversion is motivated from the frame rate adjustment strategy within it, and basic neural networks are introduced to lay a theoretical foundation for the subsequent algorithms.

Second, the CNN (convolutional neural network), the FCN (fully convolutional network) and the U-Net are studied, and an improved U-Net frame rate up-conversion algorithm based on an attention mechanism is proposed to address the problems of traditional frame rate up-conversion algorithms. The network takes two adjacent video frames as input and applies an attention module before each encoder downsampling operation to strengthen the original U-Net's handling of important features. The decoder followed by a 1x1 convolution then outputs the interpolated video frame. The proposed algorithm improves the image quality metric PSNR by 4.9% compared with the traditional algorithm.

Third, ResNet (residual network) is studied, and an improved FCN based on residual learning is proposed and applied to intra-frame prediction, addressing the problem that the input of the traditional intra-frame prediction algorithm lacks the context information of the encoded pixel blocks. The network takes as input the encoded pixel block used by the traditional intra-frame prediction algorithm together with its adjacent encoded pixel blocks. The improved residual learning network outputs the residual between the predicted pixel block and the pixel block to be encoded. Its structure is decoupled into three modules by function, so that pixel blocks of different sizes can be trained at the same time. Compared with the traditional algorithm, the proposed algorithm achieves a 2.6% improvement in coding performance and a 4.0% improvement in PSNR.

Fourth, an audio and video conference system is designed and implemented on the basis of the WebRTC technology introduced above and the project requirements. The system consists of a server and a Web front end (browser). The server is composed of a streaming media service module, a signaling service module and a Network Address Translation (NAT) traversal service module. The streaming media service module is implemented in JavaScript and comprises three submodules: conference room management, audio and video communication, and text communication. The signaling service is implemented on the WebSocket protocol. The NAT traversal service module uses the open source coturn server. The Web front end runs in Google Chrome and Firefox. After development, each module of the system was tested, and the tests show that the system is functionally complete and performs stably.
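Both proposed algorithms are evaluated with PSNR. As a reminder of how that metric is usually computed, the following is a minimal sketch, assuming 8-bit frames stored as nested lists of pixel values. The function name `psnr` and the `max_value` parameter are illustrative assumptions and are not taken from this paper.

```python
import math


def psnr(reference, candidate, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized frames.

    Frames are nested lists of pixel intensities (an assumption made for
    this sketch; a real pipeline would more likely use array types).
    """
    if len(reference) != len(candidate):
        raise ValueError("frames must have the same number of rows")

    squared_error = 0.0
    pixel_count = 0
    for ref_row, cand_row in zip(reference, candidate):
        if len(ref_row) != len(cand_row):
            raise ValueError("rows must have the same length")
        for ref_pixel, cand_pixel in zip(ref_row, cand_row):
            squared_error += (ref_pixel - cand_pixel) ** 2
            pixel_count += 1

    if pixel_count == 0:
        raise ValueError("frames must not be empty")

    mse = squared_error / pixel_count
    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [[52, 55], [61, 59]]
    interpolated = [[53, 54], [62, 58]]  # every pixel is off by one
    print(f"PSNR: {psnr(original, interpolated):.2f} dB")
```

A higher value means the interpolated or predicted frame is closer to the reference frame.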