High-resolution traffic monitoring videos play an important role in analyzing road conditions and controlling traffic in intelligent transportation systems. However, the high cost of high-resolution acquisition equipment and the limited bandwidth available during transmission constrain improvements in monitoring video resolution. Video super-resolution reconstruction is a common way to address this problem. In this thesis, a spatio-temporal super-resolution reconstruction model is constructed and applied to the task of super-resolving traffic scene video. The main contributions of this thesis are as follows:

1. Two-stage super-resolution reconstruction models do not adequately account for the spatio-temporal correlation between video frames. To address this deficiency, this thesis proposes a one-stage video super-resolution reconstruction model based on a group spatio-temporal fusion network (GSFN). The model consists of three parts: feature extraction, feature interpolation, and group attentional spatial reconstruction. In the spatial reconstruction part, a group attention mechanism guides feature fusion. First, the input features are grouped according to the perceptual frame rate so that spatio-temporal features are fully extracted and group-level features are generated. Then, the group attention mechanism adaptively focuses on the group information that is useful for video reconstruction and lets the groups complement one another. This fuses the group-level features effectively and improves on the poor feature fusion otherwise obtained between video frames with occlusion and complex deformation. The temporal groups in the model share weights, which reduces the number of model parameters. In the group-level feature fusion stage of spatial reconstruction, three-dimensional residual dense blocks extract local information from the features, so that the features produced by attention fusion are fused further into detail-rich residual features, which improves the upsampling result. In the intra-group feature alignment stage of spatial reconstruction, GSFN uses homography alignment to preserve the geometric shape and spatial structure of the image and to achieve fast spatial alignment. This addresses the difficulty that common alignment methods have with large-motion videos, as well as their computational complexity.

2. GSFN is trained and tested on the Vimeo-90K and Vid4 datasets. The test results show that the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) values of GSFN are 26.39 dB and 0.7867, respectively, which are 1.034% and 0.396% higher than those of the TDAN+EDVR model.

3. GSFN is applied to the super-resolution of traffic scenes. Different traffic video datasets are created to train and test the model. The test results show that the PSNR of GSFN in sunny highway scenes reaches 25.66 dB.
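
The peak signal-to-noise ratio reported above can be illustrated with a minimal sketch. It assumes the standard definition, PSNR = 10 log10(MAX^2 / MSE), applied to a single pair of 8-bit frames. The function name `psnr`, the `max_value` parameter, and the choice to compare whole frames directly are illustrative assumptions; the evaluation protocol actually used for GSFN (for example, the color channel or any border cropping) is not stated here and may differ.

```python
import numpy as np


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two frames of equal shape.

    Assumption: pixel values lie in [0, max_value]; 255 corresponds to 8-bit frames.
    """
    reference = np.asarray(reference, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    if reference.shape != reconstructed.shape:
        raise ValueError("frames must have the same shape")

    # Mean squared error over every pixel (and channel, if present).
    mse = np.mean((reference - reconstructed) ** 2)
    if mse == 0.0:
        # Identical frames: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


if __name__ == "__main__":
    # Hypothetical data: a random frame and a noisy copy standing in for a reconstruction.
    rng = np.random.default_rng(0)
    ground_truth = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
    noisy = np.clip(ground_truth + rng.normal(0.0, 5.0, size=(64, 64)), 0.0, 255.0)
    print(f"PSNR: {psnr(ground_truth, noisy):.2f} dB")
```

A video-level score is commonly obtained by averaging the per-frame values over a sequence; whether that averaging is the one used for the figures above is likewise an assumption.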