
Research On Video Motion Feature Extraction And Its Application In Saliency Computation

Posted on: 2022-07-19
Degree: Master
Type: Thesis
Country: China
Candidate: W L Zhang
Full Text: PDF
GTID: 2518306737957039
Subject: Computer technology
Abstract/Summary:
With the development of multimedia and transmission technology, the number of videos that people generate and consume in daily life has increased dramatically. The human visual system can rapidly select and focus on the relevant regions of video data; this selection mechanism is known as the visual attention mechanism, and the task of imitating it is generally called saliency detection. Benefiting from the continuous progress of deep learning methods, image saliency detection has developed rapidly and achieved strong results. Unlike images, videos contain rich motion information and inter-frame correlation, and although video is the most mainstream and common form of visual data, video saliency detection still needs further study. Through an analysis of video datasets, this thesis finds that humans are more likely to be attracted by moving objects in video data. How to describe and extract motion information more accurately has therefore become a key issue in video saliency detection. This thesis studies the video saliency detection task based on motion information and inter-frame correlation. The main work is as follows:

(1) A video saliency detection model based on motion feature enhancement and hierarchical fusion is proposed. The model contains four subnets: a spatial subnet, a motion subnet, a hierarchical fusion subnet, and a timing subnet. The model first uses the spatial subnet to extract spatial information from the video frame, and uses that spatial information to supervise and enhance the motion features extracted from optical flow in the motion subnet. To retain more semantic information, the hierarchical fusion subnet integrates spatial and motion information at multiple scales. Finally, the timing subnet learns temporal information to further refine the saliency detection results. The proposed model achieves the best performance on three mainstream datasets, demonstrating its superiority and robustness.

(2) A video saliency model based on inter-frame correlation and feature propagation is proposed. To avoid the redundant computation caused by running saliency detection frame by frame, this thesis uses the low-level spatial features of the video frames to calculate the similarity between frames, and introduces a key frame decision module and a feature propagation module to improve the model based on motion feature enhancement and hierarchical fusion. The key frame decision module uses the Pearson correlation coefficient and a preset reliability to judge and update key frames, and the feature propagation module uses bilinear interpolation to generate the features of similar frames from the results of optical flow estimation. The proposed model not only maintains excellent performance but also greatly reduces model size and parameter count and effectively improves efficiency, a meaningful exploration toward the practical application of saliency detection methods.
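The key frame decision and feature propagation steps in (2) can be illustrated with a small dependency-free sketch. It is not the thesis's implementation: the function names, the single-channel list-of-lists feature map, the `threshold` value of 0.9 (standing in for the "preset reliability"), and the backward-flow convention are all assumptions made for illustration.

```python
import math
from typing import List

Grid = List[List[float]]  # one feature channel, indexed as grid[y][x]


def pearson(a: Grid, b: Grid) -> float:
    """Pearson correlation coefficient between two equally sized feature maps."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)
    # A constant map has zero variance; treat it as uncorrelated.
    return cov / math.sqrt(var) if var > 0 else 0.0


def needs_new_key_frame(key: Grid, cur: Grid, threshold: float = 0.9) -> bool:
    """Assumed rule: refresh the key frame once similarity falls below threshold."""
    return pearson(key, cur) < threshold


def propagate(key: Grid, flow_x: Grid, flow_y: Grid) -> Grid:
    """Warp key-frame features to the current frame by bilinear sampling.

    Assumes backward flow: pixel (x, y) of the current frame is read from
    (x + flow_x[y][x], y + flow_y[y][x]) in the key frame, clamped to the border.
    """
    h, w = len(key), len(key[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx = min(max(x + flow_x[y][x], 0.0), w - 1.0)
            sy = min(max(y + flow_y[y][x], 0.0), h - 1.0)
            x0, y0 = int(sx), int(sy)  # equals floor because sx, sy >= 0
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            wx, wy = sx - x0, sy - y0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = key[y0][x0] * (1 - wx) + key[y0][x1] * wx
            bottom = key[y1][x0] * (1 - wx) + key[y1][x1] * wx
            out[y][x] = top * (1 - wy) + bottom * wy
    return out
```

In such a scheme, a frame that stays similar to the current key frame would reuse warped key-frame features instead of running the full network, while a frame that falls below the threshold would be processed in full and become the new key frame.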
Keywords/Search Tags:Video saliency detection, motion feature, layered fusion, key frame decision