Research On Video Saliency Detection Algorithm

Posted on: 2021-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: D Lu
GTID: 2428330605964870
Subject: Information and Communication Engineering
Abstract/Summary:
Saliency detection simulates the attention mechanism of the human eye to identify regions of interest in a real scene. It is closely related to people's daily life and has therefore attracted considerable attention. It is suitable for many applications in machine vision, multimedia, and entertainment, such as image segmentation, image retrieval, video surveillance, and object detection. It has also been widely used in other areas of computer vision, including video segmentation, video compression, video summarization, and even facial sketch synthesis.

Traditional image saliency detection algorithms have been developed successfully, but they do not perform satisfactorily when applied to video saliency detection tasks. On the one hand, image saliency detection mainly depends on the calculation of contrast, gradient, and texture. However, these features alone are insufficient because the video scene is constantly changing. On the other hand, compared with image saliency detection, video saliency depends more heavily on the information shared between consecutive frames. Salient target detection not only simulates the human visual attention mechanism but also depends on human cognition of object structure and motion cues. At present, image and video detection technology based on deep learning is developing rapidly and has achieved good results in many practical tasks.

Based on deep learning theory, this paper proposes two different video saliency detection algorithms. The main work and innovations are as follows. First, to enable neurons to adaptively adjust the size of their receptive fields, this paper designs a multi-scale spatial attention module, which maps scale information to attention features and spatial features respectively. Secondly, to place more emphasis on the dynamic cues of the target, and to explore an effective feature that determines the salient target while also capturing deep temporal information, this paper redesigns the salient displacement sensing module. This attention mechanism is trained with separate labels. Exploiting the low resolution and high confidence of deep features in the network, the attention model can effectively predict dynamic salient regions, thereby guiding the entire network to discover complete salient regions. This multi-label approach captures motion cues quickly and effectively, accelerates network convergence, and can significantly improve performance. Then, an interactive group semantic attention model is established, which effectively expresses the local and global information of spatially salient features. Finally, this paper designs a spatiotemporal residual convolutional LSTM network to extract the temporal correlation of salient targets. The spatial network provides the initial location of the salient target, and the accuracy of this localization determines how effectively the temporal network propagates salient regions.
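The abstract does not give the internals of the residual convolutional LSTM, so the following is only a minimal NumPy sketch of the general idea, not the thesis's network. It assumes a standard convolutional LSTM cell without peephole terms, a single shared kernel for all four gates, and a residual connection that adds the input feature map to the hidden output. The names `ResidualConvLSTMCell`, `conv2d_same`, and `run_sequence` are hypothetical, and the weights are random placeholders rather than trained values.

```python
import numpy as np


def conv2d_same(x, w):
    """Zero-padded 'same' 2D cross-correlation.

    x: (C_in, H, W) feature map; w: (C_out, C_in, k, k) with odd k.
    """
    c_out, _, k, _ = w.shape
    p = k // 2
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Contract over input channels for one kernel offset (i, j).
            out += np.einsum("oc,chw->ohw", w[:, :, i, j],
                             xp[:, i:i + h, j:j + wd])
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class ResidualConvLSTMCell:
    """Hypothetical ConvLSTM cell with a residual output (assumed design)."""

    def __init__(self, channels, kernel=3, seed=0):
        rng = np.random.default_rng(seed)
        # One kernel maps the concatenated [input, hidden] to all four gates.
        self.w = rng.normal(0.0, 0.1,
                            (4 * channels, 2 * channels, kernel, kernel))
        self.b = np.zeros((4 * channels, 1, 1))

    def step(self, x, h, c):
        gates = conv2d_same(np.concatenate([x, h], axis=0), self.w) + self.b
        i, f, o, g = np.split(gates, 4, axis=0)
        c_next = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # cell state update
        h_next = sigmoid(o) * np.tanh(c_next)              # hidden state
        # Residual connection: temporal features refine the spatial input.
        return x + h_next, h_next, c_next


def run_sequence(cell, frames):
    """Apply the cell to per-frame feature maps of shape (T, C, H, W)."""
    h = np.zeros_like(frames[0])
    c = np.zeros_like(frames[0])
    outputs = []
    for x in frames:
        y, h, c = cell.step(x, h, c)
        outputs.append(y)
    return np.stack(outputs)
```

In this sketch the per-frame inputs stand in for the features produced by the spatial network. Because the hidden state is bounded in magnitude by 1, the residual output stays close to the spatial features and is only adjusted by the temporal state.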
Keywords/Search Tags:Multi-scale, attention, deep learning, group semantics, convolution LSTM