
Saliency Detection Based On Visual Priors And Deep Networks

Posted on: 2022-11-23    Degree: Master    Type: Thesis
Country: China    Candidate: X W Lu    Full Text: PDF
GTID: 2518306746996279    Subject: Automation Technology

Abstract/Summary:
Saliency detection is an image-understanding task built on the visual attention mechanism. Modeling how the human visual system interprets an image, the task labels the pixels that attract the viewer's interest as the foreground target and the remaining pixels as background. Research on saliency detection began with natural images; as the field has matured, researchers have found that temporal cues in video are highly valuable, because adding them helps simulate how visual attention shifts in dynamic scenes and thus improves the final detection result. Video-based saliency detection has therefore drawn the attention of more and more researchers. Although image- and video-based saliency detection has achieved good results, current algorithms still fall short in suppressing the background of complex images and in mining and exploiting temporal information. Targeting these shortcomings of current saliency detection models, this thesis carries out the following research; its main contributions are:

(1) In saliency detection on complex images, the background often contains distractors that are extremely hard to discriminate, such as pixel blocks with strong low-level cues but weak semantics, or object reflections whose semantics are highly deceptive, which makes it difficult to separate the background from the salient object. Starting from these challenges, this thesis proposes an image saliency detection framework that combines a center prior with a U-Net network, suppressing such background noise by coupling center-prior knowledge with high-level semantic information (a toy sketch of this coupling is given below).

(2) Video salient object detection aims to simulate the human attention mechanism and identify interesting objects or regions in dynamic scenes. Existing video saliency methods generally extract temporal features only at a coarse level, using an optical-flow network or a convolutional long short-term memory model, and tend to ignore the fine-grained details between adjacent frames; the difference information of consecutive frames is therefore under-exploited, leading to weak spatio-temporal consistency and poor edge continuity. To address this, the thesis proposes a video saliency detection method based on temporal difference and pixel gradient: a co-attention module integrates temporal cues to highlight positional information, and gradient information improves feature fusion at different positions, so that temporal-difference information is mined effectively and the performance of video saliency detection improves (a simplified sketch follows below).
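To make contribution (1) concrete, the sketch below shows one way a Gaussian center prior can be blended with the prediction of a U-Net-style decoder in PyTorch. The abstract does not give the exact formulation, so the function names, the Gaussian width sigma, and the blending weight alpha are assumptions made purely for illustration.

```python
import torch

def center_prior(h, w, sigma=0.3):
    # Gaussian center-prior map in [0, 1], peaking at the image centre.
    # sigma is relative to the normalised [-1, 1] coordinates (assumed value).
    ys = torch.linspace(-1.0, 1.0, h).view(h, 1).expand(h, w)
    xs = torch.linspace(-1.0, 1.0, w).view(1, w).expand(h, w)
    return torch.exp(-(xs ** 2 + ys ** 2) / (2 * sigma ** 2))

def combine_with_prior(unet_logits, alpha=0.5):
    # unet_logits: (B, 1, H, W) raw output of any U-Net-style decoder.
    # A plain convex blend between the prior-weighted map and the raw map
    # is assumed here; the thesis may use a different combination rule.
    _, _, h, w = unet_logits.shape
    prior = center_prior(h, w).to(unet_logits.device).view(1, 1, h, w)
    sal = torch.sigmoid(unet_logits)
    return alpha * sal * prior + (1.0 - alpha) * sal
```

Multiplying the network's sigmoid output by the prior down-weights responses far from the image centre, which is the background-suppression effect the framework aims for.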
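For contribution (2), the following sketch illustrates the general idea of exploiting inter-frame differences and pixel gradients: a temporal-difference map and a gradient map are turned into a joint spatial weight that re-weights the current frame's features. The Sobel operator, the element-wise product of the two cues, and the residual re-weighting are illustrative assumptions; the abstract does not specify the internals of the proposed co-attention module.

```python
import torch
import torch.nn.functional as F

def sobel_gradient(gray):
    # Gradient magnitude of a grayscale frame tensor of shape (B, 1, H, W).
    kx = torch.tensor([[-1., 0., 1.],
                       [-2., 0., 2.],
                       [-1., 0., 1.]], device=gray.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3).contiguous()
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

def temporal_diff_attention(feat_t, feat_prev, grad_mag):
    # feat_t, feat_prev: (B, C, H, W) backbone features of frames t and t-1.
    # grad_mag: (B, 1, H, W) image-gradient magnitude from sobel_gradient().
    diff = (feat_t - feat_prev).abs().mean(dim=1, keepdim=True)   # motion cue
    grad = F.interpolate(grad_mag, size=feat_t.shape[-2:],
                         mode="bilinear", align_corners=False)    # match feature grid
    attn = torch.sigmoid(diff) * torch.sigmoid(grad)              # joint spatial weight
    return feat_t * (1.0 + attn)                                  # residual re-weighting
```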
(3) In video saliency detection, moving salient objects usually attract more attention, so optical-flow networks are widely used for video salient object detection. However, while optical flow adds motion cues between frames, the relatively coarse edges of optical-flow images also make it harder to delineate the boundaries of salient objects. To address the insufficiently sharp edge regions produced by video saliency networks after optical-flow information is added, the thesis proposes a decoupled video saliency detection framework built on a weighted balance loss over optical flow and edges, called flow-edge net (Fenet). The framework consists of an optical-flow branch network and an edge branch network: the optical-flow branch simulates the attention the human eye pays to motion, while the edge branch combines temporal information to refine edge details. The two branches respectively target the localization of the key object and the description of edge details, and, according to their characteristics, a dynamic weighted fusion module is designed to fuse their features effectively (a minimal sketch of such a module is given below). This retains the advantage of optical-flow-based video saliency networks in locating the object body, while enhancing the edge details of the salient object and improving the accuracy of video saliency detection.
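For contribution (3), the snippet below sketches what a dynamic weighted fusion module for the two Fenet branches could look like: a small convolutional head predicts per-pixel softmax weights for the flow-branch and edge-branch features and sums them accordingly. The module structure, channel counts, and class name are assumptions made for illustration; only the idea of content-dependent weighted fusion of the two branches comes from the abstract.

```python
import torch
import torch.nn as nn

class DynamicWeightedFusion(nn.Module):
    # Fuses flow-branch and edge-branch features with content-dependent,
    # per-pixel weights (a hypothetical reading of the fusion module).
    def __init__(self, channels):
        super().__init__()
        self.weight_head = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 2, kernel_size=1),   # one logit per branch
        )

    def forward(self, flow_feat, edge_feat):
        # Predict softmax weights over the two branches at every position,
        # then take the weighted sum of the branch features.
        logits = self.weight_head(torch.cat([flow_feat, edge_feat], dim=1))
        w = torch.softmax(logits, dim=1)
        return w[:, 0:1] * flow_feat + w[:, 1:2] * edge_feat
```

Letting the weights depend on both inputs would allow the flow branch to dominate inside the object body and the edge branch to dominate near boundaries, matching the complementary roles described above.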
Keywords/Search Tags: Temporal difference, Edge-adaptive loss function, Optical flow network, Computer vision, Co-attention mechanism