Research On Visual Saliency Detection Method And Its Application

Posted on: 2018-09-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y B Tang
Full Text: PDF
GTID: 1318330536981064
Subject: Computer application technology
Abstract/Summary:
With the development of science and multimedia technologies, the volume of multimedia data (especially image data) that people produce in daily life is growing exponentially. Vast amounts of image data not only make daily life more colorful and convenient but also bring new challenges to computer vision techniques. Most images contain only a small amount of important information, and the human visual system is able to pick this information out of a large volume of data for further analysis and processing. Computer vision uses computers to simulate the mechanisms of the human visual system so that a computer can inspect and understand things as humans do. One of the key issues in computer vision is saliency detection. To address the problems of existing saliency detection methods, this thesis focuses on simulating the human visual attention mechanism (HVAM) and on robust feature extraction from pixels and regions. The thesis also introduces the ideas and methods of saliency detection into scene text detection, which improves the performance of scene text detection and broadens the range of applications of saliency detection.

For the simulation of HVAM, this thesis proposes a saliency detection method based on superpixel clustering. The method analyzes and simulates the coarse-to-fine process of HVAM using image processing techniques. To simulate this process, the original image is first segmented into a large number of superpixels, and a graph-structural agglomerative clustering then merges the superpixels until only two classes remain, yielding a series of intermediate images with consecutive numbers of clusters (regions). An initial saliency map is then computed from the boundary connectivity of the regions in the intermediate images, while reinforcing the early-formed objects that appear in the intermediate images with a small number of regions. Finally, the initial saliency map is refined into the final saliency map using the reconstruction errors of sparse coding and an object-bias prior.

For robust feature extraction, this thesis proposes a saliency detection method that combines region-level and pixel-level saliency predictions. For region-level saliency estimation, an adaptive region generation technique is developed for region extraction. For pixel-level saliency prediction, a fully convolutional neural network (CNN) is constructed that uses the feature maps from different layers to perform multi-scale feature learning. Finally, a CNN-based saliency fusion method exploits the complementary information in the two kinds of saliency maps (region-level and pixel-level).

To improve effectiveness and efficiency, this thesis also proposes another saliency detection method based on a deeply-supervised recurrent convolutional neural network (DSRCNN). In the DSRCNN model, recurrent connections are first incorporated into each convolutional layer, which makes the model more powerful at learning contextual information. Second, supervisory information is applied at different layers so that the model learns more discriminative global and local features. Finally, these features are fused so that the model performs multi-scale feature learning.

For scene text detection based on text saliency, this thesis designs a text-aware saliency detection CNN model in which different supervisory information is applied at different layers and the information from multiple layers is fused for multi-scale feature learning. The thesis also proposes a text-aware saliency refinement CNN model and a classification CNN model for text-aware salient regions to improve the performance of text detection. The refinement CNN model integrates shallow and deep feature maps to improve the accuracy of text segmentation. Because it is fully convolutional, the classification CNN model accepts input images of arbitrary size. In addition, a new image construction strategy is developed to generate more discriminative images for classification and to improve classification accuracy.
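
As a rough illustration of the superpixel clustering step described in the abstract, the following Python sketch merges adjacent regions one pair at a time until two remain, keeping the labeling at every cluster count. It is only a minimal sketch under stated assumptions and not the thesis's implementation: the abstract does not give the merging criterion, so the Euclidean distance between mean features, the size-weighted mean update, and the names `agglomerate`, `features`, `edges`, and `stop_at` are all hypothetical.

```python
def agglomerate(features, edges, stop_at=2):
    """Greedily merge adjacent regions until `stop_at` regions remain.

    features: dict superpixel_id -> tuple of floats (e.g. mean colour); assumed input
    edges:    iterable of (a, b) pairs marking adjacent superpixels
    Returns a list of labelings (dict superpixel_id -> region_id), one per
    cluster count, from the initial segmentation down to `stop_at` regions.
    """
    feats = {r: tuple(f) for r, f in features.items()}
    sizes = {r: 1 for r in feats}
    label = {r: r for r in feats}  # superpixel -> current region
    adj = {r: set() for r in feats}
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)

    def dist(a, b):
        # Assumed merging criterion: Euclidean distance between region means.
        return sum((x - y) ** 2 for x, y in zip(feats[a], feats[b])) ** 0.5

    levels = [dict(label)]
    while len(feats) > stop_at:
        pairs = [(dist(a, b), a, b) for a in adj for b in adj[a] if a < b]
        if not pairs:  # no adjacent regions left (disconnected graph)
            break
        _, a, b = min(pairs)
        # Merge region b into region a using a size-weighted mean feature.
        na, nb = sizes[a], sizes[b]
        feats[a] = tuple((na * x + nb * y) / (na + nb)
                         for x, y in zip(feats[a], feats[b]))
        sizes[a] = na + nb
        # Rewire b's neighbours so that they become neighbours of a.
        for n in adj[b]:
            adj[n].discard(b)
            if n != a:
                adj[n].add(a)
                adj[a].add(n)
        del adj[b], feats[b], sizes[b]
        for s, r in label.items():
            if r == b:
                label[s] = a
        levels.append(dict(label))  # one intermediate "image" per cluster count
    return levels


# Four superpixels in a chain 0-1-2-3 with one-dimensional features.
levels = agglomerate(
    {0: (0.0,), 1: (1.0,), 2: (9.0,), 3: (10.0,)},
    [(0, 1), (1, 2), (2, 3)],
)
```

Each entry of `levels` corresponds to one intermediate image in the abstract's sense; a saliency measure such as boundary connectivity would then be evaluated on the regions at each level.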
Keywords/Search Tags:saliency detection, superpixel clustering, deep saliency fusion, deeply-supervised recurrent convolutional neural network, scene text detection