
Research On Visual Saliency Detection Based On Machine Learning

Posted on: 2019-03-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X F Zhou
Full Text: PDF
GTID: 1368330548985792
Subject: Signal and Information Processing
Abstract/Summary:
Visual saliency detection refers to the human visual system's ability to locate and process the most attractive local region quickly and accurately when viewing an image or video; this local region is often called the salient region. Motivated by this, researchers try to simulate the human visual system with computer techniques in order to pop out the salient region from images and videos. Meanwhile, with the fast development of artificial intelligence, machine learning has become a research hotspot. Based on machine learning algorithms, this thesis studies image saliency detection from the perspectives of improving the performance of a single saliency model and fusing multiple saliency maps, and studies video saliency detection from the perspectives of exploiting the ground truth of the first frame of a video, improving the performance of a single saliency model, and applying deep learning. Correspondingly, this thesis proposes five effective models:

1. Improving saliency detection via multiple kernel boosting and adaptive fusion. To improve the performance of existing image saliency models, this thesis proposes an improvement model based on multiple kernel boosting and adaptive fusion. Concretely, a regional descriptor consisting of regional self-information, regional variance, and regional contrast with local, global, and border context is first proposed to describe the segmented regions. Then, regarding saliency computation as a regression problem, a multiple kernel boosting method based on support vector regression (MKB-SVR) is proposed to generate a complementary saliency map. Finally, an adaptive fusion method based on quality assessment is proposed to effectively fuse the initial saliency map with the complementary saliency map and obtain the final saliency map. The experimental results show that the saliency maps generated by the proposed model outperform the original saliency maps generated by existing models, which validates that the proposed model consistently improves the saliency detection performance of various saliency models.

2. Adaptive saliency fusion based on quality assessment. To fuse the saliency maps generated by existing saliency models effectively, this thesis proposes a framework that adaptively fuses saliency maps produced by various saliency models according to a quality assessment of those maps. Given an input image and its multiple saliency maps, quality features based on the input image and the saliency maps are first extracted. Then, a quality assessment model, learned using a boosting algorithm with multiple kernels, assigns a quality score to each saliency map. Next, linear summation with a power-law transformation is used to fuse the saliency maps adaptively according to their quality scores. Finally, a graph-cut-based refinement method enhances the spatial coherence of the saliency and generates a high-quality final saliency map. Experimental results with state-of-the-art saliency models demonstrate that the proposed fusion framework consistently outperforms all individual saliency models and other fusion methods, effectively elevating saliency detection performance.

3. Video saliency detection via bagging-based prediction and spatiotemporal propagation. Given the ground truth of the first frame of a video, this thesis proposes a spatiotemporal saliency model based on bagging-based prediction and spatiotemporal propagation. Specifically, a bagging-based saliency prediction model, i.e. an ensemble regressor combining random forest regressors learned from undersampled training sets, is first used to predict saliency for each current frame. Then, both forward and backward propagation within a local temporal window are applied to each current frame to complement the predicted saliency map and yield the temporal saliency map, where the backward propagation is constructed from the temporary saliency estimates of the following frames. Finally, to ensure spatial consistency, spatial propagation over appearance-based and motion-based graphs, built in parallel, is applied to the temporal saliency map to generate the final spatiotemporal saliency map. In experiments on challenging public datasets, the proposed model consistently outperforms state-of-the-art models at popping out salient objects in unconstrained videos.

4. Improving video saliency detection via localized estimation and spatiotemporal refinement. To improve the performance of existing video saliency models, this thesis proposes an improvement model based on localized estimation and spatiotemporal refinement, consisting of three key steps: localized estimation, spatiotemporal refinement, and saliency update. Specifically, the initial saliency map of each frame in a video is first generated by an existing saliency model. Then, exploiting the temporal consistency and strong correlation among adjacent frames, localized estimation models, trained as random forest regressors within a local temporal window, are employed to generate a temporary saliency map. Finally, taking the appearance and motion information of salient objects into consideration, the spatiotemporal refinement step further improves the temporary saliency map and generates the final saliency map. The improved saliency map is then used to update the initial saliency map and provide reliable cues for saliency detection in the next frame. The experimental results show that the saliency maps generated by the proposed model outperform the initial saliency maps generated by existing models, which validates that the proposed model consistently improves the saliency detection performance of various video saliency models.

5. Video saliency detection using deep convolutional neural networks. Following the successful deployment of deep learning in image saliency detection, this thesis proposes a video saliency model using a deep convolutional neural network with three steps: feature extraction, feature aggregation, and spatial refinement. Concretely, the concatenation of the current frame and its optical flow image is first fed into the feature extraction network to yield feature maps. Then, the generated feature maps, the current frame, and the optical flow image are concatenated and passed to the aggregation network, in which the original inputs provide complementary information for aggregation. Finally, to obtain a high-quality saliency map with well-defined boundaries, the output of the aggregation network and the current frame are fed into the contour snapping network, yielding the final saliency map for the current frame. The experimental results show that the proposed model consistently outperforms state-of-the-art saliency models at detecting salient objects in videos.

In summary, this thesis focuses on image and video saliency detection. Based on traditional machine learning algorithms, it proposes four saliency models from the perspectives of improving the performance of a single saliency model, fusing saliency maps, and exploiting the ground truth of the first frame of a video; these four models effectively improve saliency detection performance. In addition, with the development of deep learning and its successful deployment in image saliency detection, the thesis proposes a video saliency model using a deep convolutional neural network, which significantly improves the performance of video saliency detection. Together, these efforts push forward the development of saliency detection from the aforementioned aspects.
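The quality-driven fusion step of Model 1 can be sketched as below. Both `quality_score` (a crude mean-deviation proxy) and the pixel-wise weighting rule are illustrative assumptions: in the thesis the quality score is learned with multiple kernel boosting rather than computed in closed form.

```python
def quality_score(saliency_map):
    # Placeholder quality measure: a confident saliency map pushes values
    # toward 0 or 1, so the mean absolute deviation from 0.5 serves as a
    # crude proxy. (The thesis learns this score instead of hand-coding it.)
    flat = [v for row in saliency_map for v in row]
    return sum(abs(v - 0.5) for v in flat) / len(flat)

def adaptive_fuse(initial_map, complementary_map):
    # Weight each map by its quality score, then combine pixel-wise.
    w1 = quality_score(initial_map)
    w2 = quality_score(complementary_map)
    total = w1 + w2
    return [[(w1 * a + w2 * b) / total
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(initial_map, complementary_map)]
```

With this proxy, a flat uninformative map receives weight zero and the fused result follows the more decisive map.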
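The linear summation with power-law transformation used in Model 2 can be sketched as follows. The exponent `gamma` and the nested-list map representation are assumptions for illustration; the thesis's learned quality scores and graph-cut refinement are not reproduced here.

```python
def power_law_fuse(saliency_maps, quality_scores, gamma=2.0):
    # Raise each quality score to a power so that high-quality maps
    # dominate the linear summation (gamma = 2.0 is an assumed value).
    weights = [s ** gamma for s in quality_scores]
    total = sum(weights)
    h, w = len(saliency_maps[0]), len(saliency_maps[0][0])
    fused = [[0.0] * w for _ in range(h)]
    for smap, wt in zip(saliency_maps, weights):
        for i in range(h):
            for j in range(w):
                fused[i][j] += wt / total * smap[i][j]
    return fused
```

Because the weights are normalized, a map with quality score zero contributes nothing, while equal scores reduce to a plain average.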
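The bagging-based prediction in Model 3 (an ensemble of regressors learned from undersampled training sets) can be sketched generically. The `train_one` callable, `sample_ratio`, and `n_models` are assumptions standing in for the thesis's random forest regressors and its actual sampling scheme.

```python
import random

def train_bagged_ensemble(features, labels, n_models=5, sample_ratio=0.5,
                          train_one=None, seed=0):
    # Each base regressor is fitted on an undersampled subset of the
    # training pairs; `train_one` is any callable fitting one regressor.
    rng = random.Random(seed)
    n = max(1, int(len(features) * sample_ratio))
    models = []
    for _ in range(n_models):
        idx = rng.sample(range(len(features)), n)
        models.append(train_one([features[i] for i in idx],
                                [labels[i] for i in idx]))
    return models

def predict_saliency(models, feature):
    # Bagging prediction: average the base regressors' outputs.
    return sum(m(feature) for m in models) / len(models)
```

Undersampling keeps the base regressors diverse; averaging their outputs reduces the variance of the per-frame saliency prediction.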
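The local-temporal-window idea behind Model 4's localized estimation can be sketched with a simple stand-in: averaging each frame's saliency map with its neighbours inside the window. Plain averaging is an assumption here; the thesis instead trains random forest regressors on the frames inside the window.

```python
def refine_with_temporal_window(initial_maps, window=2):
    # For each frame t, refine its map using the maps of frames in the
    # local temporal window [t - window, t + window], clipped at the
    # video boundaries. Averaging stands in for the learned estimators.
    refined = []
    n = len(initial_maps)
    for t in range(n):
        lo, hi = max(0, t - window), min(n, t + window + 1)
        frames = initial_maps[lo:hi]
        h, w = len(frames[0]), len(frames[0][0])
        avg = [[sum(f[i][j] for f in frames) / len(frames)
                for j in range(w)] for i in range(h)]
        refined.append(avg)
    return refined
```

Even this crude smoothing illustrates the motivation: exploiting the strong correlation among adjacent frames suppresses single-frame saliency errors.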
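The data flow of Model 5's three-stage network can be sketched as below. Only the channel-wise concatenations and the stage ordering come from the text; the three network callables (`extract`, `aggregate`, `snap_contours`) are placeholders, not the thesis's actual architectures.

```python
def channel_concat(*images):
    # Stack the per-pixel channel lists of several equally sized images,
    # mirroring channel-wise concatenation of frame and optical flow.
    h, w = len(images[0]), len(images[0][0])
    return [[sum((img[i][j] for img in images), [])
             for j in range(w)] for i in range(h)]

def video_saliency_forward(frame, flow, extract, aggregate, snap_contours):
    # Stage 1: feature extraction on the frame/flow concatenation.
    feats = extract(channel_concat(frame, flow))
    # Stage 2: aggregation, with the original inputs re-concatenated so
    # they can provide complementary information.
    agg = aggregate(channel_concat(feats, frame, flow))
    # Stage 3: contour snapping against the current frame for
    # well-defined object boundaries.
    return snap_contours(agg, frame)
```

In practice each callable would be a convolutional sub-network; the sketch only fixes which tensors feed which stage.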
Keywords/Search Tags:saliency detection, machine learning, deep learning, performance improvement, adaptive fusion