
Extraction Of Salient Region In Video Based On Deep Learning

Posted on: 2019-11-23    Degree: Master    Type: Thesis
Country: China    Candidate: X Wei    Full Text: PDF
GTID: 2428330590467425    Subject: Information and Communication Engineering

Abstract/Summary:
In complex scenes, humans can quickly identify regions of interest and understand what they see, an ability grounded in the visual attention mechanism of the human visual system. Visual information is mainly derived from received image or video data. When we look at an image, the eye preferentially locates the regions that stimulate vision most strongly: these are the salient regions. Introducing a human visual attention mechanism into computer image processing not only filters out useless data and improves computational efficiency, but also has important application value in many computer vision tasks. Salient region extraction in video aims to extract regions of interest from video frames by simulating the human visual attention mechanism.

In recent years, deep learning networks have performed well in object detection and image classification. This stems from the fact that deep learning can effectively discriminate complex features, and the features it extracts are better suited to the target task. Traditional methods, by contrast, mainly rely on hand-crafted features, which may not match the target task. The introduction of deep learning therefore greatly advances salient region extraction.

Based on a study of salient region extraction in video and related frontier techniques, this thesis proposes two algorithms based on deep learning. The first is built on the fusion of coarse and fine features: a dual-stream convolutional neural network learns coarse global information, recurrent connections refine the details, and the fusion is completed by cascading the networks. The second designs a conditional generative adversarial network to address the shortage of datasets for training. The loss function of the generative network is the sum of an adversarial loss and a content loss, where the content loss is the cross-entropy between the predicted saliency map and the ground truth.

The thesis compares the proposed models both qualitatively and quantitatively using three evaluation metrics: precision-recall curves, F-measure, and AUC. For the method based on coarse-fine feature fusion, adding recurrent connections to learn refined features increases precision by 10.76%. For the algorithm based on the conditional generative adversarial network, adversarial training with the discriminative network increases precision by up to 15.24%. Compared with six benchmark methods, both algorithms achieve state-of-the-art performance; in particular, the first method reaches 86.96% in precision and 86.72% in recall.
Keywords/Search Tags: video saliency, region extraction, convolutional neural network, generative adversarial network