ROI Extraction For Stereoscopic Video Based On Visual Attention

Posted on:2014-01-02

Degree:Master

Type:Thesis

Country:China

Candidate:G Ye

Full Text:PDF

GTID:2248330395976056

Subject:Information and Communication Engineering

Abstract/Summary:

Images and video have increasingly become the main form of multimedia. How to efficiently locate the image area that a viewer most needs within large-scale image and video data has become an active research problem. Extraction of the region of interest (ROI) is one of the key techniques for solving it. An ROI is an area that attracts the viewer's interest and best represents the image's content. ROI extraction is important and widely used in the field of image processing and analysis, for example in JPEG2000 compression coding, target location and identification in machine vision, caption extraction and recognition in video, and medical image analysis.

The human visual system can quickly and accurately focus on a few salient objects in an image or video. These objects are called regions of interest, and the process is called visual attention. Such areas usually differ strongly from their surroundings in brightness, texture, color, shape, and motion. Numerous visual attention models have been proposed, the most representative being the method of Itti and Koch. This method first extracts brightness, color, and orientation features from the image, then applies a "center-surround" mechanism and fuses the resulting feature maps into a final saliency map.

Three-dimensional (3D) video technologies are becoming increasingly popular in daily life. Because 3D video provides a higher-quality, more immersive experience than traditional 2D display, more and more people prefer it. With the introduction of depth information, traditional 2D image ROI extraction methods are no longer adequate for predicting saliency maps for stereoscopic video. This thesis presents an in-depth study of the human visual attention mechanism and, using a bottom-up approach, proposes a 3D visual attention model based on traditional 2D features, motion features, and depth information.

Another innovation of this thesis is feature fusion based on an artificial neural network (ANN). Previous methods often obtain the final saliency map by a simple linear combination of multiple saliency features, and therefore deviate considerably from actual human data. Here, "ground truth" taken from eye-tracking data available online, together with our own experimentally labeled data, serves as the training samples for the ANN, yielding a more powerful prior model whose predicted ROI is closer to that of the human visual system. The position and size of the ROI can then be located from the saliency map, but the result is not stable over time. A Kalman filter is therefore used to optimize it in the time domain, making the position and size of the ROI more accurate and stable. The experimental results show that the proposed 3D visual attention model is effective at predicting the ROI in stereoscopic video.
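
The temporal stabilization step can be illustrated with a minimal sketch. The abstract does not specify the state model or noise settings, so everything below is an assumption made for illustration: each ROI is represented as (center x, center y, width, height), each of the four values is filtered independently by a scalar Kalman filter with a random-walk state model, and the names `ScalarKalman`, `smooth_rois`, `q`, and `r` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    """1-D Kalman filter with a random-walk state model (an assumed model)."""
    q: float = 1.0    # process noise variance (assumed value)
    r: float = 25.0   # measurement noise variance (assumed value)
    x: float = 0.0    # current state estimate
    p: float = 1e6    # estimate variance; large so the first measurement dominates

    def update(self, z: float) -> float:
        # Predict: the state is assumed unchanged, so only uncertainty grows.
        p_pred = self.p + self.q
        # Correct: blend the prediction and the measurement via the Kalman gain.
        k = p_pred / (p_pred + self.r)
        self.x = self.x + k * (z - self.x)
        self.p = (1.0 - k) * p_pred
        return self.x


def smooth_rois(rois, q=1.0, r=25.0):
    """Smooth a sequence of per-frame (cx, cy, w, h) ROI boxes over time."""
    filters = [ScalarKalman(q=q, r=r) for _ in range(4)]
    return [tuple(f.update(v) for f, v in zip(filters, roi)) for roi in rois]


if __name__ == "__main__":
    # Made-up noisy ROI measurements for four consecutive frames.
    noisy = [(100, 80, 50, 40), (108, 77, 46, 43), (95, 83, 55, 38), (103, 79, 49, 41)]
    for box in smooth_rois(noisy):
        print(tuple(round(v, 1) for v in box))
```

A larger `r` relative to `q` trusts the per-frame measurements less and gives a steadier ROI at the cost of slower response to real motion. A constant-velocity state model would be a natural alternative for fast-moving objects.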

Keywords/Search Tags:

visual attention, saliency map, region of interest, artificial neural network

Related items

1	Technologies And Applications Of Visual Saliency Detection For Image Datum
2	Research On Computational Model Of Visual Attention
3	The Research About Region Of Interest Extraction Method Based On Saliency Map Of Video Image
4	Research On Extraction Of Roi For Image Based On Biological Visual Attention
5	Automatic Natural Image Segmentation Based On Saliency And Interactive Segmentation Algorithm
6	Comfortable Research Of Stereo Image Based On Visual Attention Mechanism
7	Studies On Image Assessment And Video Coding Based On Visual Attention Model
8	The Algorithm Of Visual Saliency Region Extraction
9	Visual Saliency Detection Based On Region Contrast
10	Research On Saliency Region Detection Of Images