Research on a Visual Attention Model and Its Implementation in ROI Image Compression

Posted on: 2009-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: L Chen
Full Text: PDF
GTID: 2178360272475025
Subject: Circuits and Systems
Abstract/Summary:
The saliency-based visual attention model proposed by Itti is one of the successful visual attention models. It works as follows: a difference-of-Gaussians function models the receptive field of retinal ganglion cells and extracts color, intensity, and orientation features, which form the conspicuity maps; a local Winner-Take-All (WTA) neural network then computes a saliency map from the conspicuity maps. The model detects objects in natural scenes well, but it ignores shape information. Research on the human visual system indicates that the eye identifies an object's shape well, so a more accurate attention region can be obtained if the object's shape is used. This thesis therefore proposes two improvements:
1. A module is added to the model to extract the object's shape information. The salient area generated by Itti's model is expanded by a reasonable margin, and the object's shape is then extracted. The visual attention area is obtained from the shape found within the salient area. Because objects usually differ markedly from the background, a complete shape can usually be recovered from the map, so using shape information to determine the attention area gives a better result.
2. A much clearer object shape is needed, because the shape is the key to a better attention area. However, the Gaussian pyramid used in Itti's model smooths the object's shape along with the rest of the image, which hampers the subsequent processing. This thesis uses a different Gaussian pyramid, in which the shape, size, and orientation of the Gaussian kernel adapt to the local image structure. Near a region boundary, the rectangular kernel becomes narrow and small and aligns with the boundary.
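The center-surround step at the start of this description can be illustrated with a minimal sketch. It is not the thesis's implementation: it works on a single intensity channel, uses a plain difference of two Gaussian blurs instead of a pyramid, and replaces the local WTA network with a global maximum. The function names and the values `sigma_center=1.0` and `sigma_surround=3.0` are assumptions chosen for illustration.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel, truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, replicating edge pixels."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable 2-D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))

def center_surround(image, sigma_center=1.0, sigma_surround=3.0):
    """Absolute difference of a fine (center) and a coarse (surround) blur."""
    center = gaussian_blur(image, sigma_center)
    surround = gaussian_blur(image, sigma_surround)
    return [[abs(c - s) for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(center, surround)]

def most_salient(response):
    """Global argmax as a simple stand-in for the WTA network (assumption)."""
    best = max((v, y, x) for y, row in enumerate(response)
               for x, v in enumerate(row))
    return best[1], best[2]
```

Applied to an image that is uniform except for one bright pixel, `most_salient(center_surround(image))` returns the position of that pixel, while a completely uniform image produces a response that is zero up to rounding error.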
Because the smoothing adapts to the boundary, a sharper object shape is obtained. Experiments show that the model focuses on the salient object much more accurately, especially when the object's intensity, color, and orientation are similar to the background but its slender shape can still be extracted. The computational cost remains moderate.
The visual attention model has many applications. In this thesis, the saliency map is used as the ROI mask for image compression. The image is linearly transformed with the 5/3 wavelet, and the ROI mask corresponding to the wavelet coefficients is then computed. SPIHT (Set Partitioning in Hierarchical Trees) classifies and sorts the masked wavelet coefficients to form the embedded code, which completes the image compression. The advantage of this ROI image compression model is that it works without human intervention.
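The wavelet and ROI-mask steps described above can be sketched in one dimension as follows. This is an illustration under stated assumptions, not the thesis's code: it uses the reversible integer lifting form of the 5/3 wavelet (as in JPEG 2000) with symmetric boundary extension, handles one decomposition level of an even-length signal, and does not show SPIHT coding. The function names are hypothetical.

```python
def forward_53(x):
    """One level of the reversible integer 5/3 lifting transform.

    Assumes len(x) is even. Returns (low, high) coefficient lists.
    """
    n = len(x) // 2
    even, odd = x[0::2], x[1::2]
    # Predict: detail = odd sample minus the floor-average of its even neighbours.
    high = [odd[i] - (even[i] + even[min(i + 1, n - 1)]) // 2 for i in range(n)]
    # Update: adjust even samples using the neighbouring details.
    low = [even[i] + (high[max(i - 1, 0)] + high[i] + 2) // 4 for i in range(n)]
    return low, high

def inverse_53(low, high):
    """Exact inverse of forward_53: undo the update, then the predict step."""
    n = len(low)
    even = [low[i] - (high[max(i - 1, 0)] + high[i] + 2) // 4 for i in range(n)]
    odd = [high[i] + (even[i] + even[min(i + 1, n - 1)]) // 2 for i in range(n)]
    x = [0] * (2 * n)
    x[0::2], x[1::2] = even, odd
    return x

def roi_mask_53(mask):
    """Map a sample-domain ROI mask to the coefficient domain.

    A coefficient is marked if the inverse transform needs it to rebuild
    at least one ROI sample.
    """
    n = len(mask) // 2
    low_mask, high_mask = [False] * n, [False] * n

    def need_even(i):
        # Rebuilding x[2i] uses low[i], high[i-1] (mirrored at 0), and high[i].
        low_mask[i] = True
        high_mask[i] = True
        high_mask[max(i - 1, 0)] = True

    for i in range(n):
        if mask[2 * i]:
            need_even(i)
        if mask[2 * i + 1]:
            # Rebuilding x[2i+1] uses high[i] and both even neighbours.
            high_mask[i] = True
            need_even(i)
            need_even(min(i + 1, n - 1))
    return low_mask, high_mask
```

With this mask, setting every unmarked coefficient to zero and applying `inverse_53` still reconstructs the ROI samples exactly, while samples outside the ROI are only approximated. A 2-D version would apply the same steps along rows and then columns at each decomposition level.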
Keywords/Search Tags:Visual attention, Target detection, Saliency map, ROI image compression