Research On Stereoscopic Image Quality Assessment Based On Visual Information Representation

Posted on: 2016-05-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Qi
GTID: 1108330503469633
Subject: Computer Science and Technology
Abstract/Summary:
The study of three-dimensional (3D) images and videos, which aim to provide an immersive visual experience, is important to the development of multimedia technology. With advances in image processing technology, a growing number of high-quality 3D movies are being produced, some major events are broadcast live on high-definition 3D television, and stereoscopic display devices have attracted a large market share. 3D technologies are gradually changing people’s lives. Stereoscopic image quality plays a significant role in human visual perception and the 3D experience. Because current 3D image technology is built on 2D image technology, traditional image quality problems such as distortion still arise in the acquisition, storage, transmission, and processing of stereoscopic images. In addition, 3D images introduce a new quality problem: the visual experience of the stereoscopic image, for example visual discomfort. Establishing the relationship between stereoscopic image features and human binocular visual perception is an attractive research topic in computer vision, with both theoretical and practical significance for the application and development of stereoscopic imaging. Because of the complexity of the human binocular visual system, stereoscopic image quality assessment has evolved into an active field at the intersection of psychophysical vision, image processing, and computer vision. Because people are the ultimate receivers of stereoscopic images, any image feature that affects visual perception directly affects people’s judgment of quality. Describing these features accurately is therefore important for developing stereoscopic image quality assessment that is consistent with human visual characteristics.

In studies of stereoscopic image quality assessment, researchers have used subjective experiments to identify some unique visual characteristics and perceptual factors of the human binocular visual system.
Several of them have been used in stereoscopic image quality assessment models, and some models have seen preliminary application in 3D techniques. However, stereoscopic image quality assessment still faces several problems: a lack of effective models of binocular visual perception, little attention to the quality problems unique to stereoscopic images, inadequate consideration of human binocular visual characteristics, low prediction performance, and high computational complexity in extracting visual characteristics. All of these hinder the development, application, and extension of 3D techniques. To develop a feature representation model for the human binocular visual system, this thesis explores the problems mentioned above and proposes several stereoscopic image quality assessment models. The detailed contents of this thesis are as follows.

First, previous image information representation models use the probability distribution of pixel values to compute the entropy of an image. However, according to physiological research on the human visual system (HVS), the perceptual unit of the HVS is not the pixel. How to model visual information efficiently is therefore an urgent problem in the study of stereoscopic image quality. Based on sparse coding theory, this thesis proposes a novel visual information representation model for stereoscopic images. The proposed model first trains a group of visual primitives from stereoscopic images and regards them as the visual perceptual units. Based on Shannon information theory, the probability distribution of the visual primitives is then used to calculate the visual information of a stereoscopic image, including the entropy of the left view, the entropy of the right view, and the mutual information between the two views.
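The entropy and mutual information quantities named above can be illustrated with a minimal sketch. It assumes, purely for illustration, that every patch in each view has already been assigned the index of one visual primitive; the abstract does not say how primitive usage is turned into a probability distribution, so the labels, function names, and sample data below are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(left, right):
    """I(L;R) = H(L) + H(R) - H(L,R) over co-located patch labels."""
    if len(left) != len(right):
        raise ValueError("both views must have the same number of patches")
    joint = list(zip(left, right))
    return entropy(left) + entropy(right) - entropy(joint)


# Hypothetical primitive indices for co-located patches in each view.
left_labels = [0, 0, 1, 1, 2, 2, 3, 3]
right_labels = [0, 0, 1, 1, 2, 2, 3, 0]

h_left = entropy(left_labels)      # entropy of the left view
h_right = entropy(right_labels)    # entropy of the right view
mi = mutual_information(left_labels, right_labels)  # shared information
```

Counting discrete primitive indices keeps the example short; a model built on sparse coding coefficients would need its own way of estimating the distribution, which this sketch does not attempt.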
Experimental results show that the proposed model is efficient in representing the visual information of stereoscopic images.

Second, most current visual comfort assessment models focus on the intensity and distribution of disparity, which is only one of the factors that induce discomfort. Obtaining accurate disparity information between the two views depends on the camera parameters, which are unavailable in many 3D applications. Besides disparity, there are many other discomfort-inducing factors. Based on the visual information representation model, this thesis focuses on two discomfort-inducing factors, namely accommodation-vergence conflict and binocular asymmetry, and proposes a visual comfort metric for stereoscopic images. In the proposed metric, the process of accommodation and vergence is represented by visual information, and binocular asymmetry is represented by image structure features of the two views. The relationship between the two discomfort-inducing factors and subjective comfort scores is then learned by training a support vector regression (SVR) model. Finally, the comfort score of a stereoscopic image is predicted by the trained SVR. Experimental results indicate that the proposed metric is efficient in visual comfort assessment for stereoscopic images and removes the dependence on camera parameters found in other visual comfort assessment metrics.

Third, traditional image quality metrics cannot evaluate stereoscopic images whose two views have different quality levels. Most current stereoscopic image quality metrics are full-reference methods, which limits their use in some 3D applications. Based on the visual information representation model, this thesis proposes a novel reduced-reference stereoscopic image quality metric to evaluate the quality of stereoscopic images with asymmetric distortion. The proposed metric first establishes a public visual primitive set from an image database.
Then, the probability distribution of the visual primitives’ coefficients is used to calculate the monocular and binocular visual information of the stereoscopic image, which represent the monocular and binocular cues in the HVS, respectively. Next, the difference between the visual information of the original and distorted images is taken as the perceptual loss vector. The relationship between the perceptual loss vector and subjective quality scores is then learned by training an SVR. Finally, stereoscopic image quality is predicted by the trained SVR. Experimental results show that the proposed metric is efficient in stereoscopic image quality assessment and achieves significantly higher prediction accuracy. It also addresses the problem of quality monitoring in a typical stereoscopic image communication system.

Fourth, many visual characteristics are involved in the perception of stereoscopic video, but traditional 2D just-noticeable-difference (JND) models take only one or two of them into account. Most previous stereoscopic video quality assessment (SVQA) metrics have low prediction performance. To address this problem, this thesis proposes an SVQA metric that incorporates SJND and stereoscopic visual attention. SJND describes various visual characteristics of the human binocular visual system and is used to represent the visual perception features of stereoscopic video. Stereoscopic visual attention is represented by integrating intra-frame saliency, inter-frame saliency, and binocular saliency. When computing the similarity of visual perception features between the reference and distorted stereoscopic videos, the features are combined with stereoscopic visual attention to improve the prediction performance of the proposed SVQA metric. To evaluate the proposed SVQA metric, a subjective experiment is conducted to establish a ground-truth database.
Experimental results show that the proposed SVQA metric achieves better performance than previous metrics.

Fifth, traditional 2D visual saliency detection models cannot detect depth features. Some existing 3D visual saliency detection models use a ground-truth disparity map to extract depth features, at high computational cost. However, a ground-truth disparity map is not always available. To address both the high computational cost and the dependence on a ground-truth disparity map, this thesis proposes an efficient 3D visual saliency detection model. The proposed model first detects a depth saliency map by applying a Log-Gabor filter to a generated disparity map. Then, the 2D visual saliency map and texture saliency map extracted from the left view are fused with the depth saliency map into a single 3D saliency map by a weighted linear combination (WLC) strategy. The final 3D visual saliency map is obtained by enhancing the result with a center-bias factor. Experimental results on a public eye-tracking database indicate that the proposed model achieves better detection performance than existing 3D visual saliency detection models and also reduces the computation time of 3D visual saliency detection.

In conclusion, this thesis makes in-depth studies of several issues in stereoscopic image quality assessment: distortion in stereoscopic video, stereoscopic visual attention, visual information representation for stereoscopic images, discomfort caused by stereoscopic images, and asymmetric distortion in stereoscopic images. The proposed schemes achieve good performance.
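The weighted linear combination and center-bias steps of the fifth contribution can be sketched roughly as follows. This is not the thesis’s implementation: the weights, the Gaussian form of the center bias, the multiplicative way it is applied, and all function names are assumptions chosen for illustration, and the three input maps are taken as already computed (the Log-Gabor depth saliency step is not shown).

```python
import math


def normalize(m):
    """Rescale a 2D map to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / span for v in row] for row in m]


def center_bias(height, width, sigma=0.3):
    """Gaussian weight peaking at the image center (assumed form)."""
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    bias = []
    for y in range(height):
        row = []
        for x in range(width):
            dy = (y - cy) / (sigma * height)
            dx = (x - cx) / (sigma * width)
            row.append(math.exp(-0.5 * (dx * dx + dy * dy)))
        bias.append(row)
    return bias


def fuse_saliency(sal_2d, sal_texture, sal_depth, weights=(0.4, 0.2, 0.4)):
    """Weighted linear combination of three maps, then center-bias weighting."""
    maps = [normalize(m) for m in (sal_2d, sal_texture, sal_depth)]
    h, w = len(sal_2d), len(sal_2d[0])
    bias = center_bias(h, w)
    fused = [
        [sum(wt * m[y][x] for wt, m in zip(weights, maps)) * bias[y][x]
         for x in range(w)]
        for y in range(h)
    ]
    return normalize(fused)


# Hypothetical 3x3 saliency maps, all peaking at the center pixel.
peak = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
saliency_3d = fuse_saliency(peak, peak, peak)
```

Each input is normalized before fusion so that the weights alone control the relative contribution of the 2D, texture, and depth cues.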
Keywords/Search Tags: visual information representation, visual comfort assessment, asymmetric distortion, stereoscopic video quality assessment, stereoscopic visual attention