Stereo images are now widely used in entertainment, medicine, the military, and other fields because of the strong sense of immersion they provide. High-quality stereo images give viewers a better visual experience. However, many factors in the stereo imaging process can degrade image quality and cause visual discomfort or visual fatigue. Obtaining high-quality stereo images has therefore always been a core problem in stereo imaging, and judging the quality of a stereo image is the key to solving it. A good stereo image quality assessment method can not only judge image quality but also guide the improvement and optimization of the related stereo imaging technology. To construct a stereo image quality assessment model consistent with subjective human visual perception, this thesis first introduces the human visual information perception process in detail and then proposes a data preprocessing scheme and two objective evaluation models based on visual pathway perception. The main contents of this thesis are as follows:

(1) To reflect the differing attention that the human eye pays to different regions of an image, this thesis designs an image cropping scheme guided by visual saliency. The scheme not only segments the image according to importance but also addresses the loss of interaction information between regions that occurs in traditional image cropping, providing a feasible scheme for data preprocessing.

(2) The processing of visual information in the visual pathway is complex. This thesis examines in depth the information processing mechanisms of the different sub-regions of the visual pathway and designs a two-branch convolutional neural network that simulates the functions of these sub-regions, thereby realizing the functions of the "what" pathway and the "where" pathway. This method is called the fine-grained implementation of the visual pathway. In addition, rather than simply concatenating the feature maps of the left and right views along the channel dimension and then extracting features with 2D convolution, the method uses 3D convolution to extract features and fuse the information of the two views.

(3) Rather than being confined to the information processing mechanism of each sub-region, this thesis also abstracts the visual pathway, from a macroscopic point of view, as multi-level, multi-scale feature extraction and fusion with a convolutional network; this is called the coarse-grained implementation of the visual pathway. The network consists of the optic chiasm and lateral geniculate body, a multi-level multi-scale feature extraction and fusion module, and a regression network. The optic chiasm and lateral geniculate body are used for binocular information exchange, and the multi-level multi-scale feature extraction and fusion module extracts fused visual features at different levels and scales. This module is mainly composed of a convolutional layer and a two-way perceptual alignment attention module.

Experiments are carried out on public datasets. The results show that the two proposed methods agree more closely with subjective human visual perception than other methods and generalize well.
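
As a rough illustration of the 3D-convolution fusion mentioned in (2), the following minimal sketch stacks one left-view and one right-view feature map along a new view axis and slides a single 3D kernel over the resulting volume. It is not the network from this thesis: the single-channel layout, the function names `conv3d_valid` and `fuse_views_3d`, the toy feature maps, and the hand-set kernel weights are all assumptions made for illustration, and a naive NumPy loop stands in for a learned convolution layer.

```python
import numpy as np


def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D cross-correlation of a (D, H, W) volume with a (kd, kh, kw) kernel."""
    D, H, W = volume.shape
    kd, kh, kw = kernel.shape
    out = np.zeros((D - kd + 1, H - kh + 1, W - kw + 1))
    for d in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                window = volume[d:d + kd, i:i + kh, j:j + kw]
                out[d, i, j] = np.sum(window * kernel)
    return out


def fuse_views_3d(left, right, kernel):
    """Stack left/right maps along a view axis and fuse them with one depth-2 3D kernel."""
    volume = np.stack([left, right], axis=0)   # shape (2, H, W)
    # With kernel depth 2 the view axis shrinks to size 1, so drop it.
    return conv3d_valid(volume, kernel)[0]


# Hypothetical toy feature maps: the right view is a mirrored copy of the left.
left = np.arange(16, dtype=float).reshape(4, 4)
right = left[:, ::-1].copy()

# Hand-set kernel (an assumption, not learned weights): centre pixel of the
# left view minus centre pixel of the right view, i.e. a crude disparity cue.
kernel = np.zeros((2, 3, 3))
kernel[0, 1, 1] = 1.0
kernel[1, 1, 1] = -1.0

fused = fuse_views_3d(left, right, kernel)
print(fused.shape)  # spatial size shrinks by kernel size minus one per axis
```

In this single-channel, unpadded case a depth-2 kernel gives the same result as a 2D convolution over the two concatenated views; the two approaches differ once the stacked volume is deeper than the kernel or is padded along the view axis.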