Vision is the primary way for humans to perceive the world, and video plays an important role in people's lives. However, video quality inevitably degrades during acquisition, transmission, and processing, which affects how people perceive and understand visual information. It is therefore of great significance to design an effective video quality assessment (VQA) method to characterize visual information and to optimize video quality. As the ultimate receiver of visual information, the human visual system (HVS) is diverse and complex, and it is difficult for existing mathematical models to accurately reflect the perceived quality of a video. Therefore, by collecting the electroencephalography (EEG) signals induced in subjects while they watch videos, this thesis analyzes the relationship between potential changes in the EEG signals and visual characteristics, according to the perceptual differences in the EEG signals evoked by distorted videos with different spatial complexities, distortion ranges, and color saturations. In addition, the multi-scale spatial and temporal characteristics of the quality-aware EEG signals are extracted with convolutional neural networks (CNN) and bi-directional long short-term memory networks (BLSTM) to establish a quality classification model. The main research contents of this thesis are as follows.

1) A video perceptual quality assessment method based on EEG signals and spatial distortion is proposed. Global and local distortions affect human perception of video quality in different ways, and video contents with different spatial information complexities also affect quality perception. Therefore, based on the spatial distortion characteristics of video and on EEG signals, this thesis studies how two factors, the spatial complexity of the video and the range of the distorted region, influence the perception of video quality. EEG signals are collected while subjects watch videos with different spatial complexities and distortion ranges at several distortion levels, and the EEG components that characterize the perception of video quality are analyzed and classified. The subjects' behavioral data are then correlated with the EEG classification results to obtain video quality assessment results that are consistent between subjective scores and objective indices. The experimental results show that spatial complexity affects the perception of video distortion: subjects perceive distortion more easily in videos with low spatial complexity. The size of the distorted region also affects perception: subjects perceive distortion more easily when the distorted region is large.

2) A video perceptual quality assessment method based on EEG signals and saturation is proposed. Videos with different saturations affect human perception of video quality and thus the resulting quality assessments. Therefore, based on EEG signals, this thesis studies the influence of video saturation on video quality perception. First, the saturation of the selected videos is reduced and quality levels near the perceptual threshold are chosen. EEG signals are recorded while subjects watch videos with different saturations and distortion levels, and the EEG components that represent the perception of video quality are analyzed and classified. The subjects' behavioral data are then correlated with the EEG classification results to obtain video quality assessment results that are consistent between subjective scores and objective indices. The experimental results show that video saturation affects distortion perception: subjects perceive distortion more easily in videos with low saturation.
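To make the analysis pipeline of the first two studies concrete, the following is a minimal illustrative sketch, not the thesis' actual pipeline: the continuous EEG is epoched around each distortion-onset event and the quality-aware responses are classified with a simple linear classifier standing in for whatever classifier the thesis uses. The sampling rate, epoch window, array shapes, and the event and label arrays are all hypothetical placeholders.

```python
# Illustrative sketch only (not the thesis' pipeline): epoch continuous EEG
# around each distortion-onset event and classify "distortion perceived"
# vs. "not perceived" with a simple linear classifier.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

FS = 250                      # assumed sampling rate in Hz
PRE, POST = 0.2, 0.8          # epoch window: 200 ms before to 800 ms after onset

def epoch_eeg(raw, onsets_s):
    """Cut fixed-length epochs (channels x time) around each onset (in seconds)."""
    pre, post = int(PRE * FS), int(POST * FS)
    epochs = []
    for t in onsets_s:
        i = int(t * FS)
        seg = raw[:, i - pre:i + post]
        seg = seg - seg[:, :pre].mean(axis=1, keepdims=True)  # baseline correction
        epochs.append(seg)
    return np.stack(epochs)   # (n_trials, n_channels, n_times)

# Hypothetical data: 64-channel continuous EEG, trial onsets, behavioral labels.
raw = np.random.randn(64, 60 * FS)
onsets = np.arange(1.0, 55.0, 1.5)
labels = np.random.randint(0, 2, size=len(onsets))   # 1 = distortion perceived

X = epoch_eeg(raw, onsets).reshape(len(onsets), -1)  # flatten channels x time
clf = make_pipeline(StandardScaler(), LinearDiscriminantAnalysis())
print("cross-validated accuracy:", cross_val_score(clf, X, labels, cv=5).mean())
```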
3) A video quality classification method based on EEG signals and a spatial-temporal multi-scale neural network is proposed. Existing deep learning methods extract EEG features at a single scale and lack multi-scale feature extraction for the spatial-temporal characteristics of the P300 component, which reflects visual perception in EEG signals. Therefore, in this thesis, multi-scale convolutions in a CNN-based spatial feature extraction network are used to extract multi-scale spatial features. The EEG samples are sliced in the time domain, and multi-scale temporal features are extracted by a BLSTM-based temporal feature extraction network. The multi-scale temporal and spatial features are fused to obtain the final feature representation of the EEG signals, from which the video quality classification results are obtained. The experimental results show that the proposed method extracts a multi-scale spatial-temporal EEG representation consistent with the characteristics of human visual perception and cognition, and improves the classification accuracy of video quality.
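The following is a rough PyTorch sketch of the kind of spatial-temporal multi-scale network described in 3); it is not the thesis' exact architecture. Parallel convolution branches with different kernel sizes stand in for the multi-scale spatial feature extraction, a BLSTM over the time course stands in for the multi-scale temporal feature extraction, and the two feature sets are concatenated and classified. All layer widths, kernel sizes, and the input shape are assumptions.

```python
# Rough sketch of a spatial-temporal multi-scale EEG classifier (assumed
# shapes and layer sizes; not the thesis' exact network).
import torch
import torch.nn as nn

class MultiScaleEEGNet(nn.Module):
    def __init__(self, n_channels=64, n_times=250, n_classes=2,
                 kernel_sizes=(7, 15, 31), hidden=64):
        super().__init__()
        # Multi-scale convolution branches (1-D conv over time, mixing all EEG
        # channels), one branch per kernel size.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(n_channels, hidden, k, padding=k // 2),
                nn.BatchNorm1d(hidden),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),   # pool each branch to one vector
            )
            for k in kernel_sizes
        ])
        # BLSTM over the time course (each time step is one multi-channel sample).
        self.blstm = nn.LSTM(input_size=n_channels, hidden_size=hidden,
                             batch_first=True, bidirectional=True)
        fused = hidden * len(kernel_sizes) + 2 * hidden
        self.classifier = nn.Linear(fused, n_classes)

    def forward(self, x):                 # x: (batch, n_channels, n_times)
        spatial = [b(x).squeeze(-1) for b in self.branches]   # each (batch, hidden)
        _, (h, _) = self.blstm(x.transpose(1, 2))             # h: (2, batch, hidden)
        temporal = torch.cat([h[0], h[1]], dim=1)             # (batch, 2*hidden)
        feats = torch.cat(spatial + [temporal], dim=1)        # fuse all scales
        return self.classifier(feats)

# Smoke test on a random batch of EEG epochs (placeholder shapes).
model = MultiScaleEEGNet()
logits = model(torch.randn(8, 64, 250))
print(logits.shape)   # torch.Size([8, 2])
```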