In recent years, target detection systems based on computer vision have been widely used in civil and military fields because of their high detection accuracy and robustness. However, such methods depend heavily on image prior information and target characteristics, which makes them poorly portable. Moreover, they have a high false detection rate, especially when the image resolution is low or the target is deformed or occluded. It is therefore necessary to provide the target detection system with manually annotated difficult samples so that the system can learn target identification capabilities comparable to those of humans. Manual annotation is commonly performed with a keyboard, mouse, or other input device, which limits annotation efficiency to some extent. Common non-contact annotation methods are based on eye movement or EEG data. The eye movement signal is a time series of the gaze position of the human eye, and the fixation events in it can be used to analyze the viewer's cognitive information. The fixation-related potential (FRP) is an electroencephalogram (EEG) component that is time-locked to the fixation and reflects cognitive behavior. Eye movement signals and FRP components evoked by target and non-target visual attention differ significantly in their temporal waveforms. Single-trial FRPs can be obtained by synchronously recording and preprocessing eye movement and EEG data while difficult samples are being labeled manually. Classifying the single-trial FRPs and gaze sequences yields the label information, and this cognitive information can improve annotation efficiency without manual input.

In recent years, deep learning has been favored by most researchers in the field of visual attention detection because it learns latent features from data automatically and supports end-to-end training. At present, the
deep models for visual attention detection based on eye movement do not take multi-scale features or convolution channel weights into account. For visual attention detection based on EEG, deep learning classification models have been widely studied in experimental paradigms based on motor imagery (MI) and steady-state visually evoked potentials (SSVEP), but only a small number of studies use the guided search paradigm. In those studies, the visual search stimulus paradigm relies on guide information, and the classification algorithms are traditional shallow machine learning models. Classification of fixation-related potentials with deep learning models has received little study, and online classification of fixation-related potentials even less.

To address these problems, this paper studies visual attention detection from eye movement and EEG under a free search visual stimulus paradigm, improves single-trial classification accuracy with deep learning methods, and studies an online visual attention detection system based on single-trial FRP classification. The paper covers the following four aspects of research:

1. Existing research on visual attention detection based on eye movement does not fully extract the features of the gaze sequence at different scales and does not consider the relative importance of each convolution channel. To address this, a multi-scale residual network model with an embedded SE module, ResFix_SE, is proposed as a feasible solution for visual attention detection. An experiment is designed to verify the proposed model. First, eye movement signals of subjects browsing images are collected with a non-contact eye tracker, and gaze sequences are extracted with an adaptive threshold algorithm. Second, the proposed multi-scale residual network model is used to classify the gaze sequences. The shallow models SVM and kNN and the deep models InceptionTime and ConvLSTM were used for comparison to verify the validity of the proposed
model in terms of classification AUC. The experimental results show that the proposed multi-scale residual network model significantly improves gaze sequence classification by learning features of the gaze sequence at different scales and combining them with the feature channel weight redistribution module.

2. To overcome the limitation of using the guided search paradigm in EEG-based visual attention detection, a free viewing stimulus paradigm is designed, and eye movement and EEG data are collected and preprocessed to obtain single-trial fixation-related potentials. First, a synchronous eye movement and EEG acquisition platform is built from the laboratory's existing equipment, and free viewing stimulus paradigm experiments are designed to collect eye movement and EEG data while subjects freely search the pictures. Second, the collected data are preprocessed: the eye movement and EEG data are synchronized through synchronous and offline triggers, valid gaze information is extracted, and the EEG is filtered and cleaned of artifacts. The EEG data are then segmented by valid fixations, baseline-corrected, and screened by threshold to extract each subject's valid single-trial FRP data.

3. To address the problem that single-trial FRP classification models do not fully consider both the first-order and second-order statistical features of EEG data, a single-trial FRP classification model that integrates convolutional and deep Riemannian networks is proposed. The convolutional neural network (CNN) is a typical deep learning model and a typical method for learning first-order statistical features (such as the mean). EEG classification based on convolutional neural networks is an active research topic and has achieved some results. The covariance is an estimate of the signal's autocorrelation function and is a second-order statistical feature of the signal. The deep Riemannian network
can automatically extract and classify second-order signal features. However, some researchers have found that the low signal-to-noise ratio (SNR) of raw EEG data prevents the deep Riemannian network from reaching its full classification performance. To address this, this paper proposes an improved scheme that combines a convolutional neural network with a Riemannian network, which improves the accuracy of the EEG classification model. The experimental results show that the model fusing the two kinds of features significantly improves the accuracy of single-trial FRP classification.

4. A visual attention detection system based on eye movement and EEG is built and verified experimentally. The online detection system works as follows under the free viewing visual stimulus paradigm. First, eye movement and EEG data are collected simultaneously during the online experiment, and valid single-trial FRP signals are obtained by preprocessing the collected data. Second, the trained EEG classification model, which fuses a convolutional neural network and a deep Riemannian network, classifies the single-trial FRP signals. Finally, the target of visual attention is located in the image using the position information of the corresponding fixation event. In addition, the visual attention system based on eye movement and EEG is verified and evaluated, and the results show that its visual attention detection is accurate and robust.
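
The segmentation, baseline correction, threshold screening, and covariance (second-order feature) steps described above can be illustrated with a short sketch. This is a minimal NumPy example under stated assumptions, not the implementation used in this work: the function names, the epoch window (-0.2 s to 0.8 s around fixation onset), the 100 µV rejection threshold, the 250 Hz sampling rate, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def extract_frp_epochs(eeg, fixation_onsets, sfreq,
                       tmin=-0.2, tmax=0.8, reject_uv=100.0):
    """Cut fixation-locked epochs from continuous EEG (hypothetical helper).

    eeg: array (n_channels, n_samples), assumed to be in microvolts.
    fixation_onsets: sample indices of fixation onsets (assumed already
    synchronized with the EEG clock).
    """
    pre = int(round(-tmin * sfreq))   # samples in the pre-fixation baseline
    post = int(round(tmax * sfreq))   # samples after fixation onset
    epochs = []
    for onset in fixation_onsets:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > eeg.shape[1]:
            continue  # fixation too close to the edge of the recording
        seg = eeg[:, start:stop].astype(float)
        # Baseline correction: subtract each channel's pre-fixation mean.
        seg = seg - seg[:, :pre].mean(axis=1, keepdims=True)
        # Threshold screening: drop epochs with large-amplitude artifacts.
        if np.abs(seg).max() > reject_uv:
            continue
        epochs.append(seg)
    if not epochs:
        return np.empty((0, eeg.shape[0], pre + post))
    return np.stack(epochs)

def trial_covariances(epochs):
    """Per-trial channel covariance: one symmetric matrix per epoch."""
    centered = epochs - epochs.mean(axis=2, keepdims=True)
    return centered @ centered.transpose(0, 2, 1) / (epochs.shape[2] - 1)

# Synthetic demonstration data (assumption: 8 channels, 250 Hz, 20 s).
rng = np.random.default_rng(0)
eeg = rng.normal(scale=5.0, size=(8, 5000))
eeg[:, 3000:3010] += 500.0            # simulated artifact
onsets = [50, 1000, 2000, 3000, 4950]  # simulated fixation onsets

epochs = extract_frp_epochs(eeg, onsets, sfreq=250)
covs = trial_covariances(epochs)
```

In this toy run, the fixation at sample 3000 is rejected by the amplitude threshold and the one at sample 4950 is skipped because its window runs past the end of the recording, so three epochs remain. The resulting covariance matrices are the kind of second-order input a Riemannian-geometry model operates on, while the baseline-corrected epochs themselves are what a convolutional model would take as input.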