
Research on an fMRI Visual Information Deep Neural Network Encoding Model Based on Feature Fusion

Posted on: 2021-04-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Cui
GTID: 2370330647957266
Subject: Cyberspace security
Abstract/Summary:
Vision is reported to account for about eighty percent of the information that human beings obtain, and the brain's visual system is both highly robust and highly efficient. How to understand and simulate this system is a popular question at the intersection of computer science and neuroscience. Functional magnetic resonance imaging (fMRI), with its good temporal and spatial resolution, has become one of the main non-invasive methods for studying the human brain. An fMRI-based visual encoding model is a mathematical model built on the brain's visual perception mechanism: it simulates the brain's visual information processing and predicts the neural response evoked when the eyes receive a visual stimulus. Deep neural networks (DNNs) were originally inspired by the brain's visual system, and their ability to express nonlinear features gives them a particular advantage in visual encoding. Because visual information processing in the brain is complex, a key problem in improving the prediction performance of visual encoding models is how to construct DNNs that extract and fuse many kinds of nonlinear visual features. Taking into account how information processing differs across visual regions of interest (ROIs), and drawing on the nonlinear feature extraction and learning ability of DNNs, this thesis studies and proposes visual encoding models from three perspectives: "hand-crafted feature selection", "fusion of hand-crafted features and deep network features", and "fusion of deep network features and visual attention features". The main work is as follows:

(1) A low-level visual encoding model based on a Gabor wavelet pyramid (GWP) with dense Gabor features is proposed. The local variations in spatial frequency, orientation, and position of visual stimuli in natural images are very complex, and it is unclear whether the feature space of the classical GWP model expresses them fully. We therefore propose a Dense-GWP visual encoding model. We first analyzed how densifying each of the three features (spatial frequency, orientation, and position) affects the encoding performance of the GWP model. The experimental results showed that densifying the orientation and position features improves the encoding performance of the GWP model in the primary visual cortex. Compared with the original GWP model, the Dense-GWP model improves prediction accuracy in V1, V2, and V3 by 5.43%, 4.87%, and 1.1%, respectively.

(2) A low-level visual encoding model based on GaborNet-VE is proposed. Encoding models based on hand-crafted features are highly interpretable but not accurate enough, while encoding models based on deep network features have the opposite profile. To combine the advantages of both, the GaborNet visual encoding (GaborNet-VE) model was built by fusing Gabor features with deep network features. GaborNet-VE is a lightweight end-to-end regression model composed of one Gabor convolution layer, two conventional convolution layers, and one fully connected layer. The key idea is to replace the conventional convolution kernels in the first convolution layer with Gabor convolution kernels whose parameters are learnable. The experimental results showed that this model achieved the best prediction performance for low-level visual areas among the compared models. In addition, the visualization results showed that the learned visual features and the estimated receptive fields vary regularly across visual ROIs. The GaborNet-VE model therefore not only improves prediction performance but also offers some advantages in biological interpretability.

(3) An encoding model for middle- and high-level visual areas based on ResNet-CBAM is proposed. Attention mechanisms are ubiquitous in the brain's visual system and are particularly important for information processing in the middle- and high-level visual ROIs. Most current encoding models for these areas use DNNs with the same structure as those used for low-level visual areas and ignore the influence of visual attention, which is an important reason for their low prediction performance. This thesis introduces an attention mechanism into the encoding model for middle- and high-level visual areas. The multi-dimensional convolutional block attention module (CBAM) was embedded in a deep residual network (ResNet), and the resulting network was pre-trained to obtain ResNet-CBAM. A feature space was then constructed from the intermediate-layer features of ResNet-CBAM, and voxel responses were predicted linearly from it by ridge regression. The experimental results showed that, compared with the encoding model based on ResNet, the encoding model based on ResNet-CBAM significantly improves encoding performance in V4 (a middle-level visual area) and PPA (a high-level visual area).
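
As an illustration of the Gabor features that contributions (1) and (2) build on, the following is a minimal NumPy sketch of a Gabor filter bank, not the thesis implementation. The function names `gabor_kernel` and `gabor_bank`, the kernel size, the wavelengths, and the choice `sigma = 0.5 * wavelength` are all assumptions made for the example. It only shows what sampling orientation more densely means for the size of the feature space; densifying position, which the abstract also mentions, is not shown.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, phase=0.0, gamma=1.0):
    """Real Gabor kernel: a cosine carrier under a Gaussian envelope.

    All parameter names are illustrative, not taken from the thesis.
    """
    half = (size - 1) / 2.0
    coords = np.arange(size) - half
    x, y = np.meshgrid(coords, coords)
    # Rotate the coordinate frame so the carrier oscillates along theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength + phase)
    return envelope * carrier

def gabor_bank(size, wavelengths, n_orientations):
    """Stack kernels for every (wavelength, orientation) pair."""
    thetas = np.arange(n_orientations) * np.pi / n_orientations
    kernels = [
        gabor_kernel(size, w, t, sigma=0.5 * w)  # assumed bandwidth rule
        for w in wavelengths
        for t in thetas
    ]
    return np.stack(kernels)

# A coarse bank versus one with orientation sampled four times as densely.
coarse = gabor_bank(size=15, wavelengths=[4.0, 8.0], n_orientations=4)
dense = gabor_bank(size=15, wavelengths=[4.0, 8.0], n_orientations=16)
```

In a learnable variant such as the Gabor convolution layer described in (2), parameters like `wavelength`, `theta`, and `sigma` would be trained rather than fixed; how the thesis parameterizes them is not stated in the abstract.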
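
The final step in (3), linear prediction of voxel responses from a network feature space by ridge regression, can be sketched as follows. This is a minimal example on synthetic data under stated assumptions: the random matrices stand in for intermediate-layer features and fMRI voxel responses, the regularization strength `alpha=1.0` is arbitrary, no intercept is fitted, and the use of the Pearson correlation between predicted and measured responses as the accuracy measure is an assumption, since the abstract does not say which metric was used. The names `fit_ridge` and `voxelwise_correlation` are hypothetical.

```python
import numpy as np

def fit_ridge(features, responses, alpha):
    """Closed-form ridge regression: W = (X^T X + alpha I)^-1 X^T Y."""
    n_features = features.shape[1]
    gram = features.T @ features + alpha * np.eye(n_features)
    return np.linalg.solve(gram, features.T @ responses)

def voxelwise_correlation(predicted, measured):
    """Pearson correlation per voxel (one value per column)."""
    p = predicted - predicted.mean(axis=0)
    m = measured - measured.mean(axis=0)
    denom = np.sqrt((p ** 2).sum(axis=0) * (m ** 2).sum(axis=0))
    return (p * m).sum(axis=0) / denom

# Synthetic stand-ins for network features and voxel responses.
rng = np.random.default_rng(0)
n_train, n_test, n_features, n_voxels = 200, 50, 20, 5
true_weights = rng.normal(size=(n_features, n_voxels))
train_x = rng.normal(size=(n_train, n_features))
test_x = rng.normal(size=(n_test, n_features))
train_y = train_x @ true_weights + 0.1 * rng.normal(size=(n_train, n_voxels))
test_y = test_x @ true_weights + 0.1 * rng.normal(size=(n_test, n_voxels))

# Fit on training stimuli, then score predictions on held-out stimuli.
weights = fit_ridge(train_x, train_y, alpha=1.0)
accuracy = voxelwise_correlation(test_x @ weights, test_y)
```

In practice `alpha` would typically be chosen per voxel or per ROI by cross-validation on the training stimuli.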
Keywords/Search Tags:Visual encoding model, functional magnetic resonance imaging, feature fusion, deep neural network, Gabor wavelet, attention mechanism