Exploring functional brain activity and understanding cognitive mechanisms are major research topics in neuroscience. Functional magnetic resonance imaging (fMRI), a non-invasive method for whole-brain functional imaging, has played an important role in exploring functional brain activity. In recent years, the superior performance of deep neural networks in various domains has encouraged the combination of deep learning and fMRI research. Supervised learning, one of the most common forms of deep learning, requires large amounts of labeled data for training, which conflicts with the small size of typical fMRI datasets; moreover, labeling large amounts of data is costly and requires strong domain expertise. Self-supervised learning removes the need for data labeling. There is therefore a pressing need for a self-supervised learning framework applicable to small fMRI datasets in studies of cognitive function.

In this paper, we focus on the construction of a self-supervised learning framework, model pre-training, transfer to small datasets, and the interpretability of deep learning in fMRI decoding, as follows.

First, a self-supervised model that matches the intrinsic characteristics of the data is the key to effective feature extraction by neural networks. fMRI values at individual time points carry little meaning in isolation, whereas their changes over time reflect the functional activity of the brain. Given the continuity of neural states in the human brain, two neural signals that are close in time are more strongly correlated than two signals that are far apart in time. We therefore propose that the correlation between the middle and the end segments of an fMRI sequence is greater than the correlation between the beginning and the end segments, and on this basis design a time-domain contrastive loss function computed within a single fMRI sequence.

The model was pre-trained on five tasks from the Human Connectome Project (HCP) dataset of 1,034 participants. The pre-trained model was validated on a series of downstream tasks, and convergence was achieved by fine-tuning on the Motor and Relational task classifications using data from only 12 participants. Finally, we transferred the pre-trained model to a small dataset of unprocessed fMRI from 30 participants, achieving an accuracy of 80.2 ± 4.7% on classifying face versus house stimulation tasks.

In addition, to explore the interpretability of deep learning in fMRI decoding, this study further evaluated the model on the Multiple Domain Task Battery (MDTB) dataset, using each of seven functional brain networks as input and the performance of decoding cognitive tasks as the output. The results show that decoding with the visual network as input is comparable to decoding with the whole brain as input, whereas decoding with the limbic network is almost at chance level. To further validate these findings, the same paradigm was evaluated on the HCP dataset, and the results were consistent with those on MDTB.

In summary, this study constructs a self-supervised framework based on the time domain and validates the effectiveness of the proposed method by transferring it to small fMRI datasets. It also shows that the relationship between regional fMRI activity and cognitive tasks can be analyzed by using fMRI brain functional network data as input to the neural network, providing a new approach to the interpretability of deep learning in fMRI decoding.
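The time-domain contrastive objective described above, i.e., pulling the middle and end segments of one fMRI sequence together while pushing the beginning and end segments apart, might be sketched as a triplet-style hinge loss. The three-way split, the cosine similarity, the margin value, and the encoder are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def temporal_contrastive_loss(run, encoder, margin=0.5):
    """Triplet-style loss over one fMRI run of shape (time, voxels).

    The run is split into beginning / middle / end thirds. The end segment
    serves as the anchor; the middle segment (closer in time) is the
    positive, and the beginning segment (farther in time) is the negative,
    matching the assumption that temporally close signals are more correlated.
    """
    t = run.shape[0]
    beg = run[: t // 3]
    mid = run[t // 3 : 2 * t // 3]
    end = run[2 * t // 3 :]
    z_beg, z_mid, z_end = encoder(beg), encoder(mid), encoder(end)
    # Hinge: require sim(middle, end) to exceed sim(beginning, end)
    # by at least `margin`; otherwise the difference is penalized.
    return max(0.0, margin - cosine(z_mid, z_end) + cosine(z_beg, z_end))
```

As a toy usage, a mean-over-time "encoder" and a smooth random-walk signal stand in for the trained network and real BOLD data; in practice the encoder would be the learnable model being pre-trained.

```python
rng = np.random.default_rng(0)
run = np.cumsum(rng.standard_normal((60, 10)), axis=0)  # smooth synthetic signal
loss = temporal_contrastive_loss(run, encoder=lambda seg: seg.mean(axis=0))
```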
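The network-level interpretability analysis, i.e., decoding cognitive tasks from one functional network at a time (e.g., visual or limbic) and comparing the resulting accuracies against whole-brain decoding, can be reduced to masking the input features by network membership. The masking sketch below assumes parcel-level features with one network label per parcel; the parcellation, label names, and downstream classifier are hypothetical placeholders.

```python
import numpy as np

def restrict_to_network(data, parcel_network, network):
    """Keep only the parcels belonging to one functional brain network.

    data:           array of shape (samples, parcels) of fMRI features
    parcel_network: array of shape (parcels,) giving a network label per parcel
    network:        label of the network to keep (e.g. 'visual')

    Parcels outside the chosen network are zeroed, so the same decoder can be
    applied unchanged and accuracies compared across networks.
    """
    mask = (parcel_network == network).astype(data.dtype)
    return data * mask  # broadcasts the (parcels,) mask across samples
```

With this helper, the comparison in the paper amounts to running the same pre-trained decoder on `restrict_to_network(X, labels, net)` for each of the seven networks and on the unmasked `X`, then ranking the per-network accuracies.

```python
X = np.arange(12, dtype=float).reshape(3, 4)          # 3 samples, 4 parcels
labels = np.array(['visual', 'limbic', 'visual', 'limbic'])
X_visual = restrict_to_network(X, labels, 'visual')   # limbic parcels zeroed
```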