| With the development of single-cell sequencing technology, single-cell multi-omics data are now widely used in biomedical research. Mining these data with deep learning algorithms makes it possible to explore cellular heterogeneity and complexity, and it offers an important perspective for understanding biodiversity and identifying rare cells. This marks a substantial advance in the study of complex genetic sequences and of biological inheritance. Integrated analysis of single-cell omics data provides an important basis for tumor analysis, the prevention of malignant cell mutations, biopharmaceutical development, and drug treatment, and it has become a hot spot in biological research. In this study, deep learning frameworks were used to analyze single-cell omics data. The main research contents are as follows: 1. We propose sctAGCN (Single-cell transcriptomics data classification via autoencoder and graph convolutional network), a single-cell classification algorithm based on semi-supervised learning. First, an autoencoder (AE) projects the high-dimensional single-cell transcriptome data into a low-dimensional space, which addresses the high-dimensionality problem. Second, the mutual nearest neighbor (MNN) algorithm finds the k nearest neighbors of each cell, and these are used to construct the adjacency matrix. Finally, a graph convolutional network (GCN) serves as the classifier for single-cell classification. Five datasets spanning different sequencing methods and species were used to evaluate the model. The results show that sctAGCN extracts single-cell information effectively and outperforms the other single-cell classification methods it was compared with. 2. We propose GCN-SC, a universal framework for integrating single-cell multi-omics data based on graph convolutional networks. Given multiple single-cell datasets, integration analysis usually selects the dataset with the largest number of cells as the reference and treats the rest as query datasets. First, scImpute is used to impute the single-cell transcriptome data, removing the false zero values introduced during sequencing. Second, MNN is used to identify cell pairs, which provide connections between cells both within and across the reference and query datasets. The GCN then uses the hybrid graph constructed from these cell pairs to adjust the count matrices of the query datasets. Finally, non-negative matrix factorization (NMF) is applied to the adjusted matrices together with the reference matrix to reduce dimensionality and complete the integration. Applied to six datasets, GCN-SC effectively integrated sequencing data across single-cell isolation technologies, species, sequencing methods, and omics types, and it outperformed state-of-the-art methods including Seurat, LIGER, GLUER and Pamona. |
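
The graph-construction and graph-convolution steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`knn_indices`, `mnn_adjacency`, `normalize_adjacency`, `gcn_layer`) are hypothetical, the matrix `Z` is a hand-made stand-in for an autoencoder embedding, the weight matrix is an untrained identity, and the symmetric normalization with self-loops is the commonly used GCN propagation rule, which the abstract does not specify.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbours of each row (brute force, Euclidean)."""
    X = np.asarray(X, dtype=float)
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    np.fill_diagonal(d, np.inf)                         # a cell is not its own neighbour
    return np.argsort(d, axis=1)[:, :k]

def mnn_adjacency(X, k):
    """Adjacency matrix keeping an edge i-j only if i and j are in each other's k-NN lists."""
    n = X.shape[0]
    nn = knn_indices(X, k)
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nn.ravel()] = 1.0     # directed k-NN graph
    return A * A.T                                      # mutual edges only

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (assumed propagation rule)."""
    A_hat = A + np.eye(A.shape[0])                      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_norm, H, W):
    """One graph-convolution layer: ReLU(A_norm @ H @ W)."""
    return np.maximum(A_norm @ H @ W, 0.0)

# Toy stand-in for a low-dimensional embedding: two well-separated groups of three cells.
Z = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

A = mnn_adjacency(Z, k=2)
A_norm = normalize_adjacency(A)
W = np.eye(2)                                           # untrained placeholder weights
H1 = gcn_layer(A_norm, Z, W)
```

In this toy example the mutual-neighbour filter leaves edges only within each group, so one propagation step averages each cell's features with those of its own group. A real classifier would stack such layers and train the weights on the labelled cells.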