| Single-cell RNA sequencing (scRNA-seq) is widely used to uncover the heterogeneity and dynamics of tissues, organisms and complex diseases. It provides a powerful tool for measuring gene expression precisely in tens of thousands of single cells and for deciphering cell heterogeneity and cell subsets. Driven by human health needs and the demand to diagnose, monitor and treat diseases, scRNA-seq studies have grown explosively in number and popularity, and each atlas project can release sequencing data for tens of thousands or even millions of single cells. Because scRNA-seq data contain a large amount of biological and technical noise, their analysis remains a major challenge. Imputation methods for scRNA-seq data have emerged to address the increasing dropout noise observed in these data and to improve the results of downstream analysis, and data imputation is currently one of the most active directions in scRNA-seq research. However, most current imputation methods lack stability in downstream analysis, and some do not substantially improve it at all. Thus, there is an urgent need for imputation methods that improve downstream analysis more reliably and efficiently. To this end, we present scRFR, a method for imputing scRNA-seq data based on the recursive feature reasoning network technique for image inpainting. The algorithm proceeds in six steps. First, it normalizes the scRNA-seq data and retains highly variable genes. Second, it transforms the expression data of each cell into a gray-scale image. Third, it generates a mask covering only the dropout locations of each cell, then randomly selects among the generated masks to mask the gray-scale image of each cell, yielding a large training set of images masked by random masks. Fourth, it trains the model with an inpainting loss function defined on the masks and fine-tunes the trained model. Fifth, once training is complete, the gray-scale image of each cell is masked by its own generated mask and input to the model for imputation. Finally, all images are reshaped back into gene expression matrices; during this step scRFR imputes only the original dropout regions and preserves the non-zero values of the raw data. For efficiency, scRFR is implemented in the PyTorch deep learning framework and uses GPU acceleration. Experiments on six datasets show that scRFR achieves higher imputation accuracy and lower time complexity than several popular scRNA-seq imputation algorithms, allowing efficient repair of dropout. In particular, scRFR-imputed data cluster better for cell typing than raw data, enhancing the efficacy of downstream analysis. |
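
The data-handling steps around the network (cell to gray-scale image, dropout mask, and write-back of imputed values only at dropout positions) can be illustrated with a minimal NumPy sketch. This is not the scRFR implementation: the function names `cell_to_image`, `dropout_mask` and `merge_imputed` are hypothetical, the per-cell max scaling and the mask convention (1 = observed, 0 = dropout) are assumptions, and the inpainting network is replaced by a constant stand-in prediction.

```python
import numpy as np


def cell_to_image(expr, side):
    """Zero-pad one cell's expression vector and reshape it to a side x side image.

    Assumption: intensities are scaled to [0, 1] by the cell's own maximum.
    Padding pixels are zero and would also count as holes in the mask below.
    """
    padded = np.zeros(side * side, dtype=np.float32)
    padded[: expr.size] = expr
    peak = padded.max()
    scale = peak if peak > 0 else 1.0
    return (padded / scale).reshape(side, side), scale


def dropout_mask(image):
    """Assumed convention: 1 marks observed pixels, 0 marks dropout (zero) pixels."""
    return (image > 0).astype(np.float32)


def merge_imputed(raw_expr, predicted_image, scale):
    """Keep raw non-zero values; fill only the zero positions from the prediction."""
    predicted = predicted_image.reshape(-1)[: raw_expr.size] * scale
    return np.where(raw_expr > 0, raw_expr, predicted)


# Toy cell with four genes, two of which are zero (treated here as dropout).
raw = np.array([0.0, 2.0, 0.0, 4.0], dtype=np.float32)
image, scale = cell_to_image(raw, side=2)
mask = dropout_mask(image)

# Stand-in for the inpainting network output (hypothetical constant image).
fake_prediction = np.full((2, 2), 0.25, dtype=np.float32)
imputed = merge_imputed(raw, fake_prediction, scale)
print(imputed)
```

In this toy run the two non-zero entries of `raw` pass through unchanged and only the two zero entries receive values from the stand-in prediction, which mirrors the abstract's statement that non-zero raw values are preserved.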