| Traditional bulk RNA sequencing data reveals only the average level of variation across a large number of cells, ignoring cell heterogeneity. Single-cell RNA sequencing data has advanced functional genomics research from the tissue level to the single-cell level; it is a new type of data characterized by high dimensionality, high noise, and heterogeneity. Methods designed for bulk RNA sequencing data cannot be applied directly to single-cell RNA sequencing data, so new methods need to be developed. Gene regulatory analysis is one of the important ways to study single-cell RNA sequencing data. Its main tasks are cell subpopulation identification at the cell level and gene regulatory network construction at the gene level, which correspond to single-cell clustering methods and single-cell gene regulatory network construction methods, respectively. This paper studies gene regulatory analysis methods for single-cell RNA sequencing data, with the aim of identifying cell subpopulations and constructing gene regulatory networks more accurately. The main content of this paper is as follows: (1) To improve the quality of single-cell clustering, this paper develops a single-cell feature selection method based on convex analysis of mixtures (FSCAM). FSCAM first models the gene set as a convex set based on self-representation learning and derives the correspondence between cell-type-specific genes and the vertices of the convex set. The feature selection problem is thereby transformed into the problem of identifying the vertices of the convex set. Finally, an improved convex analysis of mixtures algorithm is used to identify these vertices. By embedding the Partitioning Around Medoids algorithm, this paper further develops a single-cell clustering method based on FSCAM (SCC_FSCAM). The proposed methods are tested on eight single-cell RNA sequencing datasets. The results show that the features selected by FSCAM are superior to those selected by
other methods in terms of relevance, redundancy, and completeness. Feature selection results on the embryonic development dataset show that FSCAM not only selects traditional features but also discovers new cell-type-specific genes. Finally, SCC_FSCAM is used to cluster these datasets, and the results show that it is superior to traditional single-cell clustering methods in terms of cell type number estimation, clustering accuracy, and clustering stability. (2) To alleviate the underestimation problem of the conditional cell-specific network method (c-CSN) in strongly connected networks, this paper proposes a partial cell-specific network method (p-CSN), which eliminates the singularity of strongly connected networks by implicitly considering the direct associations between genes. The cell-specific networks constructed by such methods differ from the cell-type-specific networks constructed by traditional methods: they do not require prior cell clustering and can construct a gene regulatory network for a single cell. To verify the validity of p-CSN, tests are performed on simulated datasets and seven single-cell RNA sequencing datasets. The results on the simulated datasets show that p-CSN alleviates the underestimation problem of c-CSN in strongly connected networks, thus reducing the false negative probability of edges in the constructed networks. The visualization and clustering results on real datasets show that the gene regulatory networks constructed by p-CSN differ more strongly between cell types. This paper also uses p-CSN to analyze changes in gene regulatory network topology during embryonic cell development, and the results verify the important role of the spliceosome pathway in early embryonic development. In addition, based on p-CSN, a single-cell network entropy method (scNEntropy) is proposed to quantify cell state. Results on two embryonic development datasets indicate that there are
significant differences in scNEntropy among different cell types. The pseudo-time of cells can be reconstructed based on scNEntropy. |
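
As an illustration of the clustering step mentioned in part (1), the following is a minimal sketch of the Partitioning Around Medoids algorithm in Python. It is not the SCC_FSCAM implementation: the function names (`pairwise_distances`, `total_cost`, `pam`), the Euclidean distance, the toy two-feature "cells", and the fixed number of clusters `k` are all assumptions made for illustration, and the FSCAM feature selection step is not shown.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix between all points (assumed metric)."""
    return [[math.dist(p, q) for q in points] for p in points]


def total_cost(dist, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def pam(dist, k):
    """Partitioning Around Medoids on a precomputed distance matrix.

    Returns (medoid indices, cluster label of each point).
    """
    n = len(dist)

    # BUILD phase: greedily add the medoid that lowers the total cost most.
    medoids = []
    for _ in range(k):
        candidates = [c for c in range(n) if c not in medoids]
        best = min(candidates, key=lambda c: total_cost(dist, medoids + [c]))
        medoids.append(best)

    # SWAP phase: replace a medoid with a non-medoid whenever this strictly
    # lowers the total cost; stop when no swap improves it.
    improved = True
    while improved:
        improved = False
        best_cost = total_cost(dist, medoids)
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best_cost:
                    medoids, best_cost, improved = trial, cost, True

    # Assign every point to its nearest medoid.
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
    return medoids, labels


# Toy data (hypothetical): six cells described by two selected features.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
medoids, labels = pam(pairwise_distances(points), k=2)
print("medoids:", medoids)
print("labels:", labels)
```

Because the medoids are actual data points, the same routine works with any dissimilarity matrix, so a different distance between cells could be substituted for the Euclidean one assumed here.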