| In the era of big data, causal inference is a key method for understanding the phenomena and patterns behind data. However, performing reliable causal inference through data analysis remains a challenge in many fields. In bioinformatics, reverse engineering gene regulatory networks from gene sequencing data is one of the most typical causal inference problems. Constructing a gene regulatory network helps to clarify gene function and to identify key regulatory factors in gene expression, and thus to better explain the dynamic expression process and operating mechanism of genes. Traditional bulk transcriptome sequencing measures the average gene expression of a whole cell population, whereas single-cell sequencing, now the predominant technology, can simultaneously measure genome-wide expression levels in thousands of individual cells. Earlier gene regulatory network construction methods designed for bulk transcriptome sequencing data cannot handle the temporal ordering and high dimensionality of single-cell sequencing data. This thesis addresses these problems in processing single-cell sequencing data; the main research contents and results are as follows: (1) GRN-PAGATE, a gene regulatory network construction method based on partition-based graph abstraction and transfer entropy, is proposed. The method requires no prior knowledge. Starting from single-cell sequencing data, it constructs a gene regulatory network by performing trajectory inference, calculating transfer entropy to determine the direction of regulation, and applying statistical analysis, and it visualizes the network structure using Cytoscape. Experiments on the DREAM3 challenge dataset compared the method with existing gene regulatory network construction methods; the results show that its performance is roughly equal to that of the GRNTSTE method and superior to
that of the DynGENIE3 and SCRIBE methods. The method was then applied to a real single-cell sequencing dataset of rat embryos, and the process of constructing the final network is described in detail, demonstrating that the method can identify key gene regulatory relationships in practice with lower time complexity than the DynGENIE3 and TENET methods. (2) GRN-RNNSEM, a gene regulatory network construction method based on a recurrent neural network and a structural equation model, is proposed. It extracts features from single-cell sequencing data with a variational autoencoder and a recurrent neural network, and then constructs the gene regulatory network from those features with a structural equation model. Experiments were conducted on seven single-cell sequencing datasets of varying dimensions in the BEELINE framework and compared with existing methods. The results show that the method improves accuracy by 5% to 10% over the DeepSEM method across the datasets, shows partial improvement over the PIDC and GENIE3 methods, and improves significantly on the SCODE, PPCOR, and SINCERITIES methods. These results demonstrate that the method offers higher resolution and accuracy when constructing large-scale gene regulatory networks, and can also guide causal analysis of other large-scale time-series data. (3) A gene regulatory network construction system based on single-cell sequencing data was developed. The system encapsulates the GRN-PAGATE method proposed in Chapter 3 and, through a graphical interface, helps biologists analyze single-cell sequencing data, construct gene regulatory networks, and visualize the network structure. |
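
The abstract states that GRN-PAGATE uses transfer entropy to determine the direction of regulation. As a rough illustration of that idea only, and not the thesis's implementation, the sketch below computes a plug-in transfer entropy estimate between two expression series. It assumes the cells have already been ordered along a trajectory (for example by pseudotime), uses a history length of 1 and rank-based binning, and all function names, variable names, and toy data are hypothetical.

```python
import math
import random
from collections import Counter


def discretize(series, n_bins=2):
    """Map a numeric series to integer bins of roughly equal population (rank-based)."""
    order = sorted(range(len(series)), key=lambda i: series[i])
    labels = [0] * len(series)
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_bins // len(series)
    return labels


def transfer_entropy(source, target, n_bins=2):
    """Plug-in estimate of TE(source -> target) in bits, with history length 1.

    TE = sum over (y_next, y_now, x_now) of
         p(y_next, y_now, x_now) * log2( p(y_next | y_now, x_now) / p(y_next | y_now) )
    """
    x = discretize(source, n_bins)
    y = discretize(target, n_bins)
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))  # (y_next, y_now, x_now)
    pairs_yx = Counter(zip(y[:-1], x[:-1]))        # (y_now, x_now)
    pairs_yy = Counter(zip(y[1:], y[:-1]))         # (y_next, y_now)
    singles_y = Counter(y[:-1])                    # y_now
    n = len(y) - 1
    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        p_cond_full = count / pairs_yx[(y_now, x_now)]
        p_cond_self = pairs_yy[(y_next, y_now)] / singles_y[y_now]
        te += p_joint * math.log2(p_cond_full / p_cond_self)
    return te


# Toy data (assumption): the target gene follows the regulator with a lag of one step.
rng = random.Random(0)
regulator = [rng.random() for _ in range(2000)]
target_gene = [0.0] + [r + 0.05 * rng.random() for r in regulator[:-1]]

te_forward = transfer_entropy(regulator, target_gene)   # regulator -> target
te_backward = transfer_entropy(target_gene, regulator)  # target -> regulator
print(f"TE(regulator -> target) = {te_forward:.3f} bits")
print(f"TE(target -> regulator) = {te_backward:.3f} bits")
```

In this toy setting the forward estimate is much larger than the backward one, which is the asymmetry a transfer-entropy-based method can use to orient an edge. A real pipeline would additionally need a significance test on each score, which is presumably the role of the statistical analysis step mentioned above.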