
Research On Single-cell Data Factor Analysis And Multi-omics Integration Algorithm Based On Deep Generative Model

Posted on: 2023-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: R Y Liu
Full Text: PDF
GTID: 2530306791481544
Subject: Bioinformatics
Abstract/Summary:
Cells are the basic units of life. Single-cell sequencing technology performs high-throughput sequencing of the genome, transcriptome, epigenome, and other omics at the cellular level, making it possible to study the laws of life activities at single-cell resolution. Single-cell transcriptome sequencing, the earliest of these technologies, has changed the understanding of traditional biology. In recent years, to overcome the limitation that a single omics layer is insufficient for a comprehensive understanding of biological systems, single-cell multi-omics sequencing technologies have emerged that can simultaneously measure multiple omics, such as the genome, proteome, and epigenome, in the same cells. These technologies provide a new opportunity to comprehensively reveal the heterogeneity between cells and to explore how different molecules interact within cells. As single-cell sequencing technology has matured, single-cell data analysis tools have developed alongside it, and dimensionality reduction has been widely studied as a key step. However, although a growing number of dimensionality reduction methods for single-cell data have been successful, they also face challenges such as poor interpretability, lack of adaptability, and insufficient robustness.

In response to these challenges, this thesis first proposes a factor analysis method for single-cell transcriptome data based on deep generative models. The method performs nonlinear dimensionality reduction by combining a deep generative model with factor analysis, which improves the interpretability of the dimensionality reduction results, and it selects the number of factors automatically by introducing a Beta process, which enhances the adaptability of the model. The method is applied to mouse embryonic developmental cell transcriptome data to evaluate how well the dimensionality reduction results retain biological information, and to perform downstream biological analyses such as cell type annotation and visualization of factors and loadings. The results show that the method can reduce the dimensionality of single-cell transcriptome data and performs better than other factor analysis methods. The automatically selected factors can be used to identify biological processes shared between different cell types, and the loadings can be used to identify genes that play a role in these processes.

Second, this thesis proposes a single-cell multi-omics data fusion analysis method based on a deep generative model. The method uses the generative model to jointly reduce the dimensionality of different omics data and removes batch noise through co-training of a generator and a discriminator, which improves the robustness of the model. The method is applied to single-cell transcriptome and proteome datasets of human peripheral blood mononuclear cells, where its dimensionality reduction results are evaluated for preserving biological heterogeneity and eliminating batch effects, and downstream biological analysis is performed. The results show that the method can perform batch correction and omics fusion simultaneously and strikes a good balance between preserving biological information and eliminating batch noise. The resulting low-dimensional representation combines the advantages of the different omics and can be used to explore the heterogeneity of rare cell types.
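The abstract states that a Beta process is introduced to select the number of factors automatically, but it does not give the construction. As a rough illustration only, the sketch below samples from a finite Beta-Bernoulli approximation that is commonly used in place of a Beta process: each of K candidate factors receives a usage probability, each cell switches factors on or off according to those probabilities, and factors that no cell uses are dropped. The function names, the truncation level `max_factors`, and the concentration `alpha` are hypothetical choices made for this example and are not taken from the thesis, where the factor usage would be inferred from data inside the deep generative model rather than drawn from the prior.

```python
import random


def sample_factor_mask(n_cells, max_factors, alpha, seed=0):
    """Sample a binary cell-by-factor mask from a finite Beta-Bernoulli prior.

    Assumed construction (not from the thesis):
      usage[k]   ~ Beta(alpha / max_factors, 1)   usage probability of factor k
      mask[n][k] ~ Bernoulli(usage[k])            does cell n use factor k?
    """
    rng = random.Random(seed)
    usage = [rng.betavariate(alpha / max_factors, 1.0) for _ in range(max_factors)]
    mask = [[1 if rng.random() < p else 0 for p in usage] for _ in range(n_cells)]
    return usage, mask


def active_factors(mask):
    """Return the indices of factors that are switched on in at least one cell."""
    n_factors = len(mask[0])
    return [k for k in range(n_factors) if any(row[k] for row in mask)]


usage, mask = sample_factor_mask(n_cells=200, max_factors=50, alpha=5.0)
kept = active_factors(mask)
print(f"{len(kept)} of {len(usage)} candidate factors are active")
```

Because most usage probabilities drawn from Beta(alpha / K, 1) are close to zero when K is large, only some of the candidate factors tend to end up active, which is the sense in which such a prior lets the effective number of factors be chosen by the model instead of being fixed in advance.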
Keywords/Search Tags: Single-cell analysis, Deep learning, Deep generative models, Factor analysis, Multi-omics fusion