Single-cell Transcriptome-based Perturbation Effect Evaluation And Automatic Cell Type Identification

Posted on: 2023-03-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Duan
GTID: 1520307316954839
Subject: Biology
Abstract/Summary:
Since the first single-cell transcriptome sequencing (scRNA-seq) technology was developed in 2009, the field has advanced rapidly. The biggest advantage of scRNA-seq is that it reveals the heterogeneity of cells in terms of gene expression, providing a powerful research tool for uncovering the precise biological circuits in an organism. However, scRNA-seq data are characterized by high noise, high heterogeneity, and high sparsity, which makes their analysis very difficult. In this thesis, three efficient analysis tools were developed for two types of scRNA-seq data: single-cell CRISPR screening data under artificial gene perturbation, and scRNA-seq data in the natural state. The main contributions of this thesis are as follows:

(1) We developed MUSIC, a quantitative analysis system for single-cell CRISPR screening data based on topic modeling. MUSIC links cellular genotype to phenotype while efficiently removing inherent noise, and quantitatively assesses the impact of each gene perturbation on cell function at single-cell resolution from three perspectives: ranking perturbations by their overall effect, ranking them by their effect on specific functional topics, and quantifying the relationships between different perturbations. In comprehensive evaluations and comparisons, MUSIC showed stable and superior performance, and it provides an effective analysis tool for elucidating the connection between gene function and biological circuits based on single-cell CRISPR screening technology.

(2) Besides artificially perturbed single-cell CRISPR screening data, the other data type is scRNA-seq data in the natural state. For scRNA-seq reference data with cell type labels, we developed scLearn, a method based on metric learning. scLearn can simultaneously identify cell types that exist in the reference dataset and discover potential novel cell types. In addition, given a scRNA-seq reference dataset with temporal state information, scLearn can simultaneously identify the cell type and the temporal state of newly sequenced cells. In comprehensive evaluations and comparisons, scLearn proved to be an excellent tool for automatic cell type identification and novel cell type discovery, which effectively promotes the development of scRNA-seq analysis.

(3) scLearn can perform automatic cell type identification with only one reference dataset. To overcome this limitation, this thesis further developed mtSC, a method based on multitask deep metric learning. mtSC can identify cell types using multiple scRNA-seq reference datasets of the same tissue, from the same species or across species, generated on multiple experimental platforms. mtSC expands the range of usable reference datasets, and its performance improves further as the number of integrated reference datasets increases, showing excellent application potential.

In conclusion, addressing two kinds of scRNA-seq data, this thesis developed three effective analysis tools: the quantitative gene perturbation assessment system MUSIC, the automatic cell type identification and novel cell type discovery system scLearn, and the multiple-reference automatic cell type identification system mtSC. The three tools belong to the same line of research and complement each other in function. They will effectively promote the development of scRNA-seq data analysis and have important theoretical and practical significance.
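
The reference-based identification idea in contribution (2) can be illustrated with a minimal Python sketch. This is not scLearn's algorithm: the abstract does not give its details, so a plain nearest-centroid rule with cosine similarity and a rejection threshold is swapped in as a stand-in. The function names, the toy expression values, and the threshold of 0.8 are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroids(reference, labels):
    """Average the expression vectors of the reference cells for each cell type label."""
    sums, counts = {}, {}
    for vec, lab in zip(reference, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [s + x for s, x in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def assign(query, cents, threshold=0.8):
    """Label each query cell with its most similar centroid.

    Cells whose best similarity falls below the (hypothetical) threshold
    are marked "unassigned", i.e. candidates for a novel cell type.
    """
    result = []
    for vec in query:
        best_label, best_sim = max(
            ((lab, cosine(vec, c)) for lab, c in cents.items()),
            key=lambda pair: pair[1],
        )
        result.append(best_label if best_sim >= threshold else "unassigned")
    return result


# Toy data (made up): rows are cells, columns are three genes.
reference = [[5, 0, 0], [4, 1, 0], [0, 5, 0], [0, 4, 1]]
labels = ["T cell", "T cell", "B cell", "B cell"]
query = [[6, 0, 0], [0, 3, 0], [0, 0, 7]]

predicted = assign(query, centroids(reference, labels))
print(predicted)
```

In this toy example the third query cell expresses only a gene that is nearly absent from both reference types, so it falls below the threshold and is left unassigned. A metric learning method would instead learn the similarity measure from the labeled reference rather than fixing it to cosine similarity as done here.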
Keywords/Search Tags: Single-cell transcriptome, Perturbation effect evaluation, Cell type identification, Data integration