| Mammalian tissues are highly complex systems composed of up to trillions of cells that differ in type, time, and space, yet coordinate with one another to form unique microenvironments that maintain organ function, process information, and thereby determine cell identity. The many functional cell types, together with developmental (temporal) and regional (spatial) differences, are a major source of transcriptional heterogeneity in mammalian organ tissues. Therefore, probing the functional identity and determining the spatial location of individual cells will contribute to a better understanding of complex organ tissues. Single-cell RNA sequencing (scRNA-seq) technology can capture transcriptome expression levels in tissues at single-cell resolution, but it loses the spatial location of cells. To remedy this deficiency, researchers have proposed spatially resolved transcriptomics (SRT) technology. However, most existing SRT techniques still struggle to achieve single-cell resolution or fail to capture sufficient gene reads. Considering the complementary characteristics of these two techniques, researchers have proposed addressing the limitations of SRT by combining single-cell transcriptome data with computational methods. One mainstream class of algorithms is deconvolution. It uses single-cell transcriptome data from the same species/tissue to resolve the cellular composition of individual capture sites in data generated by lower-resolution spatial transcriptome (ST) technology, thus enabling the localization of individual cells. However, deconvolution methods have difficulty detecting cell types defined by sparse or ambiguous gene markers because of spatial "dropout" events. Inferring the spatial modularity pattern of cell types by integrating scRNA-seq data and ST data therefore remains a challenge. To this end, this thesis proposes a partial least squares-based 
approach (SpaMOD) that simultaneously integrates single-cell transcriptome and spatial transcriptome data, together with a cell similarity network and a spot neighbor network, to identify cell-spot comodules for deciphering the spatial modularity patterns of tissues. Here, instead of resolving the composition of individual spots, we explore spatial patterns by integrating the two data modalities to find cell-spot comodules based on the correlations between cells and spots across a set of common genes. SpaMOD not only identifies the spatial location of different cell types in a tissue region but also tags the associated genes. The genes that SpaMOD selects when identifying comodules are module-specific, and their enrichment in biological functions characterizes the biological processes occurring in the microenvironment of each comodule. In this thesis, we explored spatial modularity patterns on four real scRNA-seq and ST datasets from mouse brain, human granuloma, and two pancreatic ductal adenocarcinoma tissues, and constructed twenty simulated datasets to validate the benefit of incorporating the prior-information networks and to compare spatial localization ability. The experimental results show that SpaMOD is a powerful tool for discovering biologically significant cell-spot comodules, and the identified comodules provide detailed biological insights into the spatial relationships between cell populations and their locations in tissues. |
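
To make the idea of a cell-spot comodule concrete, the following is a minimal sketch of plain partial least squares via singular value decomposition, not the SpaMOD algorithm itself. It assumes an scRNA-seq matrix `X` (genes × cells) and an ST matrix `Y` (genes × spots) whose rows are the same common genes in the same order. The function name `pls_comodule` and the parameters `top_cells` and `top_spots` are hypothetical, and the cell similarity network, the spot neighbor network, and any sparsity constraints described above are deliberately left out.

```python
import numpy as np

def pls_comodule(X, Y, top_cells=5, top_spots=5):
    """Illustrative PLS-SVD comodule (hypothetical helper, not SpaMOD).

    X: genes x cells expression matrix (scRNA-seq).
    Y: genes x spots expression matrix (ST), same gene order as X.
    Returns indices of selected cells and spots plus the weight vectors.
    """
    n_genes = X.shape[0]
    # Standardize each cell and spot profile across the common genes.
    Xc = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    Yc = (Y - Y.mean(axis=0)) / (Y.std(axis=0) + 1e-12)
    # Cell-by-spot correlation matrix computed over the common genes.
    C = Xc.T @ Yc / n_genes
    # The leading singular vectors are the first pair of PLS weight vectors.
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    # SVD signs are arbitrary: flip both so the dominant cell weight is positive.
    if u[np.argmax(np.abs(u))] < 0:
        u, v = -u, -v
    # Assumption: a comodule is the set of cells and spots with the largest weights.
    cells = np.argsort(-u)[:top_cells]
    spots = np.argsort(-v)[:top_spots]
    return cells, spots, u, v

# Toy data (simulated, for illustration only): 5 cells and 6 spots share one gene program.
rng = np.random.default_rng(0)
program = rng.normal(size=200)
X = rng.normal(size=(200, 30))
Y = rng.normal(size=(200, 40))
X[:, :5] += 3 * program[:, None]
Y[:, :6] += 3 * program[:, None]

cells, spots, u, v = pls_comodule(X, Y, top_cells=5, top_spots=6)
print("cells in comodule:", sorted(cells.tolist()))
print("spots in comodule:", sorted(spots.tolist()))
```

In this sketch the genes act as the shared samples, so cells and spots that follow the same expression program across genes receive large weights in the same component. A network-regularized, sparse variant would replace the fixed top-k selection with penalties on `u` and `v`.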