| Traditional bulk RNA sequencing profiles populations of cells and therefore reports gene expression averaged across the population, which tends to conceal critical information. In fact, cells are heterogeneous: gene expression levels vary among cells even within the same cell population. Single-cell RNA sequencing (scRNA-seq) extracts transcripts from individual cells for sequencing. Cell types can be identified and cell heterogeneity can be studied based on the similarities and differences among single-cell transcriptomic profiles. Owing to its high precision and sensitivity, scRNA-seq has been widely used to study complex tissues and organs. Cell atlas construction, which characterizes the complex cellular composition of tissues and organs, is an important aspect of single-cell research. A cell atlas is usually derived from a single scRNA-seq dataset. However, it is difficult to capture all cells and genes in a single experiment, so a single dataset cannot provide a panoramic view of the cells. Data integration makes it possible to gather more comprehensive information about cells and to build a more complete cell atlas. With the rapid development of scRNA-seq technology, a considerable number of datasets and cell atlases have accumulated. However, few studies integrate multiple datasets to construct an integrated cell atlas. This paper takes human liver tissue as its research object. Integrated cell atlases of normal human liver tissue and human hepatocellular carcinoma (HCC) tissue are constructed based on data integration. Seu Liver Atlas, a database of the human liver transcriptome, is constructed for query and reference. The main research results of this paper are as follows: (1) Liver i Atlas, an integrated cell atlas of normal human liver tissue, is constructed. A unified pipeline for scRNA-seq data processing and a practical pipeline for integrated cell atlas construction are established. Six scRNA-seq
datasets from normal human liver tissues are collected and processed. The integration performance of 4 integration methods is evaluated, and the best-performing one is selected to construct Liver i Atlas, whose reliability is then verified. The results show that the Seurat data integration method is suitable for this study; 10 major cell types and 24 fine cell subtypes are annotated in Liver i Atlas; and in single-cell type identification, using Liver i Atlas as the reference yields higher accuracy than using a single dataset as the reference or using automatic annotation tools, which verifies the reliability of Liver i Atlas. (2) HCC i Atlas, an integrated cell atlas of human HCC tissue, is constructed. Malignant cells are first distinguished by copy number variation analysis. Five scRNA-seq datasets of nonmalignant cells from human HCC tissues are then processed according to the pipeline. The integration methods are evaluated, and the best-performing one is selected to construct HCC i Atlas, whose reliability is then verified. The results show that the Seurat data integration method is suitable for this study; 8 major cell types and 22 fine cell subtypes are annotated in HCC i Atlas; in single-cell type identification, using HCC i Atlas as the reference yields higher accuracy than using a single dataset as the reference or using automatic annotation tools; and in deconvolution of cell components in bulk (cell-population) samples, HCC i Atlas is also shown to be reliable and valuable. (3) Seu Liver Atlas, a database of the human liver transcriptome, is constructed. Liver i Atlas from 6 scRNA-seq datasets and HCC i Atlas from 3 scRNA-seq datasets are included in Seu Liver Atlas. The former contains 27 samples, 52,435 cells, and 37,982 genes, while the latter contains 47 samples, 83,006 cells, and 29,487 genes. Three data tables are designed: 2 tables of cell type-specific expressed genes corresponding to Liver i Atlas and HCC i Atlas (3,916 entries for the former and 2,514 entries for the latter), as well as 1 human liver single-cell transcriptome data
information table (26 entries). Seu Liver Atlas has four pages: Home, Atlas, Dataset, and Help. It displays dynamic UMAP plots and cellular composition plots, information on canonical marker genes and cell type-specific expressed genes, and dataset and literature information focusing on human liver single-cell transcriptome studies. Seu Liver Atlas provides a variety of functions, including interactive browsing of atlas information, download of cell atlas data, instant gene-specific and cell type-specific search, screening and download of tissue-specific and cell type-specific gene lists, and instant search and linking of datasets and literature. Seu Liver Atlas has application value in liver single-cell transcriptome research, including data integration analysis, cell type annotation, gene function enrichment, deconvolution of cell components in bulk tissue, and identification of single-cell types. |