
Research On Single Cell Clustering Based On Graph Similarity Learning

Posted on: 2022-08-26
Degree: Master
Type: Thesis
Country: China
Candidate: W Jiang
GTID: 2480306569497514
Subject: Computer technology
Abstract/Summary:
With the development of sequencing technology, single-cell clustering is becoming increasingly important in many fields of biology. Single-cell sequencing can reveal the heterogeneity and functional diversity of unknown cell populations, which is very helpful for the study of cell development and tissue differentiation. At the same time, with the rapid development of biological network analysis, graph similarity learning has been widely used in biological data analysis. Graph similarity learning aims to measure the similarity between graphs, and cell similarity has always been a central concern in single-cell clustering research. In this dissertation, therefore, graph similarity learning is employed to calculate the similarity between cells. Research on graph similarity learning falls into two domains: graph similarity measurement, represented by graph kernels, and graph embedding, represented by graph convolutional neural networks. Accordingly, the application of graph similarity learning to single-cell clustering also falls into two domains: the optimization of the cell similarity measure, and the low-dimensional vector representation of cells.

For the optimization of the cell similarity measure, existing measures calculate cell similarity from sequencing data alone, and it is currently difficult to improve the clustering accuracy of such methods. This dissertation proposes a data integration method that converts the data representation of cells into graphs. First, the method analyzes the single-cell gene expression profile and extracts the co-expression network and the protein interaction network for network fusion. Then, based on the fused network and the single-cell gene expression profile, the cell graph data structure is extracted. Finally, a graph similarity measure is used to measure the similarity between cells, thereby optimizing the cell similarity measure. The experimental results show that the accuracy of the single-cell clustering method based on graph similarity measurement is significantly improved.

For the low-dimensional vector representation of cells, this dissertation proposes a single-cell clustering method based on a graph convolutional neural network. The high-dimensional structured data of cells are embedded into a low-dimensional vector space by the graph convolutional neural network, and a vector-based clustering method is then used to cluster the single cells. This method transfers the bag-of-features model from the image domain to the graph domain, and solves the problem that node aggregation methods cannot automatically learn the aggregation weights in unsupervised whole-graph embedding. Moreover, whole-graph embedding with a bag of features can fuse all the data distribution information in the sample space and obtain a better whole-graph representation. Experimental results show that the proposed method outperforms current graph embedding methods and effectively improves single-cell clustering accuracy.
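
The bag-of-features ("feature bag") idea in the second approach can be illustrated with a minimal sketch. This is not the dissertation's implementation: the abstract does not give the network architecture, the codebook size, or the clustering algorithm, so everything below is an assumption made for illustration. In particular, `propagate` is an untrained neighbor-averaging step standing in for a trained graph convolutional network, the codebook is built with plain k-means using a deterministic farthest-first seeding, and the toy gene network and expression values are invented. The sketch pools node vectors from all cell graphs, learns a shared codebook, and represents each cell graph as a normalized histogram over codewords.

```python
def sqdist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def propagate(features, neighbors):
    """One untrained, GCN-style step: average each node with its neighbors.
    (Assumption: a stand-in for the trained network in the thesis.)"""
    out = []
    for i, nbrs in enumerate(neighbors):
        group = [features[i]] + [features[j] for j in nbrs]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out

def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point farthest from the centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sqdist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids

def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda j: sqdist(point, centroids[j]))

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; returns the k centroids (the codebook)."""
    centroids = farthest_first(points, k)
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            buckets[nearest(p, centroids)].append(p)
        for j, bucket in enumerate(buckets):
            if bucket:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids

def bag_of_features(graphs, k):
    """graphs: one list of node vectors per cell graph.
    Returns one normalized, length-k codeword histogram per graph."""
    codebook = kmeans([v for g in graphs for v in g], k)
    embeddings = []
    for g in graphs:
        hist = [0] * k
        for v in g:
            hist[nearest(v, codebook)] += 1
        embeddings.append([count / len(g) for count in hist])
    return embeddings

# Hypothetical toy input: four cells over the same 3-gene path network
# (gene0 - gene1 - gene2); each gene carries a 1-dimensional expression value.
neighbors = [[1], [0, 2], [1]]
expression = [
    [[0.0], [0.2], [0.1]],  # cell 0, low expression
    [[0.1], [0.0], [0.2]],  # cell 1, low expression
    [[5.0], [5.2], [5.1]],  # cell 2, high expression
    [[5.1], [5.0], [5.2]],  # cell 3, high expression
]
node_embeddings = [propagate(x, neighbors) for x in expression]
cell_embeddings = bag_of_features(node_embeddings, k=2)
# Cells 0 and 1 map entirely to one codeword, cells 2 and 3 to the other.
```

Because the codebook is learned from the node vectors of all cells at once, each histogram reflects where a cell's nodes fall in the distribution of the whole sample, which is one reading of the abstract's claim about fusing distribution information. The resulting fixed-length vectors can be passed to any vector-based clustering method.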
Keywords/Search Tags: single-cell RNA sequencing data, unsupervised clustering, graph similarity, graph embedding, graph convolution network