
Calculation Method of Tumor Heterogeneity Based on Single-Cell Point Mutation Data

Posted on: 2024-08-28
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Yan
GTID: 2544306926974709
Subject: Computer technology
Abstract/Summary:
Cancer is one of the most serious diseases affecting human health, and its occurrence and development have long been a focus of cancer research. Studies have shown that tumor heterogeneity is an important factor affecting tumor prevention, treatment, and prognosis, so deciphering tumor heterogeneity has become an important task in cancer research. With its rapid development and adoption, single-cell sequencing has become an important means of detecting tumor heterogeneity. However, the technology has high technical noise, which can introduce cell doublets, false positive errors, false negative errors, and missing data into single-cell DNA sequencing data and make analysis more difficult. In addition, as single-cell DNA sequencing datasets grow in scale, their analysis also faces challenges in computational efficiency. Clustering of single-cell DNA sequencing data has recently emerged as an effective way to address these problems. This thesis proposes two methods for calculating tumor heterogeneity from single-cell point mutation data and develops a tool for calculating tumor heterogeneity. The main research contents and results are summarized as follows:

1. CBM, a clustering method for single-cell point mutation data based on parametric modeling, is proposed. CBM formulates the binary mutation data under a probabilistic framework by parameterizing the false positive error rate, the false negative error rate, the presence probability distribution of the subclones, and the subclones' binary mutation profiles. To overcome the difficulty of optimizing discrete parameters, Gibbs sampling for mixtures is employed to iteratively sample cell-to-cluster assignments and cluster centers from the posterior. The method was evaluated on both simulated and real datasets, and the experimental results show that it provides higher computational accuracy even on high-dimensional, sparse mutation data.

2. bmVAE, a clustering method for single-cell point mutation data based on a variational autoencoder, is presented. bmVAE first uses a variational autoencoder to reduce the dimensionality of the original point mutation data of tumor cells, then clusters the low-dimensional data with a Gaussian mixture model, and finally uses Gibbs sampling to estimate the genotype of each subclone. The method was comprehensively evaluated on simulated datasets and further validated on two real datasets. The results show that bmVAE is very effective in inferring intra-tumor heterogeneity and offers some advantages over existing state-of-the-art methods.

3. A tool was designed to identify tumor subclones from single-cell point mutation data and to analyze tumor heterogeneity. The tool provides a variety of clustering methods, such as RobustClone and BnpC, as well as the CBM and bmVAE methods presented in this thesis. It takes binary single-cell point mutation data as input and provides users with information such as subclone labels, subclone genotypes, or denoised single-cell point mutation data.
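The Gibbs sampling idea described for CBM can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names (`log_lik`, `sample_index`, `gibbs_cluster`) are hypothetical, and the sketch makes simplifying assumptions that the abstract does not state, namely that the false positive rate `alpha` and false negative rate `beta` are fixed rather than inferred, that the prior over clusters and over each center bit is uniform, and that missing entries are encoded as `None` and ignored. It only alternates between sampling cluster centers and sampling cell-to-cluster assignments, and returns the last sample instead of a posterior summary.

```python
import math
import random


def log_lik(obs, geno, alpha, beta):
    """Log P(observed bit | true genotype bit); a missing entry (None) contributes 0."""
    if obs is None:
        return 0.0
    if geno == 1:
        p = 1.0 - beta if obs == 1 else beta    # beta: false negative rate (assumed fixed)
    else:
        p = alpha if obs == 1 else 1.0 - alpha  # alpha: false positive rate (assumed fixed)
    return math.log(p)


def sample_index(log_weights, rng):
    """Draw an index with probability proportional to exp(log_weights)."""
    top = max(log_weights)
    weights = [math.exp(w - top) for w in log_weights]
    return rng.choices(range(len(weights)), weights=weights)[0]


def gibbs_cluster(data, k, alpha=0.01, beta=0.2, n_iter=200, seed=0):
    """Cluster a cells-by-sites binary matrix; returns (labels, centers) from the last sweep."""
    rng = random.Random(seed)
    n, m = len(data), len(data[0])
    labels = [rng.randrange(k) for _ in range(n)]
    centers = [[rng.randint(0, 1) for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # Sample every bit of every cluster center given the cells currently assigned to it.
        for c in range(k):
            members = [data[i] for i in range(n) if labels[i] == c]
            for j in range(m):
                lw = [sum(log_lik(cell[j], g, alpha, beta) for cell in members)
                      for g in (0, 1)]
                centers[c][j] = sample_index(lw, rng)
        # Sample every cell's cluster label given the centers (uniform prior over clusters).
        for i in range(n):
            lw = [sum(log_lik(data[i][j], centers[c][j], alpha, beta) for j in range(m))
                  for c in range(k)]
            labels[i] = sample_index(lw, rng)
    return labels, centers
```

Calling `gibbs_cluster(matrix, k=2)` on a list of per-cell 0/1 rows returns one label per cell and one binary profile per cluster. A fuller treatment would also sample the error rates and the subclone presence probabilities, and would summarize many posterior samples rather than keep only the final one.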
Keywords/Search Tags: Cancer, Tumor heterogeneity, Single-cell sequencing techniques, Variational autoencoder, Bioinformatics