
Deconvolution Of Heterogeneous Tumor Samples Using Partial Reference Signals

Posted on: 2021-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: S W Nan
Full Text: PDF
GTID: 2404330626454827
Subject: Basic mathematics
Abstract/Summary:
Studying tumor heterogeneity helps to explain the molecular mechanisms of tumor metastasis and can guide the diagnosis and treatment of tumor patients. Deconvolving heterogeneous bulk tumor samples into distinct cell populations is therefore an important problem. In real clinical practice, however, only partial references are available. The typical approach to this problem is to regress the mixed signals on the available references and treat the remaining signal as a new cell component. As our simulation study indicates, this approach tends to over-estimate the proportions of known cell types and fails to detect novel cell types. In this paper, we developed PREDE, a partial reference deconvolution method based on iterative non-negative matrix factorization. On simulated datasets covering a variety of parameter settings, the method is shown to be effective in estimating both the cell proportions and the expression profiles of unknown cell types. Applying the method to TCGA tumor samples, we found that the proportions of cancer cells, rather than those of infiltrating immune cells, better separate the subtypes of tumor samples. In addition, we deconvolved tumor samples from three cancers (breast cancer, skin cancer, and bladder cancer) and analyzed patient survival using the estimated cell-type proportions. Overall, our method generalizes existing approaches to the deconvolution of heterogeneous tumor samples and could be widely applied to many kinds of bulk high-throughput data.

Chapter 1 summarizes the background and significance of deconvolving heterogeneous tumor samples, reviews the history and development of deconvolution algorithms in detail, and compares the applications, advantages, and disadvantages of these methods. Chapter 2 introduces in detail the principle and solution of the classical non-negative matrix factorization (NMF) algorithm. The PREDE model is then obtained from the NMF model by modifying its mathematical form so that a subset of the columns of the basis matrix W is fixed. In Chapter 3, processed CCLE data are input into the PREDE model, and the model is analyzed from four aspects: the selection of the number of cell types, the comparison between PREDE and existing methods under varying model parameters, the addition of immune cells to the mixtures, and a simulation with mixed RNA-seq data from three kinds of rat tissues. In Chapter 4, the PREDE model is applied to real TCGA data. Chapter 5 summarizes the paper, lists the advantages and disadvantages of the PREDE model, and gives directions for future research.
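
To make the idea of fixing part of the columns of W concrete, the following is a minimal sketch, not the PREDE implementation itself. It assumes a Frobenius-norm objective and standard multiplicative NMF updates, neither of which is specified in this abstract. The function name `partial_reference_nmf`, its parameters, and the column normalization used to turn H into proportions are all hypothetical choices made for illustration.

```python
import numpy as np

def partial_reference_nmf(V, W_known, n_unknown, n_iter=500, eps=1e-10, seed=0):
    """Sketch of NMF with some basis columns held fixed (illustrative only).

    V        : (genes x samples) non-negative mixed expression matrix
    W_known  : (genes x k_known) non-negative reference profiles, kept fixed
    n_unknown: number of additional cell types whose profiles are estimated
    Returns (W, H, proportions), where V is approximated by W @ H.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    k_known = W_known.shape[1]

    # Known profiles fill the first columns; unknown ones start at random.
    W_free = rng.random((n_genes, n_unknown)) * V.mean() + eps
    W = np.hstack([np.asarray(W_known, dtype=float), W_free])
    H = rng.random((k_known + n_unknown, n_samples)) + eps

    for _ in range(n_iter):
        # Multiplicative update for all rows of H (keeps H non-negative).
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Multiplicative update only for the unknown columns of W;
        # the reference columns W[:, :k_known] are never modified.
        numer = V @ H.T
        denom = W @ (H @ H.T) + eps
        W[:, k_known:] *= numer[:, k_known:] / denom[:, k_known:]

    # Assumption: rescale each sample's column of H to sum to 1 so it can
    # be read as proportions. This presumes comparable column scales in W.
    proportions = H / (H.sum(axis=0, keepdims=True) + eps)
    return W, H, proportions


if __name__ == "__main__":
    # Synthetic example: 3 cell types, only the first 2 have references.
    rng = np.random.default_rng(1)
    W_true = rng.random((50, 3))
    H_true = rng.dirichlet(np.ones(3), size=20).T  # columns sum to 1
    V = W_true @ H_true
    W, H, P = partial_reference_nmf(V, W_true[:, :2], n_unknown=1)
    print("relative error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```

Because the reference columns are excluded from the update, the factorization can only explain signal beyond the references through the free columns, which is the mechanism the abstract describes; how PREDE iterates, initializes, and selects the number of cell types is covered in the thesis chapters rather than here.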
Keywords/Search Tags: Heterogeneity, Partial reference signals, Non-negative Matrix Factorization, PREDE