The Role Of Tumor Purity In Differential Gene Expression And Tumor Subtype Clustering

Posted on:2018-05-20

Degree:Doctor

Type:Dissertation

Country:China

Candidate:W W Zhang

Full Text:PDF

GTID:1314330515976944

Subject:Basic mathematics

Abstract/Summary:

Differentially expressed gene analysis and tumor sample classification greatly benefit therapeutic development and facilitate the application of precision medicine to patients. However, solid tumor tissue obtained in clinical settings contains non-cancerous cells present in and around the tumor, including normal tissue, infiltrating immune cells, stromal cells and blood vessels. This admixture of normal cells may adversely affect both differentially expressed gene analysis and tumor sample classification. Appropriate statistical models that account for tumor purity are therefore urgently needed for both tasks. In this thesis, we performed systematic research on these two problems in computational biology.

First, we studied the effect of tumor purity on the analysis of differentially expressed genes. Simulation studies demonstrate that tumor purity has a multiplicative, rather than additive, effect on differential expression. Consequently, both ignoring tumor purity and including it as an additive covariate give biased results. To solve this problem, we designed a method, based on a generalized least squares procedure and a Wald test, that tests the difference between normal and tumor samples for each gene. Analyses of TCGA data demonstrate that, compared with the t-test and limma, the proposed method gives improved results in terms of the number of differentially expressed genes, the consistency of the test statistics across cancer types, and the functional relevance to each cancer type.

Second, we systematically investigated the impact of tumor purity as a confounding factor in unsupervised clustering of tumor samples, and proposed a statistical model that adjusts for the purity effect in tumor sample clustering. We first found that, under the traditional k-means and NMF approaches, tumor purity biases the clustering results: samples with similar purities are likely to cluster together, and tumor samples with low purity tend to be misclassified. To overcome this problem, we designed a model-based statistical method, InfiniumClust, for subtype classification based on DNA methylation data. In our method, methylation levels from tumor samples at each CpG site are modeled as a mixture of normal distributions. Parameter estimation and sample clustering are performed through an EM-type algorithm. In simulations, InfiniumClust achieved more robust and accurate results than k-means. When applied to real TCGA tumor samples, InfiniumClust produced the least biased clusters compared with k-means and the well-established NMF method.
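As a rough illustration of why a multiplicative purity effect biases a naive tumor-versus-normal comparison, the following sketch simulates one gene and tests it with a purity-weighted Wald statistic. It is not the method of the thesis: the mixing model (tumor sample mean equal to the normal mean plus purity times the pure-tumor difference), the ordinary no-intercept least squares fit used in place of generalized least squares, and the function name `purity_wald_test` are all assumptions made for this example.

```python
import numpy as np


def purity_wald_test(normal, tumor, purity):
    """Estimate the pure-tumor vs normal difference for one gene.

    Assumed model (illustrative only, not taken from the thesis):
        tumor_i = mu_N + purity_i * delta + noise
    Returns (delta_hat, wald_z).
    """
    normal = np.asarray(normal, dtype=float)
    tumor = np.asarray(tumor, dtype=float)
    purity = np.asarray(purity, dtype=float)

    mu_n = normal.mean()
    centered = tumor - mu_n          # tumor values relative to normal mean
    sxx = np.sum(purity ** 2)

    # No-intercept least squares of (tumor - mu_N) on purity.
    delta_hat = np.sum(purity * centered) / sxx

    # Residual variance of tumor samples around the fitted line.
    resid = centered - purity * delta_hat
    sigma2_t = np.sum(resid ** 2) / (len(tumor) - 1)

    # Variance of delta_hat: regression noise plus uncertainty in mu_N.
    sigma2_n = normal.var(ddof=1)
    var_delta = (sigma2_t / sxx
                 + (sigma2_n / len(normal)) * (np.sum(purity) / sxx) ** 2)
    return delta_hat, delta_hat / np.sqrt(var_delta)


# Simulated example: true pure-tumor difference is 2.0.
rng = np.random.default_rng(0)
n = 200
true_delta = 2.0
purity = rng.uniform(0.3, 0.9, size=n)
normal = 5.0 + rng.normal(0.0, 0.5, size=n)
tumor = 5.0 + purity * true_delta + rng.normal(0.0, 0.5, size=n)

naive_diff = tumor.mean() - normal.mean()   # ignores purity
delta_hat, wald_z = purity_wald_test(normal, tumor, purity)
print(f"naive difference: {naive_diff:.2f}")
print(f"purity-adjusted:  {delta_hat:.2f} (Wald z = {wald_z:.1f})")
```

Under this assumed model the naive difference of means is shrunk toward zero by roughly the average purity, while the purity-weighted estimate targets the difference between pure tumor and normal tissue.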

Keywords/Search Tags:

differentially expressed gene, generalized linear model, DNA methylation, EM algorithm, tumor purity, cancer subtype

Related items

1	Study On Differential Methylation Analysis Methods Of Tumor Samples With Uneven Tumor Purity
2	Tumor Purity Estimation And Differential Methylation Analysis Based On DNA Methylation Data
3	The Application Of The R Language Package InfiniumPurify In Tumor Purity Estimation And Differential Methylation Analysis
4	The Research Of The Differentially Expressed Genes In Disease Based On The Granular Computing
5	Research On Tumor Purity Estimation Method By Considering Intra-tumor Heterogeneity
6	A Method For Estimating Tumor Purity Using DNA Methylation Chip Data And Consensus Information Differentiation Sites
7	Differential Methylation Analysis And Prediction Of Anticancer Drug Susceptibility
8	Generalized Latent Variable Models With Non-linear Effects
9	Suppressive Roles And Mechanism For A Differentially Expressed Gene In Human Cervical Carcinoma
10	Screening And Identification Of The Differentially Expressed Genes In Wilms’ Tumor