
Mining Database About T2DM, Obesity And Cancer Basing On GEO Dataset

Posted on: 2015-07-29
Degree: Master
Type: Thesis
Country: China
Candidate: J H Chen
Full Text: PDF
GTID: 2180330422482438
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
The Gene Expression Omnibus (GEO) database is part of the National Center for Biotechnology Information (NCBI). As one of the world's largest gene chip databases, GEO covers a variety of high-throughput data, mainly expression profiles obtained from gene microarrays. The amount of data in GEO has grown exponentially. Differential expression analysis, analysis of molecular signals and associations, and analysis of gene regulatory networks were used to explore the GEO database. This research focused mainly on mining data about type 2 diabetes (T2DM), obesity and cancer.

T2DM is characterized by insulin resistance (IR) and deficient β-cell function. Obesity is mainly caused by an imbalance between diet and activity. Genome-wide association studies (GWAS) and meta-analyses have identified 54 single-nucleotide polymorphisms (SNPs) associated with susceptibility to T2DM and roughly 100 or more associated with susceptibility to obesity. Analysis of expression quantitative trait loci (eQTL) has become a major trend in the study of T2DM and obesity, and gene expression data have gradually been applied to both conditions, moving from single-tissue to multi-tissue research. This study identified candidate genes for T2DM and obesity from gene expression profiles in multiple tissues or cells. Candidate genes near susceptibility loci were also selected. In total, 23 candidate genes showing a high percentage of differential expression were selected: 14 from the whole-genome analysis and 9 near susceptibility loci. NCKAP5L and SP1 were significantly differentially expressed in many tissues and are located within 1 Mb of a susceptibility SNP. Thus, NCKAP5L and SP1 may be novel candidate genes for T2DM and obesity. Most of the remaining genes have already been reported to be associated with T2DM and obesity. The novel candidate genes require further experimental evidence.

The metabolism of cancer is quite complicated, and high-throughput gene chips are widely used in cancer research. Gene mutation is highly heterogeneous across different stages and subtypes of cancer. To explore the mechanisms of gene mutation in cancer, this research hypothesized that actively expressed genes are more prone to mutation. Cluster and correlation analyses were performed on gene mutation data from different cancers and on expression profiles from multiple tissues. In addition, GC content was calculated near the mutated sites of genes with a high mutation rate. Overall, the study found no evident correlation. Only in ovarian cancer did genes with a mutation frequency above 5% show a higher correlation with expression level, with correlation coefficients of 0.87 to 0.97. However, more samples are needed for further study.

The study introduced various methods for exploring databases, chiefly the GEO database. Based on GEO, this research examined T2DM, obesity and cancer specifically, and provides a reference and methods for further database mining.
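
Three of the computations named in the abstract are simple enough to sketch: selecting genes within 1 Mb of a susceptibility SNP, computing GC content around a mutated site, and correlating mutation frequency with expression level. The abstract does not say which tools or exact definitions the thesis used, so the following Python is only a minimal illustration under assumptions. The function names (`genes_near_snps`, `gc_content`, `pearson`), the tuple layouts, and all coordinates in the usage lines are hypothetical and are not taken from the thesis.

```python
from math import sqrt

WINDOW_BP = 1_000_000  # the 1 Mb window mentioned in the abstract


def genes_near_snps(genes, snps, window=WINDOW_BP):
    """Return names of genes lying within `window` bp of any SNP.

    genes: iterable of (name, chromosome, start, end)   # assumed layout
    snps:  iterable of (chromosome, position)           # assumed layout
    """
    hits = []
    for name, chrom, start, end in genes:
        for snp_chrom, pos in snps:
            if chrom != snp_chrom:
                continue
            # Distance is 0 when the SNP falls inside the gene body.
            distance = max(start - pos, pos - end, 0)
            if distance <= window:
                hits.append(name)
                break  # one nearby SNP is enough
    return hits


def gc_content(sequence):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty string)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Toy usage with made-up coordinates (not real gene or SNP positions).
toy_genes = [
    ("geneA", "chr1", 1_500_000, 1_520_000),
    ("geneB", "chr1", 5_000_000, 5_010_000),
    ("geneC", "chr2", 900_000, 950_000),
]
toy_snps = [("chr1", 1_000_000), ("chr2", 3_000_000)]
nearby = genes_near_snps(toy_genes, toy_snps)
```

In this toy input only `geneA` lies within the window. A real analysis would need genome annotation for gene coordinates and a stated choice of correlation measure, neither of which the abstract specifies.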
Keywords/Search Tags: GEO database, type 2 diabetes, obesity, cancer, differential gene expression, SNP (single-nucleotide polymorphism)