
Analysis Of Gene Expression Profiles Of Esophageal Carcinoma Based On

Posted on: 2016-05-08 | Degree: Master | Type: Thesis
Country: China | Candidate: H X Gu | Full Text: PDF
GTID: 2208330467999688 | Subject: Software engineering
Abstract/Summary:
Since the significant progress in whole-genome sequencing of cancer in 2008, more and more researchers have turned to cancer gene data analysis to reveal the pathogenic mechanisms of cancer from the perspective of cancer genes. As high-throughput biological technology has matured, gene chips have been developed that store gene expression information from different samples under various conditions. The data obtained through a chip scanning system, expressed in the form of a matrix, are called a gene expression profile. Faced with the huge and exponentially growing volume of gene expression profile data, however, traditional data analysis techniques have difficulty meeting research needs in terms of both time and accuracy. Because of the high incidence of esophageal squamous cell carcinoma in China, and because few systematic, comprehensive data mining analyses of esophageal cancer gene expression data exist at present, the esophageal cancer gene expression data sets GSE23400 and GSE20347, which are freely downloadable from the public gene expression database GEO, are chosen as the research objects of this paper. Clustering algorithms are often chosen as the data mining tool for gene expression profiles. However, microarray expression data are typically high-dimensional, small-sample, and nonlinear, which reduces the efficiency and accuracy of clustering. Data preprocessing is therefore necessary to bring the data into a standard format, followed by feature selection to reduce the dimensionality and the number of genes and to screen for differentially expressed genes. An extensive literature review was carried out in this paper according to the characteristics of esophageal cancer gene expression profiles.
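The abstract does not say which feature selection method was used to screen differentially expressed genes. As an illustration only, the sketch below assumes a simple Welch t-statistic filter between tumor and normal samples; the function names, the toy gene names, and the threshold are hypothetical and are not taken from the thesis.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(tumor, normal):
    """Welch's t-statistic for one gene across two sample groups."""
    se = sqrt(variance(tumor) / len(tumor) + variance(normal) / len(normal))
    if se == 0:
        return 0.0  # no variation in either group: treat as not differential
    return (mean(tumor) - mean(normal)) / se


def screen_genes(expr, tumor_cols, normal_cols, threshold):
    """Return the genes whose |t| is at least the threshold.

    expr maps a gene name to its list of expression values, one per sample.
    """
    selected = []
    for gene, values in expr.items():
        t = welch_t([values[i] for i in tumor_cols],
                    [values[i] for i in normal_cols])
        if abs(t) >= threshold:
            selected.append(gene)
    return selected


# Toy data (hypothetical): columns 0-2 are tumor samples, 3-5 are normal.
toy = {
    "GENE_A": [8.1, 8.3, 7.9, 5.0, 5.2, 4.9],  # clearly shifted in tumor
    "GENE_B": [6.0, 6.2, 5.9, 6.1, 6.0, 6.1],  # roughly unchanged
}
selected = screen_genes(toy, [0, 1, 2], [3, 4, 5], threshold=4.0)
print(selected)
```

In practice a filter like this would normally be paired with a multiple-testing correction, since thousands of genes are tested at once.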
Based on a study of bioinformatics theory and technology, this paper summarizes feature selection methods and clustering algorithms, especially bi-clustering algorithms, which have become relatively popular in recent years. The significantly differentially expressed genes are obtained by feature selection techniques. After clustering analysis, the classic bi-clustering algorithm known as the Cheng-Church algorithm is applied to esophageal cancer gene expression profiles for the first time. Finally, enrichment analysis is applied to each clustering result to make the data more accurate. The results of the enrichment analysis are presented in the form of charts, and the functions of some genes are also predicted. The enrichment analysis indicates that bi-clustering of esophageal cancer gene expression profiles is superior to traditional clustering analysis. The study in this paper is a comprehensive and systematic data analysis of esophageal cancer gene expression profiles. The results could not only provide new ideas and techniques for basic research on esophageal cancer, but also provide more valuable data for clinicians in its diagnosis and treatment.
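The Cheng-Church algorithm scores a candidate bicluster (a subset of genes and a subset of samples) by its mean squared residue, and searches for large submatrices whose score is low. The sketch below computes only that score on a toy matrix; it is not the full node-deletion and node-addition search, and the function name and data are illustrative assumptions rather than the thesis's implementation.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue H(I, J) of the submatrix given by rows and cols.

    Each residue is a_ij - (row mean) - (column mean) + (overall mean),
    so a perfectly additive bicluster scores 0.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy expression matrix (hypothetical): rows are genes, columns are samples.
# Rows 0-2 over columns 0-2 follow a purely additive pattern.
data = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 0.5],
    [4.0, 5.0, 6.0, 7.0],
    [9.0, 1.0, 5.0, 2.0],
]
coherent = mean_squared_residue(data, [0, 1, 2], [0, 1, 2])
whole = mean_squared_residue(data, [0, 1, 2, 3], [0, 1, 2, 3])
print(coherent, whole)
```

The coherent 3x3 block scores essentially zero while the full matrix scores higher, which is the property the search exploits when it deletes rows and columns that contribute most to the residue.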
Keywords/Search Tags: Bi-clustering algorithm, Feature selection, Enrichment analysis, Gene expression profiles, Esophageal cancer