Background
Ovarian cancer is one of the most common gynecologic malignancies and the leading cause of cancer death among tumors of the female reproductive system. An estimated 21,550 cases of invasive ovarian cancer were diagnosed and 14,660 deaths were attributed to the disease in the United States in 2009, and more than 114,000 women worldwide die of it every year. It seriously harms women's health. Because early symptoms have low specificity and no efficient method for early diagnosis exists, most ovarian cancers are diagnosed at an advanced stage. While early-stage ovarian cancer is highly curable, the 5-year survival rate at the advanced stage is only about 20% to 30%.

It is well known that different individuals with the same tumor present different symptoms, and that patients with the same symptoms may need different diagnostic and therapeutic regimens. Although pathologic and clinical staging systems exist for determining prognosis and treatment options, molecular-level differences between individuals lead to different outcomes in patients who have the same pathologic stage and receive the same treatment. The fundamental reason is that the molecular mechanisms behind the occurrence, development and metastasis of a specific cancer are not yet clearly understood by scientists and clinicians. One approach to this problem is to apply bioinformatics methods to large numbers of clinical samples in order to investigate the molecular-level alterations of the specific cancer.

Chromosome aberrations, which include structural variations and copy number variations, are widely observed in tumorigenesis. Copy number variations are particularly common in solid tumors, such as tumors of the breast, prostate, ovary, lung, and head and neck. Several studies have reported that gene copy number variations play an important role in cancer development and progression.
In different tumors, the number, size and magnitude of copy number variations vary extensively, which likely reflects the different ways in which individual tumors escape the normal protective cellular environment. Investigating copy number variations can help clarify the role they play in ovarian cancer. Array comparative genomic hybridization can scan the ovarian cancer genome and identify aberrant regions; further study of the genes located in these regions can then reveal their role in the pathogenesis of ovarian cancer.

In tumor cells, gene amplification is an important mechanism for increasing the expression of cellular oncogenes. Similarly, gene deletion and down-regulated expression may switch off critical tumor suppressor genes.

To better elucidate ovarian cancer etiology and identify prognostic gene sets, many studies have performed microarray analysis of gene expression profiles. While the identified gene sets do show significant associations with survival in their respective datasets, very few genes are common to all the studies. The discrepancy in the results of gene expression analysis may result from differing experimental protocols, different statistical approaches, or heterogeneous cohort characteristics. One possible strategy for increasing the consistency of these findings is to analyze gene expression in conjunction with DNA-level changes such as copy number.

The impact of copy number changes on gene expression varies with the type of change. For example, recent studies have shown that up to 40%-50% of highly amplified genes are also up-regulated in tumors. The impact of deletion on down-regulation is less clear, partly because small deletions and the resulting down-regulation are technically more challenging to identify. In general, it has been demonstrated that 10%-15% of all gene expression changes are directly associated with gene copy number changes.
The identification of genes that are either amplified and up-regulated or deleted and down-regulated may reveal variations critical to tumor pathogenesis.

Objective
In this study, we aimed to identify regions of copy number variation and the genes they contain using TCGA copy number data, and then to select the differentially expressed genes from these regions. We further explored the change in gene expression between samples with and without copy number amplification. For the genes showing both copy number amplification and up-regulated expression, we studied the association between copy number variation and differential gene expression.

Method
To analyze copy number variations in ovarian cancer, we applied Circular Binary Segmentation (CBS) to samples downloaded from the TCGA database and obtained the copy number variations of each sample. After CBS, we used the statistical approach GISTIC to identify the regions of significant copy number variation and the genes contained in those regions.

Expression data for the genes with copy number amplification were extracted for both the ovarian cancer samples and the normal samples. Significance Analysis of Microarrays (SAM) was used to analyze differential expression between the two types of samples, and the up-regulated genes were selected for further analysis.

The copy number level of each amplified and up-regulated gene in the ovarian cancer samples was extracted from the GISTIC output, and the samples were divided into two groups: samples with copy number amplification and samples without. SAM was then used to identify the genes differentially expressed between the two groups.
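As a rough illustration of the sample split described above, the following Python sketch divides the samples for one gene into amplified and non-amplified groups and expresses the expression change of each amplified sample as a z-score relative to the non-amplified group. The amplification threshold, the z-score definition, the function names and the toy values are all assumptions made for illustration; they are not taken from the study, which performed its analysis in R.

```python
from statistics import mean, stdev


def split_by_amplification(copy_number, threshold=1):
    """Split sample IDs for one gene into (amplified, not_amplified).

    copy_number maps sample ID -> discrete copy number level.
    The threshold of 1 is a hypothetical cut-off, not the study's value.
    """
    amplified = [s for s, level in copy_number.items() if level >= threshold]
    not_amplified = [s for s, level in copy_number.items() if level < threshold]
    return amplified, not_amplified


def expression_z_scores(expression, amplified, not_amplified):
    """z-score of each amplified sample against the non-amplified group.

    This definition (reference mean and sample standard deviation taken
    from the non-amplified samples) is an assumption.
    """
    reference = [expression[s] for s in not_amplified]
    mu, sigma = mean(reference), stdev(reference)
    if sigma == 0:
        raise ValueError("reference group has zero variance")
    return {s: (expression[s] - mu) / sigma for s in amplified}


# Hypothetical toy data for a single gene.
copy_number = {"S1": 2, "S2": 0, "S3": 1, "S4": 0, "S5": 0, "S6": -1}
expression = {"S1": 9.0, "S2": 5.0, "S3": 8.0, "S4": 6.0, "S5": 7.0, "S6": 6.0}

amplified, not_amplified = split_by_amplification(copy_number)
z = expression_z_scores(expression, amplified, not_amplified)
print(amplified, {s: round(v, 2) for s, v in z.items()})
```

A large positive z-score for the amplified samples would be consistent with amplification driving up-regulation of that gene, although a formal test is still needed to assess significance.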
The z-scores of expression change and Fisher's exact test were then used to analyze the association between copy number amplification and differential expression of the genes. We further explored the functions of these significant genes through gene set enrichment analysis and pathway analysis, using GATHER and GSEA.

CBS and SAM were performed with the DNAcopy and samr packages in R. GISTIC was performed with the GISTIC module on the online platform GenePattern. Other statistical analyses were also conducted in R.

Results
Through CBS and GISTIC analysis, we found 48 regions with significant copy number amplification on 21 chromosomes and 54 regions with significant copy number deletion on 22 chromosomes. The amplified regions contain 174 candidate genes, several of which are proven oncogenes, including EVI, KRAS, CCNE1 and MYC. The deleted regions contain 2712 candidate genes.

Among the 174 amplified genes, SAM found 55 differentially expressed genes, consisting of 45 up-regulated and 10 down-regulated genes.

We examined the 45 genes whose copy number variation and differential expression had the same direction, and found that at least 40 of them were differentially expressed between samples with and without copy number amplification, which suggests an association between gene copy number variation and differential expression.

Bioinformatics analysis of the 40 genes showed that they participate in biological processes including cellular metabolism, cellular synthesis, the cell cycle and apoptosis. They also appear in the CCNE1 cell cycle, MAPK signaling and TGF-beta signaling pathways, and they overlap with genes reported in several tumor studies.

Conclusion
In this paper, we identified the common significant copy number variations in ovarian cancer, and the genes they contain, using CBS and GISTIC.
We also detected the genes differentially expressed between ovarian cancer and normal ovarian tissue, obtained the genes in which copy number variation is associated with differential expression, and analyzed these signature genes with bioinformatics methods.

As a paradigm for tumor research, the analysis strategy used in this study could be applied to other tumors. Future studies should focus on integrating more levels of genomic data to provide more knowledge of, and insight into, the molecular signature of ovarian cancer.
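The association test named in the Method section, Fisher's exact test, can be illustrated with a short standard-library Python sketch. It computes a one-sided p-value for a 2x2 table of samples cross-classified by amplification status and expression status. The layout of the table, the one-sided alternative, the function name and the counts are assumptions for illustration only; the study's actual contingency tables are not given here.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric distribution with the
    row and column totals held fixed, i.e. the probability of seeing at
    least this many samples in the top-left cell by chance.
    """
    row1 = a + b          # e.g. amplified samples
    col1 = a + c          # e.g. samples with high expression
    n = a + b + c + d     # all samples
    total = comb(n, col1)
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / total


# Hypothetical counts for one gene:
#                  high expression   not high
# amplified               8              2
# not amplified           1              5
p_value = fisher_exact_greater(8, 2, 1, 5)
print(f"one-sided p = {p_value:.4f}")
```

A small p-value indicates that high expression is over-represented among the amplified samples, which is the kind of association between amplification and up-regulation that the study sought.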