
Database Construction And Copy Number Variation Analysis Of Cancer Predisposition Genes

Posted on: 2018-02-03
Degree: Master
Type: Thesis
Country: China
Candidate: R Wei
GTID: 2334330515979926
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
Gene mutations can be divided by cell type into somatic and germline mutations. Somatic mutations arise in somatic cells and are not passed to the next generation, whereas germline mutations are inherited from generation to generation. Genes in which inherited mutations confer a highly increased risk of developing cancer are called cancer predisposition genes (CPGs). Identifying CPGs and investigating their biological mechanisms supports the early prevention, diagnosis, and treatment of cancer, and also contributes to research on cancer etiology and pathogenesis and to the development of cancer-related drugs. CPG mutations act through multiple mechanisms in tumorigenesis. Most CPGs act as tumor suppressor genes, with mutations that abolish their function and thereby contribute to carcinogenesis; only a few predispose to cancer through gain-of-function mutations. Over the past few decades, the growing use of high-throughput strategies such as genome-wide mutation analysis (including exome and genome sequencing) has revealed more and more CPGs. However, information on these genes and their molecular mechanisms remains widely scattered. Many databases have emerged that focus on a particular class of cancer genes, but none focuses on CPGs. To fill this gap, a comprehensive CPG resource that collects and curates susceptibility genes from different sources is urgently needed. In this study, we first constructed a comprehensive database of cancer predisposition genes from five data sources. We then analyzed the copy number variants (CNVs) of CPGs by systematically investigating the relationship between somatic CNVs and gene expression changes of CPGs across cancer types (pan-cancer). The main work is summarized as follows:

1. A literature-based database of cancer predisposition genes was constructed as an easy-to-use resource for researchers. To provide a comprehensive resource of CPGs, we first collected and reviewed peer-reviewed literature from five sources: Rahman's data, PubMed abstracts, GeneReviews, Online Mendelian Inheritance in Man (OMIM), and Gene Reference Into Function (GeneRIF). After manual checking, we collected 827 human CPGs (724 protein-coding, 23 non-coding, and 80 labelled 'unknown type' in NCBI), 658 mouse CPGs, and 637 rat CPGs. To better understand these CPGs, we used data mining to collect annotation for each one, including general information, gene expression, methylation sites, post-translational modification (PTM) information, germline mutation data, interacting partners, pathway information, and drug information. On this basis, we constructed the cancer predisposition gene database dbCPG (http://bioinfo.ahu.edu.cn:8080/dbCPG/index.jsp). In dbCPG, users can browse, search, and upload CPG information. Finally, to assess the function of the 724 human protein-coding CPGs, we performed functional enrichment analysis using KOBAS and DAVID, and network analysis using the GenRev tools. dbCPG is the first database focused on cancer predisposition genes; it not only summarizes existing research results but also provides a more accessible data platform for cancer researchers.

2. Copy number variants (CNVs) in CPGs were analyzed using somatic mutation data. According to the "two-hit" hypothesis, cancer develops through the accumulation of germline and somatic mutations, so integrating germline and somatic data is important for identifying cancer genes and their molecular functions. Previous studies have shown that CNVs are involved in cancer susceptibility. In the present study, we used CNV data and gene expression data to examine the relationship between somatic CNVs and gene expression of CPGs. First, we identified 729 CPGs from dbCPG with precise CNV information. Applying a threshold of 2, we obtained 128 CPGs for which the number of samples with copy number loss (CNL) was at least twice the number of samples with copy number gain (CNG). Linking these CNV data with gene expression data yielded 49 CPGs with concordant CNL and decreased gene expression. Among them, five CPGs had more than 50 tumor samples: MTAP (216), PTEN (143), MCPH1 (86), SMAD4 (63), and MINPP1 (51). Moreover, to explore the common functions of these 49 human CPGs and better understand the cellular events associated with their CNL-induced decrease in expression, we performed a network analysis. Sub-network analysis revealed that these 49 CPGs were tightly connected. In summary, this is the first study to investigate CPGs with concordant CNL and down-regulation in pan-cancer. Despite some shortcomings, these results will help in understanding the biological function of CPGs in cancer development.
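The two-step selection described in the abstract (loss-dominant CNV, then concordant decrease in expression) can be illustrated with a minimal Python sketch. This is not the thesis's actual pipeline: the record layout, the function names, the use of a log2 fold change below zero to mean "decreased expression", and the toy gene names and counts are all assumptions made for illustration. Only the threshold of 2 comes from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneCnv:
    """Hypothetical per-gene summary; field names are illustrative only."""
    gene: str
    loss_samples: int   # number of samples with copy number loss (CNL)
    gain_samples: int   # number of samples with copy number gain (CNG)
    expr_log2fc: float  # assumed expression change measure (negative = decreased)


def loss_dominant(records, ratio=2.0):
    """Keep genes whose CNL sample count is at least `ratio` times the CNG count."""
    return [
        r for r in records
        if r.loss_samples > 0 and r.loss_samples >= ratio * r.gain_samples
    ]


def concordant_loss(records, ratio=2.0):
    """Of the loss-dominant genes, keep those whose expression also decreased."""
    return [r for r in loss_dominant(records, ratio) if r.expr_log2fc < 0]


# Toy data with placeholder names and made-up counts (not results from the thesis).
toy = [
    GeneCnv("GENE_A", 120, 10, -1.4),  # loss-dominant and down-regulated: kept
    GeneCnv("GENE_B", 40, 30, -0.8),   # 40 < 2 * 30: dropped at step 1
    GeneCnv("GENE_C", 60, 20, 0.5),    # loss-dominant but up-regulated: dropped at step 2
    GeneCnv("GENE_D", 0, 0, -2.0),     # no CNV samples at all: dropped at step 1
]

# Rank the surviving genes by the number of samples with copy number loss.
selected = sorted(concordant_loss(toy), key=lambda r: r.loss_samples, reverse=True)
print([r.gene for r in selected])
```

Keeping the ratio as a parameter makes it easy to check how sensitive the selected gene set is to the chosen threshold.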
Keywords/Search Tags:cancer predisposition gene, database, copy number variant, gene expression, network module