
Research On Identification Of Essential Genes And Prognostic Gene Signatures

Posted on: 2021-01-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Qin
GTID: 1360330614972298
Subject: Computer Science and Technology
Abstract/Summary:
Genomics, transcriptomics, and proteomics data are growing rapidly, and many biological mechanisms have been revealed. Identifying genes with special functions from multi-omics data has become an important research task. Essential genes are the genes of an organism that are thought to be critical for its survival. A prognostic gene signature is a group of genes associated with patient survival.

Identifying essential genes is necessary not only for disease diagnosis and drug design but also for understanding the molecular mechanisms of cellular life. Proteins are gene products formed by translation of mature mRNA molecules, and a protein-protein interaction (PPI) network connects proteins that interact with each other. Because experimental methods require considerable time and resources, many computational methods based on the PPI network have been proposed. However, the prediction accuracy of these computational methods for essential genes still requires improvement.

Prognostic gene signatures are biological characteristics that are objectively measured and evaluated to predict the course of a disease or the response to a therapeutic intervention among patients with the same characteristic. They thus facilitate individual treatment choice and aid in patient counselling. Neuroblastoma is the most common extracranial solid tumor and usually occurs in early childhood. For neuroblastoma patients, many classic prognostic markers, such as stage, DNA ploidy, transcription instability, and MYCN amplification, have been used to predict patient outcome. Owing to the lack of copy number variation (CNV) datasets, a systematic analysis of how frequent gains/losses of chromosome bands and MYCN aberrations relate to patient prognosis has yet to be carried out.

This dissertation studies the identification of essential genes and prognostic gene signatures. The main results are as follows:

Topology-based method for identifying essential genes. Because identifying essential genes from PPI topological features alone gives poor accuracy, we proposed a new topology-based method, LBCC, which combines local density, global density, and protein complex information. Because many essential proteins are located in dense subgraphs, we proposed the densities Den1 and Den2 of a node v to describe its local properties in the network. A strategy combining Den1, Den2, BC, and IDC, called LBCC, was then developed to improve the prediction precision. The experimental results demonstrated that LBCC outperformed traditional topological measures for predicting essential genes. LBCC also improved the prediction precision by approximately 10 percent on the YMIPS and YMBD datasets compared with the most recently developed method, LIDC. Finally, we applied LBCC to a human PPI network and found 5 novel potential essential genes. In summary, the densities Den1 and Den2 describe the local properties of a node in the network and were used to improve the prediction precision for essential genes.

Random forest-based method for identifying essential genes. To further improve the accuracy of identifying essential proteins, a random forest-based method was proposed. We proposed new measures of orthologous information and of subcellular localization associated with essentiality, and then a computational strategy named CoTB, which identifies essential genes by combining topological properties, subcellular localization information, and orthologous protein information in a random forest model. The experimental results showed that CoTB outperformed traditional computational methods and the most recently developed method, SON. In particular, our method improved the prediction accuracy to 89, 78, 79, and 85 percent on the YDIP, YMIPS, YMBD, and YHQ datasets, respectively, at the top 100 level. Finally, we applied CoTB to a human PPI network and found 5 potential essential genes. In summary, we proposed new measures of orthologous information and subcellular localization information, together with a new computational strategy that combines different attributes in a random forest model to improve the prediction precision.

Identifying a prognostic gene signature associated with chromosome abnormality. To overcome the shortage of experimental data on chromosome abnormality, a computational method was proposed to infer the CNVs of sample-specific chromosome sub-bands from neuroblastoma gene expression profiles. The resulting inferred CNVs (iCNVs) were highly correlated with the experimentally determined CNVs, demonstrating that CNVs can be accurately inferred from gene expression profiles. Using this iCNV metric, we identified 58 frequently gained/lost chromosome sub-bands that were significantly associated with patient survival. Furthermore, 7 chromosome sub-bands remained significantly associated with patient survival even when clinical factors, such as MYCN status, were considered. In particular, the genes located on chromosome sub-band Chr11p14 have high potential as a novel candidate prognostic gene signature for clinical use. Additionally, this computational framework could be readily extended to other cancer types, such as leukemia. In summary, we proposed a new method to infer the CNVs of neuroblastoma from gene expression profiles and found a novel candidate prognostic gene signature.

Identifying a prognostic gene signature associated with MYCN and chromosome abnormality in neuroblastoma. Neuroblastoma patient prognosis is closely associated with MYCN and chromosome abnormality. A computational method was proposed to define gene signatures that reflect MYCN and chromosomal aberrations, including deletion of chromosome 1p (Chr1p_del) and 11q (Chr11q_del) as well as whole loss of chromosome 11q (Chr11q_wls). Patient-specific MYCN, Chr1p_del, Chr11q_del, and Chr11q_wls scores were calculated with a previously published method called BASE. The results showed that these gene signatures reflect the status of MYCN and of the chromosomes. We integrated the MYCN score, Chr1p_del score, Chr11q_del score, Chr11q_wls score, and clinical variables into an integrative prognostic model, which significantly outperformed the clinical variables or each genomic aberration alone. The model can therefore serve as a novel candidate prognostic gene signature that strengthens outcome prediction and could provide insights for further therapeutic interventions and surveillance programs for neuroblastoma patients. In summary, we proposed a method to define gene signatures for MYCN and chromosomal aberrations, and using these gene signatures improved the power of outcome prediction for neuroblastoma patients.
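
The general workflow behind the topology-based results above (score each protein from the PPI network, rank the proteins, and measure precision among the top-ranked ones) can be illustrated with a minimal sketch. This is not the LBCC method: the abstract does not define Den1, Den2, BC, or IDC, so the sketch substitutes two simple stand-in scores (node degree and the edge density of a node's immediate neighborhood) and combines them by summing min-max normalized values. All function names and the toy network are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            adj[a].add(b)
            adj[b].add(a)
    return adj


def neighborhood_density(adj, v):
    """Edge density of the subgraph induced by v and its neighbors.

    ASSUMPTION: a stand-in local density score, not the Den1/Den2
    definition from the dissertation.
    """
    nodes = adj[v] | {v}
    n = len(nodes)
    if n < 2:
        return 0.0
    # Summing neighbor counts inside the subgraph counts each edge twice.
    links = sum(len(adj[u] & nodes) for u in nodes) / 2
    return links / (n * (n - 1) / 2)


def min_max(scores):
    """Rescale scores to [0, 1]; all zeros if every score is equal."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: (s - lo) / span if span else 0.0 for k, s in scores.items()}


def rank_proteins(edges):
    """Rank proteins by the sum of normalized degree and local density."""
    adj = build_graph(edges)
    degree = min_max({v: len(adj[v]) for v in adj})
    density = min_max({v: neighborhood_density(adj, v) for v in adj})
    combined = {v: degree[v] + density[v] for v in adj}
    # Highest score first; ties broken by name so the order is deterministic.
    return sorted(combined, key=lambda v: (-combined[v], v))


def precision_at_k(ranking, known_essential, k):
    """Fraction of the top-k ranked proteins that are known to be essential."""
    top = ranking[:k]
    return sum(p in known_essential for p in top) / len(top) if top else 0.0


if __name__ == "__main__":
    # Toy network and toy labels, for illustration only.
    toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("D", "E")]
    ranking = rank_proteins(toy_edges)
    print(ranking)
    print(precision_at_k(ranking, {"A", "B"}, k=2))
```

Evaluating "at the top 100 level", as in the reported results, would correspond to calling `precision_at_k` with `k=100` on a ranking of a full PPI network against a set of experimentally confirmed essential proteins.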
Keywords: Essential gene, Prognostic signature, Protein-protein interaction network, Gene expression profile, Chromosome copy number