
Research On Lung Cancer Related Gene Recognition Based On Network Topology And Controllability

Posted on: 2020-06-11
Degree: Master
Type: Thesis
Country: China
Candidate: S W Xiao
GTID: 2370330599961794
Subject: Control Engineering
Abstract/Summary:
With the advance of global industrialization and worsening environmental pollution, lung cancer has become the leading cause of cancer death worldwide. Identifying the important nodes of lung-related molecular networks can deepen our understanding of lung function and disease mechanisms. Following the order of the central dogma of molecular biology, this thesis addresses three problems: identifying key genes in the lung gene regulatory network, key transcription factors in the lung transcription factor regulatory network, and key proteins in the lung protein interaction network. The main research work of this thesis is as follows:

Genes are DNA fragments that carry genetic information. This thesis improves the controllability-based node classification method by merging categories. Applied to the lung gene regulatory network, the improved method divides all nodes into important nodes and unimportant nodes. Enrichment analysis on a lung cancer gene set, OMIM disease enrichment analysis, and KEGG pathway analysis show that the key genes represented by the important nodes are significantly enriched in lung cancer related gene sets and pathways. These key genes provide theoretical guidance for identifying drug target genes.

A transcription factor is a specialized protein that initiates gene transcription. This thesis improves the three-node model selection method based on motif analysis and principal component analysis. Applied to the lung transcription factor regulatory network, the improved method assigns a value to every node and ranks the nodes accordingly. UniProt biological significance analysis, DBTFLC interval coincidence rate comparative analysis, and KEGG pathway analysis show that the key transcription factors represented by the top 30 nodes in the ranking carry more lung cancer information in their database annotations. These key transcription factors represent key genes that can provide theoretical guidance for lung disease treatment and cancer detection.

Proteins are the products of the transcription and translation of genetic information. Using known proteins related to non-small cell lung cancer and small cell lung cancer, this thesis applies the shortest path analysis method to the lung protein interaction network and identifies the nodes that are significantly enriched in the shortest paths. GeneCards biological significance analysis, lung cancer gene set enrichment analysis, and KEGG pathway analysis show that the key proteins represented by these nodes are significantly enriched in lung cancer related proteins and pathways. These key proteins represent key genes that can provide theoretical guidance for research on cancer detection and disease treatment.
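
The shortest path analysis described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy network, the seed names (`S1`, `S2`, `S3`), and the helper functions `shortest_path` and `path_node_counts` are all hypothetical. The sketch assumes an undirected, unweighted interaction network, takes one breadth-first shortest path per pair of seed proteins, and counts how often each node appears as an intermediate node. The significance test that the abstract mentions is not reproduced here.

```python
from collections import Counter, deque
from itertools import combinations


def shortest_path(adj, source, target):
    """Return one shortest path from source to target (BFS), or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in adj.get(node, ()):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


def path_node_counts(adj, seeds):
    """Count how often each node lies strictly inside a seed-to-seed shortest path."""
    counts = Counter()
    for a, b in combinations(sorted(seeds), 2):
        path = shortest_path(adj, a, b)
        if path:
            counts.update(path[1:-1])  # drop the two endpoints
    return counts


# Hypothetical toy network: S1, S2, S3 stand in for known disease proteins.
edges = [("S1", "X"), ("X", "S2"), ("X", "S3"), ("S3", "Y"), ("Y", "Z")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

counts = path_node_counts(adj, {"S1", "S2", "S3"})
print(counts.most_common())  # [('X', 3)]
```

In this toy network, node `X` lies on every shortest path between seeds, so it would be the candidate key protein. On a real network, the counts would need to be compared against a null model (for example, randomly chosen seed sets) before any node is called significantly enriched.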
Keywords/Search Tags:Complex network, Important node mining, Lung, Structural controllability, Motif, Shortest path