
Study Of The Cancer Driver Gene Identification Based On The Genetic Network Modelling

Posted on: 2021-05-27, Degree: Master, Type: Thesis
Country: China, Candidate: Y Dai, Full Text: PDF
GTID: 2370330623958506, Subject: Engineering
Abstract/Summary:
Cancer has become a major public health problem and one of the leading causes of death. In cancer research, identifying cancer driver genes has long been an important direction, and precision medicine based on driver genes is an important means of cancer treatment. With advances in gene sequencing technology, millions of somatic mutations have been reported in the past decade. However, identifying the driver genes that carry oncogenic mutations from these data remains a very challenging task. Many advanced algorithms have been proposed to identify driver genes, but few attempts have been made to combine network structure information with biological information.

This thesis uses complex network and machine learning methods to study the mining of cancer driver genes. It combines a variety of feature extraction and comparative analysis methods, extracts gene features from three perspectives (gene network features, gene attribute features, and integrated network and attribute features), and demonstrates the feasibility of the approach through comparative analysis from different perspectives. Finally, using an improved random forest classification algorithm, it reveals the important factors affecting the occurrence and development of cancer and identifies potential cancer driver genes, which provides guidance for clinical cancer research and for the mining of driver genes. The main work includes:

(1) Cancer gene network analysis based on complex network theory. Constructing the cancer gene network, analyzing how the network structure changes during the occurrence and development of cancer, and mining the network features of genes are the first issues studied in this thesis. The analysis of the network structure covers the changes in the network structure of the driver genes between the Normal network and the Tumor network, the binding mechanism of the driver genes, and the distribution of the eigenvalues of the driver genes and the non-driver genes in the Tumor network.

(2) Cancer driver gene prediction based on complex networks and machine learning. The cancer driver gene mining algorithm is the other important issue in this thesis. This part studies and analyzes the importance of single features, the importance of structural versus non-structural features, the difference between using and not using the gene network, and the prediction results of the model. Results on seven different cancer types show that the proposed algorithm consistently achieves high prediction accuracy: the AUC scores of the model are 0.987, 0.991, 0.994, 0.995, 0.989, 0.989 and 0.986, and the overlap ratio between the prediction results and the Cancer Gene Census (CGC) database reaches 40% or more, both of which are superior to the existing advanced method. Further analysis also shows that integrating network features benefits the mining of cancer driver genes.
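The two evaluation measures reported in the abstract, the AUC score and the overlap ratio with the CGC database, can be illustrated with a minimal sketch. It is not the thesis's implementation. The function names, the placeholder gene identifiers and the toy scores are all hypothetical, and the abstract does not define the overlap ratio, so the sketch assumes it is the fraction of predicted genes that also appear in the reference set.

```python
def auc_score(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen driver gene
    (label 1) receives a higher score than a randomly chosen non-driver
    gene (label 0). Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def overlap_ratio(predicted, reference):
    """Fraction of predicted genes that also appear in the reference set.
    Assumption: the denominator is the number of predicted genes."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(reference)) / len(predicted)


# Toy data only: placeholder gene names and made-up classifier scores.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.2, 0.1]
toy_predicted = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
toy_reference = ["GENE_A", "GENE_C", "GENE_X"]

if __name__ == "__main__":
    print("AUC:", auc_score(toy_labels, toy_scores))
    print("Overlap ratio:", overlap_ratio(toy_predicted, toy_reference))
```

In practice the scores would be the driver-class probabilities output by the classifier for each gene, and the reference set would be the gene list taken from the CGC database.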
Keywords/Search Tags:cancer driver gene, complex network, machine learning, cancer gene census