| Cancer is a serious threat to human health. A major challenge in cancer research and treatment is how to improve the cure rate and survival rate of cancer patients. Medical research has found that cancer is caused not by a single gene abnormality but by the synergy of multiple disease-causing genes. Cancer functional modules are sets of genes that lead to the occurrence, development and progression of cancer. Identifying cancer functional modules not only enables in-depth study of the pathogenesis and evolution of cancer, but also guides its clinical diagnosis and treatment. Therefore, the accurate and effective identification of cancer functional modules in massive data has become a hotspot in cancer research. In this study, complex relationships between genes are represented by complex network models, from which multiple specific subnetworks are identified. Studying the relationships among the member genes within a subnetwork can systematically reveal the pathogenic mechanism of cancer, thus helping to formulate new strategies for cancer treatment. In view of this, the main research contents of this paper are as follows: (1) Research on an identification algorithm for oncogenic driver modules based on a network model. The oncogenic driver module is a set of genes that lead to cancer in the precancerous stage. To address the low accuracy of existing oncogenic driver module identification algorithms, this study proposes a new driver module identification method, named ECSWalk. Firstly, an undirected weighted network is constructed based on the characteristics of high mutual exclusion, high coverage and high topological similarity among genes. Secondly, a directed weighted network is constructed using a random walk with restart, and the strong connectivity principle of directed graphs is used to create initial candidate modules with a certain number of genes. Finally, the large modules among the candidates are split using an induced subgraph
method, and the small modules are expanded using a greedy strategy to obtain the optimal driver modules. In a comparison with the MEXCOWalk and HotNet2 algorithms on pan-cancer data, the Accuracy of ECSWalk is on average 27.78% higher than that of HotNet2, the second-best performer on that measure, and its F-measure is on average 35.18% higher than that of MEXCOWalk, the second-best performer on that measure. The experimental results show that ECSWalk not only identifies oncogenic driver modules more efficiently and accurately, but also identifies new candidate gene sets with higher biological relevance and statistical significance. The findings have theoretical and practical value for cancer diagnosis, treatment and drug target identification. (2) Research on an identification algorithm for cancer dysregulated modules based on a network model. Most cancers are quite occult, so early cancer often goes undiagnosed and untreated in time, which is very likely to lead to its further development and progression. The dysregulated module is the set of genes that have a significant impact on the development and progression of cancer during the progression stage. To address the low accuracy of existing cancer dysregulated module identification algorithms, this study proposes a network model-based identification algorithm for cancer dysregulated modules, named Netkmeans. Firstly, an undirected weighted network is constructed based on the characteristics of high mutual exclusion, high coverage and high aggregation among genes. Secondly, the number of clusters is selected using a newly created comprehensive evaluation function for the K value. Finally, the K-means clustering method is applied to identify the optimal dysregulated modules. Compared with the results of the IBA and CCEN methods, the results of the Netkmeans algorithm have higher statistical significance and biological relevance. Compared with the MCODE, CFinder and ClusterONE algorithms, the Netkmeans algorithm on TCGA
endometrial cancer data scores 16.6%, 25% and 16.6% higher than the next-best-performing MCODE algorithm in Precision, Accuracy and F-measure, respectively. The experimental results demonstrate that, in an exploratory analysis of endometrial cancer data from TCGA, several dysregulated modules identified by the Netkmeans algorithm are crucial in the development and progression of endometrial cancer. The findings play a crucial role in the precise diagnosis and treatment of cancer patients and in the development of new drugs. |
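
The second step described for ECSWalk in item (1), a random walk with restart followed by grouping genes through strong connectivity, can be illustrated with a minimal sketch. This is not the ECSWalk implementation: the abstract does not state how directed edges are derived from the walk, so the thresholding rule below is an assumption, as are the toy gene names, the edge weights, the restart probability of 0.4, the threshold of 0.1 and all function names. The splitting and greedy expansion steps are omitted.

```python
from collections import defaultdict


def random_walk_with_restart(weights, seed, restart=0.4, tol=1e-10, max_iter=1000):
    """Visiting probabilities of a walk that jumps back to `seed` with
    probability `restart` at every step (computed by power iteration).

    `weights` maps each node to a dict {neighbour: positive edge weight}.
    """
    nodes = sorted(weights)
    p = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(max_iter):
        nxt = {n: (restart if n == seed else 0.0) for n in nodes}
        for u in nodes:
            total = sum(weights[u].values())
            if total == 0:
                # Isolated node: return its probability mass to the seed.
                nxt[seed] += (1 - restart) * p[u]
                continue
            for v, w in weights[u].items():
                nxt[v] += (1 - restart) * p[u] * w / total
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


def directed_edges(weights, threshold, restart=0.4):
    """Assumed rule: keep edge u -> v when the walk seeded at u visits v
    with probability of at least `threshold`."""
    edges = defaultdict(set)
    for u in weights:
        scores = random_walk_with_restart(weights, u, restart)
        for v, score in scores.items():
            if v != u and score >= threshold:
                edges[u].add(v)
    return edges


def strongly_connected_components(nodes, edges):
    """Tarjan's algorithm (recursive, so intended for small graphs)."""
    index, low, on_stack, stack, comps = {}, {}, set(), [], []

    def visit(u):
        index[u] = low[u] = len(index)
        stack.append(u)
        on_stack.add(u)
        for v in edges.get(u, ()):
            if v not in index:
                visit(v)
                low[u] = min(low[u], low[v])
            elif v in on_stack:
                low[u] = min(low[u], index[v])
        if low[u] == index[u]:
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == u:
                    break
            comps.append(comp)

    for n in nodes:
        if n not in index:
            visit(n)
    return comps


# Toy undirected weighted network: two tight triangles joined by a weak edge.
toy = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"A": 1.0, "B": 1.0, "D": 0.05},
    "D": {"C": 0.05, "E": 1.0, "F": 1.0},
    "E": {"D": 1.0, "F": 1.0},
    "F": {"D": 1.0, "E": 1.0},
}

scores_from_a = random_walk_with_restart(toy, "A")
modules = strongly_connected_components(sorted(toy), directed_edges(toy, threshold=0.1))
print(sorted(sorted(m) for m in modules))
```

In this toy network the weak edge between the two triangles carries too little probability mass to pass the threshold in either direction, so each triangle forms its own candidate module. In practice the threshold and restart probability would have to be tuned to the data.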