| Exploring the biomarkers associated with cancer development is crucial for diagnosing cancer, designing therapeutic interventions, and predicting prognoses. Gene co-expression analysis provides a systemic perspective on gene networks and can be a valuable tool for mining biomarkers. The main objective of co-expression network analysis is to discover highly synergistic sets of genes, and the most widely used method is weighted gene co-expression network analysis (WGCNA). WGCNA measures gene correlation with the Pearson correlation coefficient and uses hierarchical clustering to identify gene modules. However, the Pearson correlation coefficient reflects only the linear dependence between variables, and the main drawback of hierarchical clustering is that once two objects are clustered together, the merge cannot be reversed, so inappropriate cluster divisions cannot be readjusted. In addition, existing co-expression network analysis methods are unsupervised and do not use prior biological knowledge for module delineation. Here, we propose a weighted gene co-expression network analysis method based on sample characteristic constraints, which uses knowledge injection semi-supervised learning (KISL) to identify outstanding modules in the co-expression network. By combining prior biological knowledge with semi-supervised clustering, the method addresses these limitations of current clustering methods based on gene co-expression networks (GCNs). Because gene-gene relationships are complex, we use distance correlation to measure both the linear and nonlinear dependence between genes. Five RNA-seq datasets of cancer samples are used to validate the method's effectiveness. On all five datasets, the proposed algorithm outperformed WGCNA in terms of the silhouette coefficient, the Calinski-Harabasz index, and the Davies-Bouldin index, indicating that KISL produces better-aggregated gene modules. Enrichment analysis of the identified modules demonstrated the method's effectiveness in discovering modular structures in biological co-expression networks. In conclusion, the proposed algorithm can serve as an alternative approach to similarity-based weighted co-expression network analysis of biological data. |
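
The abstract's motivation for distance correlation can be illustrated with a small example. The following is a minimal sketch in pure Python, not the authors' KISL implementation: the function names (`distance_correlation`, `pearson`) and the toy expression profiles (`expr_a`, `expr_b`) are hypothetical and chosen only for illustration. It computes the standard sample distance correlation from double-centered pairwise distance matrices and compares it with the Pearson coefficient on a purely nonlinear (quadratic) relationship.

```python
import math


def _centered_distances(values):
    """Return the double-centered matrix of pairwise absolute differences."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two equally long 1-D profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    # Squared sample distance covariance and distance variances.
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar2_x = sum(v * v for row in a for v in row) / n**2
    dvar2_y = sum(v * v for row in b for v in row) / n**2
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0:
        return 0.0  # one profile is constant
    return math.sqrt(max(dcov2, 0.0) / denom)


def pearson(x, y):
    """Pearson correlation coefficient, for comparison."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    var_x = sum((p - mean_x) ** 2 for p in x)
    var_y = sum((q - mean_y) ** 2 for q in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy profiles: expr_b depends on expr_a quadratically.
expr_a = [-2.0, -1.0, 0.0, 1.0, 2.0]
expr_b = [v * v for v in expr_a]

print("Pearson:", pearson(expr_a, expr_b))
print("Distance correlation:", distance_correlation(expr_a, expr_b))
```

On this symmetric toy input the Pearson coefficient vanishes even though `expr_b` is fully determined by `expr_a`, whereas the distance correlation is clearly positive. This naive version needs quadratic time and memory per gene pair, so a real analysis over thousands of genes would likely need a vectorized or otherwise optimized implementation.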