Accurately identifying the interactions between genomic factors and cancer drug response plays an important role in drug discovery, drug repositioning, and cancer treatment. A number of studies have revealed that interactions between genes and drugs are ‘many-genes-to-many-drugs’ interactions, i.e., common modules, as opposed to ‘one-gene-to-one-drug’ interactions. Such modules more fully explain the interactions between complex biological regulatory mechanisms and drugs. Therefore, this thesis reviews existing algorithms and proposes a new modular feature matching method for high-dimensional omics data based on network analysis. First, it introduces the research significance and application value of common module identification. Then, from the perspective of machine learning, it provides a detailed evaluation of three types of state-of-the-art common module identification methods: those based on non-negative matrix factorization, partial least squares, and network analysis. Subsequently, in view of the shortcomings of these methods, it proposes a new modular feature matching model for high-dimensional omics data based on network analysis and gives a solution to the model. The model combines a high-order similarity tensor, hypergraph-based prior knowledge network constraints, and sparsity constraints, and obtains the common modules through joint optimization. Finally, the model is evaluated in experiments on simulated data and real-world data. Two sets of comparative experiments, on noise contamination and on outlier interference, show that the proposed model performs well in both scenarios. The real-data experiment uses gene expression data for 2091 genes and drug response data for 101 drugs on 392 cell lines. Comparison with several state-of-the-art methods, together with biological validation, shows that the proposed method performs well and that its output is biologically meaningful. The main contributions
of the model proposed in this thesis are: 1) It uses a high-order similarity tensor that reflects the many-to-many relationship, which weakens the interference of noise and outliers in the input data. 2) The identification of common modules is integrated into the iterative optimization of the objective function, which overcomes the shortcomings of the decoupled strategy used by current methods. 3) It uses a hypergraph to fuse multiple prior knowledge networks, which strengthens the prior knowledge constraint compared with a single prior knowledge network.