| Drug development is a long, complex process that consumes substantial time and resources. It usually starts with the discovery of a potential therapeutic target, followed by the screening of candidate compounds and a series of tests and evaluations, such as pharmacodynamics and safety, before the drug finally enters clinical trials. Even at the clinical trial stage, however, the success rate is low, so more efficient and accurate methods for predicting new indications of drugs are needed. Drug repositioning, as a drug development strategy, can shorten development cycles and reduce costs by taking existing drugs or compounds and finding their potential role in treating other diseases. Successful drug repositioning requires finding potential connections between drugs and diseases, which often calls for a combination of extensive bioinformatics data and analytical methods. In the past few years, deep learning has been widely applied in many fields; in drug development it has been used for molecular structure design and optimization, drug screening and evaluation, prediction of the biological activity and efficacy of drugs, and personalized treatment planning. Graph neural networks, a class of deep learning methods that operate on graph structures, are widely used in drug R&D because of their ability to handle unstructured data and their interpretability, accuracy and scalability. In this study, we propose a drug-disease association prediction method based on graph neural network collaborative filtering, which aims to exploit the information in the drug-disease interaction network and combine it with drug similarity to obtain better prediction performance. Specifically, this study first extracts collaborative signals between drugs and diseases from drug-disease interaction data, and then uses a graph neural network to transform these collaborative signals into embedding vectors of drugs. These embedding vectors can be used not only to
predict the therapeutic relationships between drugs and diseases, but also to calculate the similarity of therapeutic relationships between drugs. In addition, the similarity of drugs in terms of chemical structure, proteins and side effects is considered in this study in order to predict novel effects of drugs more comprehensively and accurately. Ultimately, the proposed method improves prediction accuracy by about 18% over existing collaborative filtering methods on the same data set. The method has the following components. (1) Processing of drug-related data and extraction of collaborative signals. Drug-related data (chemical structure, protein and side effects) and drug-disease interaction data are first obtained; missing values are then filled in from relevant websites, synonyms of drug names are converted, and drug names in the different data sources are mapped to each other. Afterwards, drug-disease association records that are not conducive to extracting collaborative signals through higher-order connectivity are removed, although the removed data can be used for later validation. Using the embedding propagation layers of the graph neural network, the collaborative signal hidden in the drug-disease interaction data is captured through continued training; this refines the embedding vectors of drugs so that they reflect the therapeutic relationship characteristics of the drugs. (2) Construction of a drug repositioning prediction model based on graph neural network collaborative filtering. Existing machine learning and biological network approaches calculate similarity from drug-related data or use drug-disease associations to build graphs for prediction, paying little attention to the information hidden in known drug-disease associations. This study uses graph neural networks to enhance collaborative filtering by merging drug affinity with traditional inter-drug similarity, using the multilayer embedding vectors of
drugs to calculate inter-drug affinity after the similarity of drug-related data (chemical structure, protein and side effects) has been calculated using three measures: cosine similarity, Jaccard similarity and Smith-Waterman sequence alignment similarity. Neighbor selection is then performed, and finally new indications of the drug are recommended based on the neighbors of the target drug. In addition, because the gold-standard dataset used in this study lacks recent clinical results, the false positives among the predictions were compared with existing clinical trials; additional predicted new indications were found to have demonstrated therapeutic effect, which further verifies the validity of the model. |
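
The neighbor-based recommendation step described in component (2) can be illustrated with a minimal sketch. It is not the study's implementation: the drug and disease names, the toy embedding vectors, the equal weighting `alpha = 0.5` used to merge embedding affinity with side-effect similarity, and the helper names `jaccard`, `cosine`, `fused_similarity` and `recommend` are all assumptions made for illustration. The Smith-Waterman protein similarity is omitted for brevity, and the embeddings are hard-coded here rather than learned by a graph neural network.

```python
from math import sqrt


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two feature sets (e.g. side effects)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u: list, v: list) -> float:
    """Cosine similarity between two vectors (e.g. drug embeddings)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy data; in the study the embeddings would come from the
# embedding propagation layers of the graph neural network.
embeddings = {
    "drug_a": [1.0, 0.0],
    "drug_b": [0.8, 0.6],
    "drug_c": [0.0, 1.0],
}
side_effects = {
    "drug_a": {"nausea", "headache"},
    "drug_b": {"nausea", "headache", "rash"},
    "drug_c": {"insomnia"},
}
treats = {
    "drug_a": {"disease_1"},
    "drug_b": {"disease_1", "disease_2"},
    "drug_c": {"disease_3"},
}


def fused_similarity(d1: str, d2: str, alpha: float = 0.5) -> float:
    """Merge embedding affinity with side-effect similarity.

    The linear weighting by alpha is an assumption of this sketch; the
    abstract does not specify how the two are merged.
    """
    affinity = cosine(embeddings[d1], embeddings[d2])
    similarity = jaccard(side_effects[d1], side_effects[d2])
    return alpha * affinity + (1 - alpha) * similarity


def recommend(target: str, k: int = 1) -> list:
    """Rank diseases not yet linked to `target` using its k nearest neighbors."""
    others = [d for d in treats if d != target]
    neighbors = sorted(
        others, key=lambda d: fused_similarity(target, d), reverse=True
    )[:k]
    scores = {}
    for neighbor in neighbors:
        weight = fused_similarity(target, neighbor)
        # Only diseases the target drug is not already known to treat.
        for disease in treats[neighbor] - treats[target]:
            scores[disease] = scores.get(disease, 0.0) + weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(recommend("drug_a", k=1))
```

In this toy setting, `drug_b` is the nearest neighbor of `drug_a`, so the only candidate indication for `drug_a` is `disease_2`. A real pipeline would replace the hand-written embeddings with learned ones and add the chemical structure and protein similarities alongside the side-effect similarity.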