Drug development is becoming increasingly labor-intensive, time-consuming, and costly, and the number of new drug filings is decreasing each year, which makes developing new drugs ever more difficult. Building on the experience of previous scientific work and on big-data-based data exchange, there is growing interest in low-cost, efficient methods that can be applied in the early stages of drug development to suggest potential new indications for existing drugs with high confidence, a technique known as drug repositioning. Although a number of researchers have proposed predictive models of great research value, most drug repositioning methods still lack techniques for comprehensively mining and systematically exploiting drug similarity and disease similarity. To address these problems, this thesis proposes two models. The specific research contents are as follows:

(1) A bipartite graph diffusion algorithm with multiple similarity mining for drug-disease association prediction (BGMSDDA). In the first step, the drug-disease association matrix is reconstructed using the weighted K nearest known neighbors (WKNKN) algorithm. In the second step, similarity features of drugs and diseases are extracted by integrating linear neighborhood similarity and Gaussian kernel similarity. In the third step, bipartite graph diffusion is used to infer undiscovered drug-disease associations. In 10-fold cross-validation experiments, BGMSDDA performed well on two datasets, with AUC values of 0.939 ± 0.001 (Fdataset) and 0.954 ± 0.001 (Cdataset) and AUPR values of 0.466 ± 0.001 (Fdataset) and 0.565 ± 0.001 (Cdataset). Furthermore, to evaluate the accuracy of the predictions made by BGMSDDA, we conducted case studies on three clinically used drugs selected from Fdataset and Cdataset and validated the predicted associated diseases of each drug against
some databases. Based on the results obtained, BGMSDDA was shown to be useful for predicting drug-disease associations.

(2) A compressed sensing algorithm for drug-disease association prediction based on central kernel alignment multiple kernel learning (DRPADC). In the first step, to address the sparsity of the original association matrix, the WKNKN algorithm is applied to the matrix to reduce its sparsity. In the second step, so that the various kinds of similarity information are used in a principled way, multiple similarity measures are fused by the central kernel alignment multiple kernel learning (CKA-MKL) algorithm. In the third step, a compressed sensing algorithm is used to predict the potential drug-disease association scores. To validate the performance of DRPADC, we used the drug-disease datasets Fdataset and Cdataset as gold standard datasets. In 10-fold cross-validation, DRPADC also performed well on both datasets, with AUCs of 0.941 ± 0.001 (Fdataset) and 0.955 ± 0.001 (Cdataset) and AUPRs of 0.521 ± 0.001 (Fdataset) and 0.607 ± 0.001 (Cdataset). In addition, to test the utility of DRPADC, three clinically used drugs were selected for case studies from each of Fdataset and Cdataset, and the predictions were validated against existing public databases and published research literature. These experimental results support the accuracy and reliability of our method.

(3) In this thesis, the BGMSDDA and DRPADC models are applied to predict potential therapeutic drugs for SARS-CoV-2. First, the two models were evaluated on the Human drug-virus association database (HDVD) under ten repetitions of 10-fold cross-validation. The corresponding AUC and AUPR values were then calculated from the prediction results and compared with those of several recent models. To further verify the applicability of the two models to HDVD, SARS-CoV-2 was selected as a case study. The above experimental results show that the BGMSDDA
and DRPADC models exhibit excellent predictive ability in the field of drug repositioning.
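
Both models begin by densifying the drug-disease association matrix with WKNKN, and BGMSDDA additionally uses a Gaussian kernel similarity. The following is a minimal sketch of how these two steps might look, not the implementation used in this thesis. It assumes the commonly described form of WKNKN, in which each row is estimated from its K most similar rows that have at least one known association, with weights that decay by a factor `eta` per rank, and a Gaussian kernel whose bandwidth is normalized by the mean squared norm of the association profiles. The function names and the parameters `k`, `eta`, and `gamma_prime` are illustrative assumptions.

```python
import math


def gaussian_kernel_similarity(profiles, gamma_prime=1.0):
    """Gaussian kernel between the rows (association profiles) of a 0/1 matrix."""
    n = len(profiles)
    # Assumed bandwidth rule: divide by the mean squared norm of the profiles.
    mean_sq_norm = sum(v * v for row in profiles for v in row) / n
    gamma = gamma_prime / mean_sq_norm if mean_sq_norm > 0 else gamma_prime
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def neighbour_estimate(assoc, sim, k, eta):
    """Estimate each row from its k most similar rows with known associations."""
    n_rows, n_cols = len(assoc), len(assoc[0])
    known = [i for i in range(n_rows) if any(assoc[i])]
    est = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        nbrs = sorted((i for i in known if i != r),
                      key=lambda i: sim[r][i], reverse=True)[:k]
        norm = sum(sim[r][i] for i in nbrs)
        if norm == 0:
            continue  # no usable neighbour: leave the estimate at zero
        for rank, i in enumerate(nbrs):
            # Weight decays with neighbour rank and scales with similarity.
            weight = (eta ** rank) * sim[r][i] / norm
            for c in range(n_cols):
                est[r][c] += weight * assoc[i][c]
    return est


def wknkn(assoc, drug_sim, disease_sim, k=2, eta=0.5):
    """Average the drug-side and disease-side estimates; never lower a known entry."""
    by_drug = neighbour_estimate(assoc, drug_sim, k, eta)
    by_disease = transpose(
        neighbour_estimate(transpose(assoc), disease_sim, k, eta))
    return [[max(float(a), (d + s) / 2) for a, d, s in zip(ra, rd, rs)]
            for ra, rd, rs in zip(assoc, by_drug, by_disease)]


# Toy data: 3 drugs x 2 diseases; the second drug has no known association.
assoc = [[1, 0], [0, 0], [0, 1]]
drug_sim = gaussian_kernel_similarity(assoc)
disease_sim = gaussian_kernel_similarity(transpose(assoc))
filled = wknkn(assoc, drug_sim, disease_sim)
```

In this toy example the Gaussian kernel stands in for whatever drug and disease similarity matrices are actually supplied; in practice the similarities would come from the datasets themselves. The point of the step is that a drug with no known associations receives nonzero scores borrowed from its neighbours, which reduces the sparsity that the later prediction steps would otherwise have to cope with.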