| Bioactivity is one of the most closely watched properties in the screening and design of lead compounds, a key step in drug research and development. Traditional drug discovery quantifies the bioactivity of small molecules binding to their targets through biological experiments, which demands substantial manpower, materials, and funding. The mainstream ligand-based virtual screening approach instead predicts ligand bioactivity with machine learning and then selects the ligands with high predicted activity for further lead compound screening. However, the success of machine learning in ligand-based virtual screening usually depends on abundant samples; when few active ligands are known, good performance is hard to obtain. To address this problem, this paper proposes a graph neural network based virtual screening method coupled with molecular similarity. Based on molecular similarity, the method introduces a triplet loss that imposes additional constraints on the predictions of the regression model, bringing domain knowledge into training to improve performance. We conducted experiments on five selected orphan GPCR datasets. Compared with six graph neural network algorithms, our algorithm achieves the best R², RMSE, MAE, and Tau. We also removed the molecular similarity constraint module for a comparative experiment: adding the constraint improved R² by 66.25%, RMSE by 12.72%, MAE by 18.63%, and Tau by 48.46% on average, which verifies the effectiveness of the method. Because real virtual screening means searching among a large number of inactive samples, this paper further proposes a graph neural network virtual screening method based on negative sample transfer. This method consists of homologous negative sample transfer, a domain adversarial network, and a regression classification model. The aim is that, through negative sample transfer and the improved model, the graph neural network can better distinguish negative samples and thereby learn data features more effectively. We again conducted experiments on five orphan GPCR datasets. Compared with six graph neural network algorithms, our algorithm is superior in R², RMSE, MAE, and Tau. Compared with the same algorithm without the modules related to negative sample transfer, R² improved by 94.22%, RMSE by 51.81%, MAE by 25.6%, and Tau by 69.48% on average. We also compare it with a conventional transfer learning method and with the graph neural network method based on molecular similarity to further verify the effectiveness and superiority of the algorithm. Based on molecular similarity and negative sample transfer, this paper thus proposes two model frameworks that improve the learning ability of graph neural networks in small-sample scenarios from different perspectives, which is valuable in some real virtual screening settings. |
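
The abstract does not state the exact form of the similarity-based triplet loss, so the following is only a minimal sketch of one plausible formulation, not the authors' implementation. It assumes that molecules are represented by fingerprint on-bit sets, that similarity is measured with the Tanimoto coefficient, and that the constraint asks the predicted activity of an anchor molecule to lie closer to that of its more similar neighbour than to that of its less similar neighbour by a margin. The names `tanimoto`, `triplet_constraint`, `combined_loss`, `margin`, and `weight` are all hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0  # convention for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def triplet_constraint(pred_anchor: float, pred_similar: float,
                       pred_dissimilar: float, margin: float = 0.5) -> float:
    """Hinge penalty (assumed form): zero when the anchor's prediction is
    closer to the similar molecule than to the dissimilar one by `margin`."""
    near = abs(pred_anchor - pred_similar)
    far = abs(pred_anchor - pred_dissimilar)
    return max(0.0, near - far + margin)


def combined_loss(preds: Sequence[float], targets: Sequence[float],
                  fingerprints: Sequence[Set[int]],
                  triplets: List[Tuple[int, int, int]],
                  margin: float = 0.5, weight: float = 1.0) -> float:
    """Mean squared regression error plus the weighted mean triplet penalty.

    Each triplet is (anchor, i, j); whichever of i and j has the higher
    Tanimoto similarity to the anchor is treated as the similar molecule.
    """
    mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)
    penalty, used = 0.0, 0
    for anchor, i, j in triplets:
        sim_i = tanimoto(fingerprints[anchor], fingerprints[i])
        sim_j = tanimoto(fingerprints[anchor], fingerprints[j])
        if sim_i == sim_j:
            continue  # no similarity ordering, so no constraint applies
        similar, dissimilar = (i, j) if sim_i > sim_j else (j, i)
        penalty += triplet_constraint(preds[anchor], preds[similar],
                                      preds[dissimilar], margin)
        used += 1
    if used:
        penalty /= used
    return mse + weight * penalty
```

In a graph neural network setting the same penalty would be computed on the model's differentiable outputs inside the training loop; plain Python floats are used here only to keep the sketch self-contained.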