
SMILESynergy: A Method Of Anticancer Drug Synergy Prediction Based On Transformer Pre-training Model

Posted on: 2024-08-22
Degree: Master
Type: Thesis
Country: China
Candidate: L Q Zhang
GTID: 2544307139455944
Subject: Computer Science and Technology
Abstract/Summary:
Synergistic drug combinations can overcome the acquired resistance that arises with single-drug therapies in the treatment of complex diseases, especially in cancer therapy, where combination therapies are widely used. Because complex interactions may exist between the components of a drug combination, predicting drug-drug interactions quickly and accurately is critical to designing effective combinations. Traditional drug interaction studies require extensive laboratory tests and clinical trials, which are costly, time-consuming, and inefficient, especially in anticancer drug development. To address this problem, computational methods such as machine learning and deep learning are increasingly used for high-throughput screening, as they can shorten drug development cycles and reduce costs.

Research on anticancer drug synergy prediction is mainly based on traditional machine learning and deep learning methods, in which the drug molecules are featurized as input data and a trained neural network model then produces the final prediction. Most of the innovations in these studies come from changes to the structure of the prediction model and to the pre-processing of the data. With the success of deep learning in NLP in recent years, the scheme of a pre-trained model plus a downstream task, represented by the Transformer, has become popular in medicinal chemical reaction prediction, medicinal chemical synthesis, and drug molecule optimization. This approach starts with a pre-training process in which large-scale datasets of drug compounds and related biomedical literature can be used to learn semantic representations of drug molecules. In downstream tasks, the pre-trained model can be combined with an appropriate structure (e.g., fully connected networks, convolutional neural networks, or residual neural networks) to learn drug-drug interactions and give prediction results.

Some researchers have tried to apply Transformer models to drug synergy prediction, but because their input data are not in text form, the Transformer encoder-decoder structure in the main part of the model does not use word vector embedding or positional encoding. As a result, the model loses the ability to capture the molecular structure of the drugs and the distance features between positions in the combined drug sequence.

To address this limitation, we propose a new method: SMILESynergy. The method takes the textual Simplified Molecular Input Line Entry System (SMILES) representation of drugs as input and uses a Transformer as the pre-trained model to predict the synergy of drug combinations. SMILES is a commonly used molecular structure representation that converts complex chemical structures into simple strings, which facilitates data processing and analysis. The SMILES enumeration technique provides data augmentation for drug combinations, improving the robustness and generalization of the model. In the SMILESynergy approach, the pre-training model encodes drug data using a Transformer that includes word vector embedding and positional encoding. The Transformer is a neural network model based on the self-attention mechanism, with strong expressive power and adaptability, which makes it well suited to natural language processing and other sequence data tasks. During pre-training, the model extracts latent features of drug molecules by learning from a large amount of unlabeled data. In downstream tasks, the pre-trained model is connected to a multilayer perceptron (MLP) to predict the synergistic effects of drug combinations in both regression and classification tasks.

In experimental validation on the O’Neil and NCI-ALMANAC datasets, the SMILESynergy model achieves a mean squared error of 51.3 in regression analysis and an accuracy of 96.9% in classification analysis. Its accuracy is higher than that of other common drug combination prediction models, such as DeepSynergy and Mulinput Synergy. This indicates that the SMILESynergy method has good practicality and applicability, and can provide new ideas and methods for drug combination research and anticancer drug development.
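
The abstract describes feeding SMILES text for a drug pair into a Transformer with token embedding and positional encoding. The following is a minimal sketch of that input stage only, under stated assumptions: the regular-expression tokenizer, the `<sep>` separator token, and the function names are hypothetical illustrations, and the sinusoidal encoding is the one from the original Transformer paper, which the thesis may or may not use. It is not the author's implementation.

```python
import math
import re

# Assumed tokenizer (not from the thesis): bracket atoms, the two-letter
# halogens Br and Cl, two-digit ring closures (%NN), then any single character.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return TOKEN_RE.findall(smiles)


def encode_pair(smiles_a, smiles_b, sep="<sep>"):
    """Join two drugs into one token sequence; the separator is hypothetical."""
    return tokenize(smiles_a) + [sep] + tokenize(smiles_b)


def positional_encoding(length, d_model):
    """Sinusoidal positional encoding: sin on even dimensions, cos on odd."""
    pe = [[0.0] * d_model for _ in range(length)]
    for pos in range(length):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


if __name__ == "__main__":
    # Ethanol paired with benzene, purely as an illustrative input.
    tokens = encode_pair("CCO", "c1ccccc1")
    pe = positional_encoding(len(tokens), 8)
    print(tokens)
    print(len(pe), len(pe[0]))
```

In a full model, each token would be mapped to a learned embedding vector and the matching row of the positional encoding added to it, so that the attention layers can distinguish positions within the combined drug sequence.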
Keywords/Search Tags:Synergy, Deep Learning, Attention, Transformer