| Early detection, rapid diagnosis and effective treatment of cancer have always been among the goals of modern medicine. The combination of pharmacogenomics and personalized medicine aims to minimize adverse effects and maximize efficacy. This approach relies on whole-genome sequencing technology and a large number of clinical trials, in which anticancer drug response data (also known as drug sensitivity data) are an important input for treatment protocols. However, real anticancer drug sensitivity data are often missing, damaged or distorted, which hinders the development of personalized anticancer treatment plans. Accurately predicting the sensitivity of tumor cell lines to drugs is therefore an important research topic. Using the datasets of the Cancer Genome Project (CGP) and the Cancer Cell Line Encyclopedia (CCLE), both published in Nature, this thesis predicts the sensitivity of cell lines to anticancer drugs. First, based on the similarity hypothesis that cell lines with similar gene characteristics respond similarly to drugs with similar chemical structures, a new anticancer drug sensitivity prediction model, the MS model, is proposed on the basis of matrix completion and similarity constraints. The model uses the gene expression data of the cell lines and the chemical structure information of the drugs to extract cell line similarity information and drug similarity information. Second, 10-fold cross-validation shows that the drug similarity constraint in the MS model contributes substantially to the prediction, while the cell line similarity constraint contributes little. To reduce the complexity of the model without degrading its predictive performance, the cell line similarity constraint is then removed from the MS model, which yields the MSD model. The final results show that the MSD model has better prediction performance on the two databases than some of the
published classic models. In addition, the sensitivity values predicted by the model are used for a two-class problem (whether a cell line is sensitive or resistant to the drug), and the accuracy reaches 0.7 on the CCLE dataset, which is higher than that of the other models compared. This again confirms the good predictive performance of the MSD model. These results indicate that the MSD model is an effective tool for predicting anticancer drug response values. |
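
The idea of matrix completion with a drug similarity constraint can be illustrated with a minimal sketch. The abstract does not give the MSD objective, so everything below is an assumption made for illustration: a low-rank factorization fitted by gradient descent, a graph-Laplacian penalty built from a drug similarity matrix, and the hypothetical names `fit_similarity_mc`, `lam`, `gamma` and `S_drug`. It should not be read as the thesis's actual formulation.

```python
import numpy as np

def fit_similarity_mc(R, S_drug, rank=2, lam=0.1, gamma=0.1,
                      lr=0.02, iters=3000, seed=0):
    """Complete a cell-line x drug response matrix R (NaN = missing).

    Assumed objective (illustrative, not taken from the thesis):
      0.5 * ||mask * (U V^T - R)||_F^2
      + lam/2 * (||U||_F^2 + ||V||_F^2)
      + gamma/2 * tr(V^T L V),   L = graph Laplacian of S_drug
    """
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(R)                # True where a response was observed
    R0 = np.nan_to_num(R)              # missing entries become 0, masked out below
    n_cells, n_drugs = R.shape
    U = 0.1 * rng.standard_normal((n_cells, rank))   # cell-line factors
    V = 0.1 * rng.standard_normal((n_drugs, rank))   # drug factors
    L = np.diag(S_drug.sum(axis=1)) - S_drug         # Laplacian of drug similarity
    for _ in range(iters):
        E = mask * (U @ V.T - R0)      # residual on observed entries only
        grad_U = E @ V + lam * U
        grad_V = E.T @ U + lam * V + gamma * (L @ V)  # pulls similar drugs together
        U -= lr * grad_U
        V -= lr * grad_V
    return U @ V.T                     # dense prediction, including missing cells

# Toy data (fabricated for the demo): 4 cell lines, 3 drugs, rank-1 responses.
R = np.outer([0.5, 1.0, 1.5, 2.0], [1.0, 0.8, 0.6])
R_missing = R.copy()
R_missing[0, 1] = np.nan               # hide two entries to be predicted
R_missing[2, 2] = np.nan
S_drug = np.array([[1.0, 0.9, 0.2],    # symmetric drug-drug similarity
                   [0.9, 1.0, 0.3],
                   [0.2, 0.3, 1.0]])
pred = fit_similarity_mc(R_missing, S_drug)
```

In this sketch, `gamma` controls how strongly drugs with high similarity are pushed toward similar latent factors. A cell line similarity term could be added symmetrically on `U`, and dropping it mirrors the step from the MS model to the MSD model described above.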