| Machine learning and deep learning algorithms are widely used in many fields. Applying them in medicine can greatly aid research on real patient data. Improving algorithms that predict a drug's mechanism of action (MoA) allows the mechanism to be identified more accurately, which reduces the workload of doctors and promotes drug research and development. In this thesis, descriptive statistical analysis, such as exploring the correlation between variables, was conducted on gene and cell features of patient samples and on the mechanisms of 206 drugs provided by Harvard University. Feature engineering proceeds in four steps. First, principal component analysis is used to linearly combine the features and extract key uncorrelated features. Second, low-variance features are removed. Third, the clustering results of the K-means algorithm are added as a new feature. Finally, statistics such as the mean, variance, and square are added for feature combination and feature selection. This thesis mainly uses XGBoost, neural networks, and TabNet for multi-label classification. Cross-entropy, AUC, F1 macro, and F1 micro were used as the main metrics to evaluate the models. Among the XGBoost variants, the one based on the Binary Relevance (BR) method combined with stacking achieves the best results. MLSMOTE, a data synthesis method designed specifically for multi-label classification, is used to oversample the data and generate new samples. In the neural network, MLSMOTE is less effective than the approach based on Classifier Chains (CC). In the TabNet model, using MLSMOTE improves the results. The best model is the three-layer neural network based on Classifier Chains (CC). On the test set, its cross-entropy is 0.01480, its AUC is 0.8234, its F1 macro is 0.5781, and its F1 micro is 0.99735. In actual research on drug action mechanisms, a drug is applied to a sample; based on the resulting changes in the sample's gene and cell data, the best algorithm is used to predict which of the 206 drug action mechanisms apply, and targeted drugs can then be developed according to the predicted mechanism. |
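
The four feature-engineering steps named in the abstract can be illustrated with a minimal sketch using scikit-learn. This is not the thesis implementation: the function name `engineer_features`, the number of components, the variance threshold, and the number of clusters are all hypothetical choices, and reading "square" as a row-wise sum of squares is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_selection import VarianceThreshold


def engineer_features(X, n_components=5, var_threshold=0.01, n_clusters=4, seed=0):
    """Augment X with PCA components, a K-means cluster label, and row statistics.

    All default values are illustrative assumptions, not values from the thesis.
    """
    # Step 1: PCA yields uncorrelated linear combinations of the input features.
    pcs = PCA(n_components=n_components, random_state=seed).fit_transform(X)

    # Step 2: drop low-variance features (e.g. near-constant columns).
    kept = VarianceThreshold(threshold=var_threshold).fit_transform(X)

    # Step 3: use the K-means cluster index of each sample as a new feature.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(kept)

    # Step 4: row-wise statistics (mean, variance, sum of squares) per sample.
    stats = np.column_stack(
        [kept.mean(axis=1), kept.var(axis=1), (kept ** 2).sum(axis=1)]
    )

    # Concatenate everything into one feature matrix.
    return np.hstack([kept, pcs, labels[:, None], stats])
```

In a real pipeline each transformer would be fitted on the training split only and then applied to the test split, to avoid leaking test information into the features.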