
Application Of R-based Classifier Model In Multi-Drug Resistance Reversal Research

Posted on: 2020-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Huang
GTID: 2404330599961195
Subject: Probability theory and mathematical statistics
Abstract/Summary:
This thesis builds single classifier, combined classifier, random combined classifier, and support vector classifier models, based on the decision tree and support vector machine methods from data mining, to classify multi-drug resistance data for leukemia cells and to determine whether the multi-drug resistance of those cells has been reversed. The main work covers data reduction, building several classifier models, optimizing model parameters to reduce prediction error, comparing prediction accuracy across models, and selecting the best model.

The cell characteristic data for multi-drug-resistant leukemia cells treated with a reversal agent were first preprocessed, and the decision tree, support vector machine, random forest, and other classification algorithms were then applied to the processed data set and compared. In preprocessing, the dimension of the data was reduced and the sample set refined by removing variables whose values are nearly constant, removing highly correlated independent variables, eliminating outliers, and standardizing the data.

Single decision tree models were built on the preprocessed data set; the prediction accuracy of the C4.5, C5.0, CART, and Rpart decision tree models was compared, and the trees were visualized. Combined classifier models were built with the Bagging and AdaBoost algorithms; their efficiency and prediction accuracy were compared, and weighted voting was used to examine the prediction results. A random combined classifier was built with the random forest classification algorithm, which combines randomly drawn samples with individual decision trees to predict the test sample set. A support vector classifier model was built to find a hyperplane that separates the sample set with maximum margin and to make classification predictions for the sample set.

The results show that, during preprocessing, removing redundant variables reduces the dimension of the sample set and shortens model construction time while having little effect on prediction accuracy. Among all the classifier models, the random forest model performs best, with shorter model construction time and higher prediction accuracy.
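
The preprocessing steps named in the abstract (removing near-constant variables, removing highly correlated variables, and standardizing) can be illustrated with a minimal sketch. This is not the thesis's code: the thesis works in R, while the sketch below uses plain Python, and the function name `preprocess`, the thresholds `var_tol` and `corr_cutoff`, and the toy data are all assumptions chosen for illustration. Outlier elimination is left out because the abstract does not say which rule was used.

```python
import math


def column(rows, j):
    """Return column j of a row-major table."""
    return [row[j] for row in rows]


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Sample variance (n - 1 denominator)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def pearson(xs, ys):
    """Pearson correlation of two non-constant columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def preprocess(rows, var_tol=1e-8, corr_cutoff=0.9):
    """Return (kept column indices, standardized rows).

    var_tol and corr_cutoff are illustrative thresholds, not values
    taken from the thesis.
    """
    n_cols = len(rows[0])

    # Step 1: drop variables whose values are (nearly) constant.
    kept = [j for j in range(n_cols) if variance(column(rows, j)) > var_tol]

    # Step 2: greedily drop a variable if it is highly correlated
    # with one that has already been selected.
    selected = []
    for j in kept:
        cj = column(rows, j)
        if all(abs(pearson(cj, column(rows, k))) < corr_cutoff for k in selected):
            selected.append(j)

    # Step 3: standardize each remaining variable to mean 0, std 1.
    stats = [
        (mean(column(rows, j)), math.sqrt(variance(column(rows, j))))
        for j in selected
    ]
    scaled = [
        [(row[j] - m) / s for j, (m, s) in zip(selected, stats)]
        for row in rows
    ]
    return selected, scaled


# Toy data: column 1 duplicates column 0 (scaled by 2), column 2 is constant.
toy = [
    [1.0, 2.0, 5.0, 0.3],
    [2.0, 4.0, 5.0, 0.1],
    [3.0, 6.0, 5.0, 0.4],
    [4.0, 8.0, 5.0, 0.2],
]
kept_columns, scaled_rows = preprocess(toy)
print(kept_columns)  # [0, 3]
```

On the toy table, the constant column and the duplicated column are dropped, which mirrors the abstract's point that removing redundant variables lowers the dimension of the sample set before any classifier is trained.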
Keywords/Search Tags:Multi-drug resistance, Decision tree, Combinatorial classifier, Random forest classifier, Support vector classifier