With rapid advances in technology, data has become an important production factor across many fields and industries in modern society. Data-driven classification of the data collected for a given problem can process these data effectively and mine their latent information. Traditional classification methods need to assume in advance that the data follow a Gaussian distribution, but real-life datasets often do not satisfy this assumption. This paper uses the Minimax Probability Machine (MPM) to address this problem and proposes improvements that target a deficiency of the method. The original MPM uses all features of a sample to compute the objective function and does not consider how important each feature is to the final classification result; in practice, not all features are strongly related to that result. Therefore, this paper uses three feature weighting methods to measure the importance of each feature in the dataset and to compute a feature weight matrix. The feature weight matrix is then combined with the original MPM to build a feature-weighted minimax probability machine (FWMPM) model for classification and discrimination. To evaluate the proposed method, this paper compares the accuracy of the original MPM, FWMPM based on each of the three feature weighting methods, KNN, Naive Bayes, Logistic Regression, SVM, and Decision Trees on the "Breast Cancer Wisconsin Diagnostic" dataset. The FWMPM based on information gain attains the highest average accuracy, 97.2%. We conclude that the proposed FWMPM achieves good results on specific datasets and is applicable to real-life datasets.
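The feature weighting step described above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: it uses scikit-learn's `mutual_info_classif` as a stand-in estimator of information gain for continuous features, standardizes the features first, and normalizes the weights to sum to 1. The helper name `information_gain_weights` is hypothetical, and the abstract does not specify the exact weighting formula or how the weight matrix enters the MPM objective.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif
from sklearn.preprocessing import StandardScaler


def information_gain_weights(X, y, random_state=0):
    """Hypothetical helper: non-negative feature weights that sum to 1.

    Assumption: mutual information between each feature and the label
    is used as the information-gain score.
    """
    gain = mutual_info_classif(X, y, random_state=random_state)
    return gain / gain.sum()


# Breast Cancer Wisconsin (Diagnostic): 569 samples, 30 features.
data = load_breast_cancer()
X, y = data.data, data.target

# Put all features on a common scale before weighting (an assumption).
X_std = StandardScaler().fit_transform(X)

weights = information_gain_weights(X_std, y)
W = np.diag(weights)       # diagonal feature weight matrix
X_weighted = X_std @ W     # each column scaled by its weight

# Show the five features with the largest weights.
top = np.argsort(weights)[::-1][:5]
for name, w in zip(data.feature_names[top], weights[top]):
    print(f"{name}: {w:.4f}")
```

In an actual evaluation, the weights would be estimated on the training split only and then applied to the test split, so that the reported accuracy is not influenced by test labels. The weighted matrix `X_weighted` is what a downstream classifier such as an MPM would consume in this sketch.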