Feature Weighting Method For Binary Classification In Machine Learning

Posted on:2021-04-20

Degree:Master

Type:Thesis

Country:China

Candidate:T F Wang

Full Text:PDF

GTID:2517306113453484

Subject:Statistics

Abstract/Summary:

In research on classification models in machine learning, most existing work on the treatment of variables has focused on variable selection. As the basis of high-dimensional statistical modeling, variable selection is unquestionably important and necessary when processing large-scale, high-dimensional data. For low-dimensional data, however, where the total number of variables available for analysis is small, variable selection can discard information that is useful for classification and thereby reduce classification accuracy. At the same time, existing binary classification methods in machine learning usually assume that every feature has the same impact on the class variable, and build the model without considering that features may affect the class variable to different degrees; this assumption does not hold in most cases. Motivated by these issues, this paper focuses on feature weighting methods, mainly studying variable weighting in binary classification models; that is, assigning a weight to each feature of the model in order to improve classification accuracy. The main research contents and conclusions of this paper are as follows. First, this paper proposes a variable weighting method based on mutual information and applies it to classic machine learning classification algorithms such as Naive Bayes, decision trees, K-nearest neighbors, and random forests. Second, the performance of each weighted classifier is tested in experiments on the Wisconsin Breast Cancer Dataset and the Blood Transfusion Information Dataset provided by the Blood Transfusion Service Center, from the UCI Machine Learning Repository. The experimental results show that, for binary classification tasks, the weighted machine learning methods proposed in this paper tend to outperform the corresponding traditional methods in terms of classification accuracy. Finally, this paper verifies the effectiveness of the mutual-information-based weighting method for machine learning models. The method has the following advantages: first, it is grounded in information theory, so the resulting weights are reliable; second, it does not negatively affect robust classifiers, so it can be used with multiple classification models; finally, it can improve the classification accuracy of several traditional classifiers, which can play an important role in practical applications.
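The abstract does not give the thesis's exact weighting formula, so the following is only a minimal sketch of the general idea, using K-nearest neighbors as the example classifier. It assumes that each feature's weight is its empirical mutual information with the class label, normalized so the weights sum to one, and that the weights enter the classifier through a weighted Euclidean distance. The function names (`mutual_information`, `mi_weights`, `weighted_knn_predict`) and the toy data are hypothetical and not taken from the thesis.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y), in nats, of two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def mi_weights(rows, labels):
    """Assumed scheme: w_j = I(X_j; Y) / sum_k I(X_k; Y)."""
    d = len(rows[0])
    scores = [mutual_information([r[j] for r in rows], labels) for j in range(d)]
    total = sum(scores)
    if total == 0:
        # No feature is informative: fall back to equal weights.
        return [1.0 / d] * d
    return [s / total for s in scores]


def weighted_knn_predict(rows, labels, query, weights, k=3):
    """Majority vote among the k nearest rows under a weighted Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum(w * (ai - bi) ** 2 for w, ai, bi in zip(weights, a, b)))

    nearest = sorted(range(len(rows)), key=lambda i: dist(rows[i], query))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data: feature 0 determines the label, feature 1 is independent of it.
rows = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 1, 1, 0, 0, 1, 1]

weights = mi_weights(rows, labels)
prediction = weighted_knn_predict(rows, labels, (1, 0), weights, k=3)
```

On this toy data all of the weight goes to the informative feature, so the uninformative one no longer influences the neighbor search. Continuous features, such as those in the two datasets named above, would first need discretization or a continuous mutual information estimator, which this sketch does not cover.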

Keywords/Search Tags:

Feature Weighting Method, Naive Bayes Classifier, Decision Tree, K-Nearest Neighbors, Random Fore

Related items

1	Chinese Text Categorization Method And Implementation
2	The Research On Masters' Employment Will Of A Normal University
3	Improved Naive Bayes Algorithm With Application To Text Classification
4	Research On Offline Chinese Signature Recognition Based On Multi-resolution Feature Fusion
5	Timing Strategy Of Rebar Commodity Futures Based On Decision Tree
6	Application Of Decision Tree Optimization Algorithm In Rapid Analysis Of Near Infrared Spectroscopy
7	Research Of Bayesian Networks Classifier With Continuous Attributes
8	A Study Of Copula-Based Decision Tree With Applications
9	Testing For Multivariate White Noise Under Unknown Dependence Based On Random Weighting Bootstrap Method
10	Analysis Of Tennis Skills And Tactics From The Perspective Of Statistics