Multi-label Feature Selection Based On Mutual Information

Posted on: 2020-11-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Zhang
GTID: 2428330599977447
Subject: Applied Mathematics
Abstract/Summary:
With the rapid development of information technology, the frequent emergence of multi-label data sets has attracted great attention in the research community. In multi-label data, redundant and irrelevant features degrade classification performance, so dimensionality reduction is needed. Feature selection is the main method for reducing the impact of the curse of dimensionality, and it plays an important role in multi-label classification. Mutual information offers a criterion for feature selection that can take the correlation between labels into account. Building on existing research on mutual information, this thesis ranks features by mutual information to improve classification accuracy and establishes multi-label feature selection algorithms based on mutual information. The main research work of this thesis is as follows:

(1) For the problem of multi-label feature selection, a general formula for multi-label feature selection based on mutual information is given. Because computing the conditional mutual information in this formula is highly complex, the conditional mutual information is replaced by an approximate formula. Two sets of fixed values are assigned to the parameters of the general formula, yielding two different multi-label feature selection algorithms based on mutual information. The effectiveness of the two algorithms is analyzed by comparing their classification ability with that of other algorithms on multi-label data sets.

(2) Most current discretization methods map continuous values to a finite set of values. Instead, the joint probability is expressed by calculating the inner product of the fuzzy equivalence partition matrices of the attributes, and a calculation formula for fuzzy mutual information is given. A multi-label feature selection algorithm based on fuzzy mutual information is established and tested on multi-label data sets.

(3) To avoid selecting a feature subset that is only locally optimal rather than globally optimal, the feature selection problem is transformed into a numerical optimization model by simplifying the formula for the mutual information between labels and attributes. Considering the influence of the geometric structure of the feature manifold on the results, a method based on global numerical optimization is established, and quadratic programming is used to solve the problem. The validity of the algorithm is verified on multi-label data sets.
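
As an illustration of the mutual-information ranking described in item (1), the following is a minimal Python sketch, not the thesis's algorithm. The abstract gives neither the general formula nor the two parameter settings, so the scoring rule here is an assumption: a common relevance-minus-redundancy criterion in which a candidate feature is scored by its summed mutual information with all labels, minus a weight beta times its average mutual information with the features already selected. The names mutual_information, select_features, and beta are hypothetical, and the features are assumed to be already discretized.

```python
import numpy as np


def mutual_information(x, y):
    """Empirical mutual information (in nats) of two discrete 1-D arrays."""
    _, xi = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, yi = np.unique(np.asarray(y).ravel(), return_inverse=True)
    # Joint distribution estimated from co-occurrence counts.
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0  # skip empty cells, where p * log(p) is taken as 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def select_features(X, Y, k, beta=1.0):
    """Greedy forward selection of k feature indices.

    X: (n_samples, n_features) array of discrete feature values.
    Y: (n_samples, n_labels) binary label matrix.
    beta: redundancy weight (an assumed parameter, not a value from the thesis).
    """
    n_features, n_labels = X.shape[1], Y.shape[1]
    # Relevance: mutual information with every label, summed over labels.
    relevance = np.array([
        sum(mutual_information(X[:, j], Y[:, l]) for l in range(n_labels))
        for j in range(n_features)
    ])
    redundancy = np.zeros(n_features)  # running sum of I(f_j; f_s) over selected s
    selected, remaining = [], list(range(n_features))
    while remaining and len(selected) < k:
        scores = relevance[remaining]
        if selected:
            scores = scores - beta * redundancy[remaining] / len(selected)
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
        for j in remaining:
            redundancy[j] += mutual_information(X[:, j], X[:, best])
    return selected
```

Calling select_features(X, Y, k) returns the feature indices in the order they were chosen, which also serves as a ranking of those features. Setting beta to 0 reduces the criterion to ranking by relevance alone; larger values penalize features that duplicate information already covered by the selected set.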
Keywords/Search Tags: multi-label, feature selection, entropy, mutual information, fuzzy mutual information