Research On Multi-label Feature Selection Based On Fuzzy Mutual Information

Posted on: 2022-12-05
Degree: Master
Type: Thesis
Country: China
Candidate: H B Zhong
Full Text: PDF
GTID: 2518306761459654
Subject: FINANCE
Abstract/Summary:
Due to the rapid growth of high-dimensional multi-label data in Internet applications, single-label feature selection algorithms can no longer meet the needs of daily life and scientific research. The traditional way of processing multi-label datasets is to convert the multi-label data into single-label data and then apply a single-label feature selection algorithm directly, which ignores the correlation between labels in the label space. The algorithm adaptation method, an alternative multi-label processing scheme, solves this problem by adapting traditional single-label feature selection algorithms so that they operate directly on multi-label datasets. A multi-label feature selection algorithm based on algorithm adaptation selects, from the original high-dimensional features, the minimum feature subset that discriminates the spatial distribution of labels. It thus provides a dimensionality-reduced dataset for multi-label learning, reducing computational cost and improving learning ability.

The existing multi-label feature selection algorithms based on fuzzy mutual information integrate knowledge from different fields, but some problems remain in the feature selection process:

1. Traditional multi-label feature selection algorithms focus on measuring the correlation between the feature set and the label set, and perform feature selection by scoring the candidate features. However, during feature selection, the already selected feature set still contains discrimination information. If this information is not taken into account, the most discriminative features will be overlooked, which affects the results of subsequent feature selection.
2. Multi-criteria decision-making is a decision-making method that selects among conflicting or incommensurable alternatives. The method requires a set of criteria, each of which affects the result to a different degree. This is similar to feature selection, in which the best of all features under the label criteria is selected. As a measure of correlation, fuzzy mutual information can solve the correlation problem in multi-criteria decision-making.

Based on fuzzy rough sets, this thesis takes fuzzy mutual information as its core. The main work can be summarized as follows:

1. The basic information theory on fuzzy rough sets is extended with concepts such as fuzzy conditional mutual information and fuzzy interaction mutual information.
2. Feature redundancy is refined: label-independent redundancy and label-dependent redundancy are proposed, and the rationality of this division is proved subsequently. On the basis of the new redundancy division, a weight formula based on feature redundancy is proposed for the selected feature set, so that the entire selected feature set can be considered and a fuzzy equivalence matrix can be constructed. A new fuzzy mutual information measure is adopted to measure the correlation between features, and a proof of its rationality is given.
3. A multi-label feature selection algorithm named FRFS, based on the redundancy of the selected feature set, is proposed. First, a new multi-label feature selection scoring function is proposed, which jointly considers feature correlation and feature redundancy and aims to select the most discriminative features. Second, the results of comparative experiments show that FRFS performs better than the comparison algorithms. Finally, statistical tests are performed to analyze the differences between FRFS and the other comparison algorithms.
4. A multi-label feature selection algorithm named FMFS, based on fuzzy mutual information and multi-criteria decision-making, is proposed. First, the multi-label feature selection problem is modeled as a multi-criteria decision problem, in which the features are viewed as alternatives and the labels are treated as different criteria. Second, fuzzy equivalence matrices are established for the feature space and the label space, and the decision matrix and the weight vector are established based on fuzzy mutual information. The TOPSIS method is then applied to the decision matrix and weight vector, and the sorted relative relevance vector (RC for short) gives the feature selection sequence. Finally, the results of comparative experiments show that FMFS performs better than the other five comparison algorithms.
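The FMFS pipeline described in the abstract (fuzzy relations, a feature-by-label fuzzy mutual information decision matrix, then a TOPSIS ranking) can be illustrated with a minimal sketch. It is not the thesis implementation, and several details are assumptions because the abstract does not give them: the similarity relation 1 - |x_i - x_j| on min-max scaled values, the min operator for the joint relation, the cardinality-based fuzzy entropy, equal label weights by default, and the hypothetical function names. The redundancy weighting used by FRFS is not covered.

```python
import numpy as np

def fuzzy_relation(x):
    """Assumed fuzzy similarity relation: r_ij = 1 - |x_i - x_j| on min-max scaled values."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    x = (x - x.min()) / span if span > 0 else np.zeros_like(x)
    return 1.0 - np.abs(x[:, None] - x[None, :])

def fuzzy_entropy(r):
    """Fuzzy entropy from row cardinalities: -(1/n) * sum_i log2(|[x_i]_R| / n)."""
    n = r.shape[0]
    return -np.mean(np.log2(r.sum(axis=1) / n))

def fuzzy_mutual_information(ra, rb):
    """FMI(A; B) = H(A) + H(B) - H(A, B), with the joint relation taken as min(ra, rb)."""
    joint = np.minimum(ra, rb)
    return fuzzy_entropy(ra) + fuzzy_entropy(rb) - fuzzy_entropy(joint)

def topsis_scores(decision, weights):
    """Standard TOPSIS closeness score per alternative (row); all criteria are benefit criteria."""
    decision = np.asarray(decision, dtype=float)
    weights = np.asarray(weights, dtype=float)
    norms = np.linalg.norm(decision, axis=0)
    norms[norms == 0] = 1.0                      # avoid division by zero for constant-zero columns
    v = decision / norms * (weights / weights.sum())
    ideal, anti = v.max(axis=0), v.min(axis=0)   # best and worst value per criterion
    d_pos = np.linalg.norm(v - ideal, axis=1)
    d_neg = np.linalg.norm(v - anti, axis=1)
    denom = d_pos + d_neg
    return np.divide(d_neg, denom, out=np.zeros_like(d_neg), where=denom > 0)

def rank_features(X, Y, weights=None):
    """Rank features (alternatives) against labels (criteria); equal weights are an assumption."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    feat_rel = [fuzzy_relation(X[:, j]) for j in range(X.shape[1])]
    lab_rel = [fuzzy_relation(Y[:, k]) for k in range(Y.shape[1])]
    decision = np.array([[fuzzy_mutual_information(rf, rl) for rl in lab_rel]
                         for rf in feat_rel])
    if weights is None:
        weights = np.ones(Y.shape[1])
    scores = topsis_scores(decision, weights)
    return np.argsort(-scores), scores           # feature indices, best first

# Toy data: feature 0 copies the label, feature 1 is only loosely related to it.
X = [[0, 0.3], [0, 0.9], [1, 0.1], [1, 0.5]]
Y = [[0], [0], [1], [1]]
order, scores = rank_features(X, Y)
print(order, scores)
```

In this sketch each feature is one TOPSIS alternative and each label is one criterion, matching the modeling described in the abstract. The score computed here is the usual TOPSIS relative closeness to the ideal solution, used as a stand-in for the RC vector.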
Keywords/Search Tags: Selected feature set, Fuzzy mutual information, Multi-label feature selection, Multi-criteria decision-making