
Multi-Label Feature Selection Algorithms Based On Fisher Score

Posted on: 2022-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: Z K Wang
Full Text: PDF
GTID: 2518306485450224
Subject: Computer application technology
Abstract/Summary:
In multi-label learning, describing the rich semantics of the labels requires a large number of features. At the same time, the number of label subsets that can be formed by freely combining labels grows exponentially with the number of labels, so with a limited number of samples only a few label subsets are actually described by any sample, and each of those subsets is covered by few samples on average. These effects give multi-label learning two characteristics, high feature dimensionality and imbalanced learning, so most traditional single-label feature selection algorithms cannot be applied directly to multi-label learning tasks. This thesis takes the classical single-label Fisher Score feature selection algorithm as its research object and addresses its limitations in multi-label learning. The main research contents are as follows:

1) Multi-label feature selection algorithm based on Fisher Score (MLFS). Because samples cannot be assigned to classes directly in multi-label learning, the multi-label problem is decomposed into multiple single-label problems by means of random samples, the correlation between labels is measured by their cosine similarity, and the formula for computing a feature's Fisher Score is updated for the multi-label setting. Experimental results show that the MLFS algorithm is effective.

2) Multi-label Fisher Score feature selection algorithm based on label similarity (LSMLFS). To address the loss of label correlation information that occurs when MLFS decomposes the label space, a strategy for measuring the similarity between label sets is proposed. The similarity between the labels of a random sample and the labels of each sample in the sample space is computed, and the similarity results are partitioned at equal intervals, which transforms multi-label learning into single-label learning. Experimental results show that the LSMLFS algorithm is effective.

3) Fast multi-label Fisher Score feature selection algorithm based on text classification (TCMLFS). Because the performance of the Fisher Score algorithm is easily affected by extreme samples within a class, a class-center shift strategy is proposed. The labels are decomposed using random samples, and the sample space is divided into classes under each label. Within each class, the largest distance from a sample to the class center, combined with a distance coefficient, is set as the distance threshold. Samples whose distance to the class center exceeds the threshold are removed, and the single-label Fisher Score feature selection algorithm is run on the updated sample space. Experimental results show that TCMLFS further improves on the performance of MLFS and LSMLFS. The scope of application of TCMLFS is also discussed: the algorithm performs well on text data sets but poorly on non-text data sets.
Keywords/Search Tags: Fisher Score, multi-label learning, feature selection, inter-class divergence, label similarity
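
The abstract names three building blocks without giving formulas: the classical single-label Fisher Score, cosine similarity between labels, and removal of samples that lie far from their class center. The following is a minimal sketch of those generic building blocks only, not the thesis's MLFS, LSMLFS, or TCMLFS algorithms. The function names, the default `coeff` value, and the reading of "combined with a distance coefficient" as a multiplication are all assumptions made for illustration.

```python
import numpy as np


def fisher_score(X, y):
    """Classical single-label Fisher Score for each feature j:
    sum_k n_k (mu_kj - mu_j)^2 / sum_k n_k var_kj."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        n_c = Xc.shape[0]
        between += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        within += n_c * Xc.var(axis=0)  # population variance per feature
    # Small epsilon avoids division by zero for constant features.
    return between / (within + 1e-12)


def label_cosine_similarity(Y):
    """Cosine similarity between the columns of a binary label matrix Y
    (n_samples x n_labels). Labels that never occur get similarity 0."""
    Y = np.asarray(Y, dtype=float)
    norms = np.linalg.norm(Y, axis=0)
    norms[norms == 0] = 1.0
    Yn = Y / norms
    return Yn.T @ Yn


def trim_far_samples(X, y, coeff=0.8):
    """Assumed reading of the trimming step: within each class, drop samples
    whose distance to the class mean exceeds coeff * (largest distance)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    keep = np.zeros(len(y), dtype=bool)
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        center = X[idx].mean(axis=0)
        dist = np.linalg.norm(X[idx] - center, axis=1)
        keep[idx] = dist <= coeff * dist.max()
    return X[keep], y[keep]
```

Under these assumptions, calling `trim_far_samples` and then `fisher_score` on the returned arrays mirrors the "remove extreme samples, then score on the updated sample space" order described in item 3. Note that with `coeff` below 1 the farthest sample in every class is always removed, so very small classes can be emptied.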