
Research On Feature Selection Method Based On Multi-label Learning Theories

Posted on: 2022-11-18    Degree: Doctor    Type: Dissertation
Country: China    Candidate: J C Hu    Full Text: PDF
GTID: 1488306758979239    Subject: Computer system architecture
Abstract/Summary:
With the popularization of information and automation equipment and the growth of computer storage capacity, more and more high-dimensional feature data can be saved. While these high-dimensional data provide massive amounts of information for applications across many domains, they also give rise to the curse of dimensionality. High-dimensional multi-label data, in particular, is a current research hotspot in machine learning. In multi-label data, a single sample is associated with multiple semantic labels simultaneously; such data arises widely in text, music, gene analysis, and other fields. This dissertation uses feature selection to reduce the dimensionality of high-dimensional multi-label data for classification and modeling. Feature selection not only extracts key features from high-dimensional data for classifier modeling, but also helps researchers understand the model better and improves execution efficiency. Consequently, a large number of multi-label feature selection techniques have been proposed.

Feature selection algorithms generally fall into three categories: wrapper, filter, and embedded methods. Wrapper algorithms use classifier accuracy as the metric for evaluating candidate feature subsets, iterating until the optimal subset is selected. Filter algorithms keep the feature selection process independent of the subsequent classifier, and often design the objective function by combining information theory with other measurement methods. Embedded algorithms integrate feature selection into the learning objective itself, which improves execution efficiency.

Within the supervised learning setting, this dissertation studies both embedded and filter methods. Our work comprehensively analyzes the pros and cons of existing sparse algorithms and proposes two new multi-label feature selection algorithms: robust multi-label feature selection based on dual-graph regularization (DRMFS) and dynamic subspace dual-graph regularized multi-label feature selection (DSMFS). Then, addressing the limitations of filter feature selection algorithms based on mutual information, an optimization method is proposed and a new feature selection method is designed: feature redundancy maximization (FRM).

The main contributions of this dissertation can be summarized in the following three aspects:

1. We propose a multi-label feature selection algorithm based on dual-graph regularization (robust multi-label feature selection with dual-graph regularization, DRMFS). DRMFS has only one unknown variable; accordingly, an optimized multiplicative gradient descent algorithm is derived from the objective function, and the global optimal solution can be obtained. The convergence of the gradient descent algorithm is also proved.

2. Further, to address the limitation that the existing algorithm (DRMFS) fixes the Laplacian matrix, this dissertation proposes a multi-label feature selection algorithm based on dynamic subspaces (dynamic subspace dual-graph regularized multi-label feature selection, DSMFS). The algorithm constructs a dynamic Laplacian matrix and is compared with 7 state-of-the-art algorithms on 12 data sets; the experimental results demonstrate the superiority of DSMFS.

3. A multi-label feature selection algorithm based on the feature redundancy of interaction information is proposed (feature redundancy maximization, FRM). The algorithm combines the cumulative summation of conditional mutual information with the 'maximum of the minimum' criterion, overcoming the limitation of traditional information-theoretic filter feature selection algorithms. Compared with 6 algorithms on 14 data sets, the experiments demonstrate the superiority of FRM.
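To make the filter paradigm concrete, the following is a minimal sketch (not any of the dissertation's algorithms) of the classic information-theoretic filter approach: each feature is scored by its empirical mutual information with the label, independently of any classifier, and the top-k features are kept. The function names `mutual_information` and `filter_select` are illustrative choices, and the sketch assumes discrete (categorical) features and a single label.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)               # marginal counts of X
    py = Counter(ys)               # marginal counts of Y
    pxy = Counter(zip(xs, ys))     # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log( p(x,y) / (p(x) p(y)) ), in nats
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi

def filter_select(X, y, k):
    """Filter-style selection: rank features by I(feature; label), keep top k.
    X is a list of samples, each a list of discrete feature values."""
    d = len(X[0])
    scores = [mutual_information([row[j] for row in X], y) for j in range(d)]
    return sorted(range(d), key=lambda j: scores[j], reverse=True)[:k]
```

Because the score depends only on the data distribution, the ranking is computed once and reused with any downstream classifier, which is exactly the independence property the filter category is defined by.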
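The 'maximum of the minimum' criterion from contribution 3 can be illustrated with a CMIM-style greedy sketch: each round picks the candidate feature whose worst-case conditional relevance, min over already-selected features s of I(f; y | s), is largest. This is an assumed simplification for discrete single-label data, intended only to show the criterion's shape, and is not the FRM algorithm itself; the helper names are hypothetical.

```python
import math
from collections import Counter

def cond_mutual_information(xs, ys, zs):
    """Empirical conditional mutual information I(X;Y|Z) for discrete sequences."""
    n = len(xs)
    pz = Counter(zs)
    pxz = Counter(zip(xs, zs))
    pyz = Counter(zip(ys, zs))
    pxyz = Counter(zip(xs, ys, zs))
    cmi = 0.0
    for (x, y, z), c in pxyz.items():
        # p(x,y,z) * log( p(z) p(x,y,z) / (p(x,z) p(y,z)) ), in nats
        cmi += (c / n) * math.log(c * pz[z] / (pxz[(x, z)] * pyz[(y, z)]))
    return cmi

def max_of_min_select(X, y, k):
    """Greedy 'maximum of the minimum' selection (CMIM-style sketch):
    each round adds the candidate j maximizing min_{s in S} I(f_j; y | f_s)."""
    d = len(X[0])
    n = len(y)
    cols = [[row[j] for row in X] for j in range(d)]
    # seed with the single most relevant feature; a constant Z reduces CMI to plain MI
    selected = [max(range(d),
                    key=lambda j: cond_mutual_information(cols[j], y, [0] * n))]
    while len(selected) < k:
        rest = [j for j in range(d) if j not in selected]
        best = max(rest, key=lambda j: min(
            cond_mutual_information(cols[j], y, cols[s]) for s in selected))
        selected.append(best)
    return selected
```

Conditioning on each selected feature is what penalizes redundancy: an exact copy of an already-selected feature scores zero, because it carries no information about the label beyond what that feature already provides.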
Keywords/Search Tags:Multi-label learning, Feature selection, Machine learning, Graph Laplacian matrix, Graph regularization, Non-negative matrix factorization, Information theory