
Research On Feature Selection Based On Matrix Factorization And Dictionary Learning

Posted on: 2021-09-25
Degree: Master
Type: Thesis
Country: China
Candidate: J Z Song
Full Text: PDF
GTID: 2518306050970839
Subject: Master of Engineering
Abstract/Summary:
With the advent of the information era, information technology is advancing by leaps and bounds, producing massive amounts of data. Not only the volume of data but also its dimensionality has grown. Clearly, the larger the data volume and the higher the dimensionality, the more redundancy and noise obscure the useful information. In fact, for high-dimensional data only a small number of features may carry real value, and most of the remaining features can be ignored. For this reason, it is necessary to simplify and denoise high-dimensional data before use, extracting the small subset of valuable features. This process is usually called dimensionality reduction.

Feature selection is an important branch of machine learning. As a dimensionality-reduction technique with clear physical interpretability, it has been widely studied and adopted. Its purpose is to select representative features from the original data according to some criterion. Feature selection is needed in many fields, such as data mining, machine learning, and pattern recognition. Many novel feature selection algorithms have been proposed in recent years; however, many of them do not fully retain the information in the data, so their selection accuracy is limited. Motivated by these problems, this thesis studies feature selection algorithms from the perspective of the geometric structure of the data and of data reconstruction. The main research contents are as follows:

1) A new algorithm, double feature selection based on low-rank sparse non-negative matrix factorization (NMF-LRSR), is proposed. First, to reduce dimensionality effectively, NMF-LRSR uses non-negative matrix factorization as its framework to further reduce the dimension of feature selection, which is itself a dimensionality-reduction problem. Second, a low-rank sparse representation with self-representation is used to construct the graph, so that both the global and the intrinsic geometric structure of the data are taken into account during feature selection; this makes full use of the available information and improves selection accuracy. In addition, the double feature selection strategy is applied, which makes the selection result more accurate.

2) A novel algorithm, double-dictionary-learning unsupervised feature selection with low-rank and sparsity constraints (LRSDFS), is proposed. First, LRSDFS improves traditional dictionary learning by reconstructing the original dataset into two dictionaries simultaneously. Second, the low-rank and sparsity constraints are applied to the two dictionaries, so that the reconstructed dictionaries retain both the global and the local information of the original data. Finally, the global and local information are weighted to perform feature selection on the dataset, making the selected features more reasonable and more interpretable.

3) A feature selection algorithm, local-structure-preserving unsupervised feature selection based on low-rank dictionary learning (LSPLDFS), is proposed. First, LSPLDFS reconstructs the original data with a dictionary learning model and adds a low-rank constraint to dictionary learning, obtaining a low-rank dictionary that retains the global information of the original data. Second, the algorithm introduces graph regularization: it constructs a graph on the feature manifold that describes the local manifold structure of the feature space, and it preserves this local structure through spectral embedding. Finally, a sparsity constraint is applied to the feature selection matrix so that the discriminative information is used more accurately.
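To make the NMF-based framework concrete, the following is a minimal sketch of feature scoring via non-negative matrix factorization, not the thesis's actual NMF-LRSR formulation (it omits the low-rank sparse graph and the double-selection step): the data matrix X is factorized as X ≈ WH with the standard multiplicative updates, and each feature is scored by the l2 norm of its corresponding column of H. The function name, parameters, and scoring rule here are illustrative assumptions.

```python
import numpy as np

def nmf_feature_select(X, n_components=3, n_select=2, n_iter=300, seed=0):
    """Rank the features (columns) of a non-negative matrix X (samples x features)
    by factorizing X ~= W @ H and scoring feature j by ||H[:, j]||_2.
    Returns the indices of the top-scoring features."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.random((n, n_components)) + 1e-6
    H = rng.random((n_components, d)) + 1e-6
    eps = 1e-10  # avoids division by zero in the updates
    for _ in range(n_iter):
        # standard Lee-Seung multiplicative updates for the Frobenius objective
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    scores = np.linalg.norm(H, axis=0)          # one score per feature
    return np.argsort(scores)[::-1][:n_select]  # highest-scoring features first
```

For example, on a dataset where two columns dominate in magnitude, those columns receive the largest H-column norms and are selected first. NMF-LRSR additionally constrains the factorization with the low-rank sparse self-representation graph, which this sketch does not model.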
Keywords/Search Tags: Non-negative matrix factorization, low-rank sparse representation, self-representation, dictionary learning, low-rank constraint, sparse constraint, local structure preservation, feature selection