
The Research On Feature Selection Algorithm

Posted on: 2020-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Zhu
Full Text: PDF
GTID: 2428330578460831
Subject: Information processing and communication network system

Abstract/Summary:
Both scientific research and industrial applications are closely tied to data of many kinds. As precision requirements grow, high-dimensional data have become increasingly common. However, high dimensionality significantly increases storage and computation costs, and the redundancy it carries degrades the performance of machine learning models, leading to inaccurate acquired knowledge. The underlying recognition model is usually represented by only a subset of the features, which means that high-dimensional data contain many redundant features while the features actually used for decision making account for only a small fraction. It is therefore necessary to appropriately preprocess high-dimensional data before data mining or knowledge discovery, i.e., to reduce the dimensionality of the data and avoid the influence of redundancy, so as to improve the accuracy of data-processing techniques such as classification, regression, and clustering. Dimensionality reduction has thus played an extremely important role as an indispensable preprocessing step in machine learning, pattern recognition, industrial applications, and scientific research.

Subspace learning and feature selection are the two main approaches to dimensionality reduction: subspace learning yields a robust projection model, while feature selection yields an interpretable selection model. This thesis embeds subspace regularization terms into a sparse feature selection framework to propose novel feature selection algorithms that are both robust and interpretable. Specifically:

1. An unsupervised feature selection algorithm based on feature-level self-representation and two sparse penalty regularization terms. Self-representation is used to reconstruct the data, and l2,1-norm and l1-norm regularization then impose a sparse penalty on the reconstruction coefficient matrix, doubly filtering redundant features and ensuring that all preserved features are important.

2. An unsupervised feature selection algorithm based on feature-level self-representation with an embedded Principal Component Analysis (PCA) regularization term. A classical subspace learning method (PCA) is embedded into a sparse feature selection framework, so that the proposed model preserves the principal information of the data while learning the importance of each feature; in this way, redundant features can be removed.

3. A feature selection algorithm based on adaptive structure learning and a low-rank constraint. The low-rank constraint captures the inherent global structure of the original data, which generally contain noise, while adaptive structure learning captures the intrinsic local structure, providing comprehensive structural information for model training and improving the accuracy of feature selection.

Benchmark data sets are used to evaluate the performance of the proposed methods, and the results show that they outperform state-of-the-art comparison methods under a variety of evaluation criteria.
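The first of these models can be illustrated with a small sketch. The code below is not from the thesis; it is a minimal NumPy illustration assuming the common self-representation objective min_W ||X − XW||_F^2 + α||W||_{2,1} + β||W||_1, solved here by proximal gradient descent (the solver choice, parameter names, and default values are all assumptions). Features are ranked by the l2-norm of the corresponding row of the learned coefficient matrix W, so rows driven to zero by the two sparsity penalties correspond to filtered-out redundant features.

```python
import numpy as np

def select_features_self_rep(X, alpha=0.1, beta=0.01, n_iter=200):
    """Rank features via feature-level self-representation.

    Minimises  ||X - X W||_F^2 + alpha*||W||_{2,1} + beta*||W||_1
    by proximal gradient descent, then scores each feature by the
    l2-norm of its row of W (larger norm = more important).
    """
    n, d = X.shape
    W = np.zeros((d, d))
    # Step size 1/L, where L is the Lipschitz constant of the
    # gradient of the smooth reconstruction term.
    L = 2.0 * np.linalg.norm(X.T @ X, 2)
    step = 1.0 / L
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ W - X)
        V = W - step * grad
        # Prox of the l1 term: elementwise soft-thresholding.
        V = np.sign(V) * np.maximum(np.abs(V) - step * beta, 0.0)
        # Prox of the l2,1 term: row-wise group soft-thresholding.
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        scale = np.maximum(1.0 - step * alpha / np.maximum(norms, 1e-12), 0.0)
        W = scale * V
    scores = np.linalg.norm(W, axis=1)          # row norms of W
    ranking = np.argsort(scores)[::-1]          # best feature first
    return ranking, scores
```

Applying the two proximal operators in sequence (elementwise, then row-wise) is the standard closed-form prox for this sparse-group penalty; the "double filtering" described above corresponds to the two thresholding steps, one removing individual coefficients and one removing whole feature rows.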
Keywords/Search Tags: feature selection, subspace learning, dimensionality reduction, sparse learning