
Low-Rank Feature Selection Algorithm Based On Sparse Learning And Hypergraph

Posted on: 2018-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: G Luo
Full Text: PDF
GTID: 2348330518456558
Subject: Computer Science and Technology
Abstract/Summary:
Data mining is the process of discovering and extracting useful rules from large volumes of complex data, forming useful patterns and producing value. With the rise of the concept of "big data" and the rapid development of modern science and technology in recent years, we are entering an era of extremely rich data resources. Data mining technology is therefore becoming ever more important, and it plays a major role in many fields, such as industrial development, health care, and the information industry.

As data dimensionality increases, the associated problems grow worse: redundancy between features inflates the storage space required, and processing high-dimensional data greatly increases the time and space complexity of any analysis. In general, high-dimensional data should not be used directly in practical applications, so how to use such data effectively and efficiently during preprocessing is a major challenge. High-dimensional data is not unstructured, however, and dimensionality reduction can shrink it to a manageable size.

Two approaches to dimensionality reduction are common: subspace learning and feature selection. Subspace learning projects high-dimensional data into a low-dimensional space while preserving the correlations among the data. Feature selection ranks each feature by a preset criterion and then selects an important subset that represents the original features: it chooses a small set of important and representative features as a new feature set while retaining the structure of the original high-dimensional data, and it can even improve classification accuracy. Feature selection has therefore become an important topic in machine learning and is widely used in pattern recognition and other fields. Two especially familiar feature selection methods are sparse logistic regression and the t-test.

Recently, some researchers have applied low-rank regression models to feature selection. The low-rank regression model is a novel and meaningful subspace clustering method; it is widely used in machine learning, computer vision, and other fields, and achieves satisfactory results. However, when confronted with large, high-dimensional input data, the traditional regression model performs poorly, and applying the low-rank regression method directly in real applications likewise gives weak performance. Moreover, the linear regression model does not consider the correlations between different responses; its typical representative, least-squares regression, produces a response for each predicted datum independently.

This paper therefore proposes a novel feature selection algorithm for data mining applications that combines sparse learning, a hypergraph, and a low-rank constraint, in order to handle original data containing missing samples, abnormal samples, noise, and high dimensionality. We first apply a low-rank feature selection model directly within the framework of the linear regression model; this model combines two approaches, low-rank representation and sparse representation. Then, so that the features selected by the model reliably preserve the local information of the data, we embed a Laplacian matrix based on the hypergraph in the model to capture deeper relationships between the features. To guarantee that the chosen features are more representative, and to fine-tune the result of the model, we also embed a classical subspace learning method, the LDA algorithm.

Finally, a new optimization method is explored: the low-rank feature selection step and the subspace learning step are applied to the objective function in sequence, and this process is iterated so that the results approach the optimum, finally obtaining the global optimal solution. The SLH algorithm proposed in this paper combines the advantages of sparse learning, the low-rank hypergraph, and subspace learning for regression analysis and classification, and achieves better results in both regression and classification experiments.
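The interaction between the sparse-learning term and the hypergraph regularizer can be illustrated with a minimal sketch. The example below is only an illustration under simplifying assumptions, not the thesis's exact SLH formulation: it combines an l2,1-sparsity term with a hypergraph-Laplacian penalty in a least-squares regression and solves it with the standard iteratively reweighted scheme, while omitting the low-rank constraint and the embedded LDA step. The function names, the incidence matrix `H`, and the trade-off parameters `lam1` and `lam2` are all illustrative choices.

```python
import numpy as np

def hypergraph_laplacian(H, w=None):
    """Normalized hypergraph Laplacian (Zhou et al.'s construction):
        L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2},
    where H is the (n_vertices x n_edges) incidence matrix, W the diagonal
    matrix of hyperedge weights, Dv/De the vertex/edge degree matrices.
    Assumes every vertex lies in at least one hyperedge and no edge is empty."""
    n, m = H.shape
    w = np.ones(m) if w is None else np.asarray(w, float)
    d_v = H @ w                         # vertex degrees
    d_e = H.sum(axis=0)                 # hyperedge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(d_v))
    theta = Dv_inv_sqrt @ H @ np.diag(w / d_e) @ H.T @ Dv_inv_sqrt
    return np.eye(n) - theta

def sparse_hypergraph_feature_ranking(X, Y, H, lam1=0.1, lam2=0.1, n_iter=50):
    """Rank features with an l2,1-sparse regression model regularized by a
    hypergraph Laplacian over the samples:
        min_W ||XW - Y||_F^2 + lam1 * ||W||_{2,1} + lam2 * tr(W^T X^T L X W),
    solved by iteratively reweighted least squares: the l2,1 norm is replaced
    by tr(W^T D W) with D rebuilt from the current row norms of W."""
    n, d = X.shape
    L = hypergraph_laplacian(H)
    A = X.T @ X + lam2 * X.T @ L @ X    # fixed part of the normal equations
    D = np.eye(d)                       # reweighting matrix for the l2,1 term
    W = np.zeros((d, Y.shape[1]))
    for _ in range(n_iter):
        W = np.linalg.solve(A + lam1 * D, X.T @ Y)
        row_norms = np.maximum(np.linalg.norm(W, axis=1), 1e-8)
        D = np.diag(1.0 / (2.0 * row_norms))
    # features whose coefficient rows have the largest norms come first
    return np.argsort(-np.linalg.norm(W, axis=1))
```

In this simplified form, rows of `W` driven toward zero by the l2,1 term mark discarded features, while the Laplacian term encourages predictions that vary smoothly over samples joined by a common hyperedge; the full SLH model additionally constrains the rank of the coefficient matrix and alternates with an LDA-style subspace learning step.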
Keywords/Search Tags: feature selection, regression analysis, classification, hypergraph, sparse learning, data mining, low-rank constraint, subspace learning