
Research On Feature Selection Algorithms Based On Local Learning

Posted on: 2019-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: S An
Full Text: PDF
GTID: 2428330566987755
Subject: Computer Science and Technology
Abstract/Summary:
We are now in the era of big data, where huge amounts of high-dimensional data are ubiquitous across a variety of domains. When machine learning algorithms are applied to high-dimensional data, a critical issue arises, known as the curse of dimensionality. Feature selection is one of the most powerful tools for addressing this issue. Among the many available feature selection approaches, those based on local learning have received particular attention because of their low computational cost and high accuracy in analyzing high-dimensional data. Although feature selection approaches based on local learning have achieved good dimensionality reduction in practical applications, two problems remain difficult to solve: (1) existing supervised feature selection approaches cannot simultaneously satisfy "selecting correct nearest neighbors" and "defining the cost function directly related to the classifier"; (2) existing unsupervised feature selection approaches cannot simultaneously satisfy "preserving reliable locality information" and "achieving excellent cluster separation". To address these two problems, this thesis designs and implements two feature selection models on the basis of large margin theory and spectral graph theory.

First, a new supervised feature selection model based on local nearest neighbors is designed to address the first problem. The proposed model is based on locally minimizing the within-class distances and maximizing the between-class distances. The classic kNN rule is embedded into the model to optimize the margins between every query instance and the other instances in its neighborhood. Moreover, a feature weight vector is defined and constructed by minimizing a cost function with a regularization term. The cost function employs a probability model and an ℓ1 regularization term to select the true nearest neighbors and remove irrelevant features. Experimental results on various data sets validate the effectiveness and efficiency of the new model.

Second, a new unsupervised feature selection model based on joint clustering analysis is designed to address the second problem. In this model, spectral clustering and orthogonal basis clustering are integrated into a robust joint clustering analysis. Concretely, an adaptive process with probabilistic neighbors is introduced to preserve reliable locality information in spectral clustering, and an orthogonal basis matrix is incorporated to achieve excellent cluster separation in orthogonal basis clustering. To find discriminative features, the proposed model performs the robust joint clustering analysis and the ℓ2,1-regularized feature selection simultaneously. Comprehensive experiments demonstrate that the proposed feature selection model achieves good selection results under various evaluation metrics.
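
The abstract does not give the cost function of the supervised model, so the following is only a minimal sketch of the general idea it describes: a nonnegative feature weight vector is learned by minimizing a margin-based cost with an ℓ1 penalty, while a probability model decides which instances count as nearest neighbors. Everything specific here is an assumption made for illustration and not taken from the thesis: the function names `neighbor_probs` and `local_margin_weights`, the softmax neighbor probabilities, the logistic loss, the weighted L1 distance, and the parameters `sigma`, `lam`, `lr`, and `n_iter`.

```python
import numpy as np


def neighbor_probs(dist, mask, sigma):
    """Softmax over -dist/sigma, restricted to the candidates allowed by mask."""
    logits = np.where(mask, -dist / sigma, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def local_margin_weights(X, y, sigma=1.0, lam=0.1, lr=0.1, n_iter=300):
    """Hypothetical helper: learn nonnegative feature weights from local margins.

    Assumes at least two classes and at least two instances per class.
    """
    n, d = X.shape
    diff = np.abs(X[:, None, :] - X[None, :, :])  # (n, n, d) per-feature distances
    same = y[:, None] == y[None, :]
    hit_mask = same & ~np.eye(n, dtype=bool)      # same class, excluding self
    miss_mask = ~same                             # different class
    w = np.ones(d)
    for _ in range(n_iter):
        dist = diff @ w                           # weighted L1 distances, (n, n)
        # Neighbor probabilities are recomputed each iteration and then held
        # fixed while the weights are updated (assumed alternating scheme).
        p_hit = neighbor_probs(dist, hit_mask, sigma)
        p_miss = neighbor_probs(dist, miss_mask, sigma)
        # Expected per-feature margin: far from other classes, close to own class.
        z = np.einsum("ij,ijd->id", p_miss - p_hit, diff)
        margin = z @ w
        # Logistic loss log(1 + exp(-margin)); its slope is sigmoid(-margin).
        slope = np.exp(-np.logaddexp(0.0, margin))
        grad = -(slope[:, None] * z).mean(axis=0)
        # Projected step: on w >= 0 the l1 penalty has constant gradient lam.
        w = np.maximum(w - lr * (grad + lam), 0.0)
    return w


# Toy data (made up): feature 0 separates the classes, features 1-5 are noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
informative = np.where(y == 0, -3.0, 3.0) + 0.5 * rng.standard_normal(60)
noise = rng.standard_normal((60, 5))
X = np.column_stack([informative, noise])
weights = local_margin_weights(X, y)
print(np.round(weights, 3))
```

On this toy data the weight of the informative feature should end up above the weights of the noise features, which the ℓ1 penalty shrinks toward zero. Features can then be ranked or pruned by their learned weights.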
Keywords/Search Tags:feature selection, large margin theory, spectral graph theory, spectral clustering, orthogonal basis clustering