
Unsupervised Multi-view Feature Selection Based On Adaptive Similarity

Posted on: 2022-09-10    Degree: Master    Type: Thesis
Country: China    Candidate: X Y Liu    Full Text: PDF
GTID: 2518306542479464    Subject: Data Science and Technology
Abstract/Summary:
In recent years, with the development of big-data technologies, both the dimensionality of data and the computational cost of processing it have grown rapidly. Feature selection is one way to address this problem. According to the source of the data, feature selection can be divided into single-view and multi-view feature selection; according to whether labels are used, it can be divided into supervised, semi-supervised, and unsupervised methods. Multi-view data can exploit the complementary strengths of each view and has therefore received widespread attention. Because supervised feature selection requires costly label acquisition, unsupervised feature selection has also attracted broad interest. However, existing unsupervised multi-view feature selection methods still suffer from several problems: they ignore the correlations among features within a view and across views, have low robustness, incur high computational cost, are sensitive to noisy data, and yield low accuracy in downstream models. To address these problems, this thesis studies unsupervised multi-view feature selection. The main contributions are as follows.

First, to improve the robustness of feature selection, this thesis introduces a loss function into the learning model through adaptive learning and uses the dual relationship between loss and sample difficulty to weight the samples. This weighting scheme makes the model more robust. Adaptive learning relies on the predicted label information to adaptively update the weighted Laplacian graph; the updated graph expresses the data structure more accurately and further improves the robustness of feature selection.

Second, to address the weak correlation among features of the same category and the low robustness of the algorithm, this thesis adopts graph regularization. The high-dimensional data are represented as a nearest-neighbor graph, and combining this graph with regularization constraints better preserves the hidden intrinsic structure of the data. Graph regularization exploits the local geometry of the data so that features of the same category are drawn closer together, which increases the robustness of the algorithm.

Third, to reduce the strong influence of noise within a specific view on feature selection, this thesis employs a sparse norm. L1/2 regularization leads to a non-convex optimization problem; by constructing an appropriate auxiliary function, it can be solved with an iterative algorithm, which suppresses noise in the data while maintaining high recovery accuracy. The L1/2 sparse norm is therefore introduced to reduce noise and improve the accuracy of the downstream model.

Based on the above work, this thesis proposes an optimization algorithm to solve the objective function. Finally, features are selected on the MSRC-v1, Outdoor Scene, Handwritten Numeral, and YouTube datasets, and clustering is performed on the selected features. The experimental results show that the proposed method outperforms competing methods in terms of normalized mutual information (NMI) and clustering accuracy (ACC).
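The abstract does not reproduce the thesis's exact objective function, so the following is only a minimal sketch of the general pipeline it describes: build a nearest-neighbor graph per view, score features with a standard Laplacian-based smoothness criterion (used here as a stand-in for the proposed graph-regularized, L1/2-sparse objective), then cluster the selected features and report NMI and ACC. All names and parameters (k, n_selected, the toy data) are illustrative assumptions.

```python
# Sketch only: a Laplacian-score feature ranking per view, not the thesis's
# adaptive, L1/2-regularized objective. Evaluation uses NMI and Hungarian-matched ACC.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score
from sklearn.neighbors import kneighbors_graph


def laplacian_score(X, k=5):
    """Rank features by how smoothly they vary over a k-NN similarity graph."""
    W = kneighbors_graph(X, n_neighbors=k, mode="connectivity", include_self=False)
    W = 0.5 * (W + W.T).toarray()          # symmetrize the adjacency matrix
    D = np.diag(W.sum(axis=1))
    L = D - W                              # unnormalized graph Laplacian
    scores = []
    for j in range(X.shape[1]):
        f = X[:, j] - X[:, j].mean()
        denom = f @ D @ f
        scores.append((f @ L @ f) / denom if denom > 0 else np.inf)
    return np.asarray(scores)              # lower score = smoother feature


def clustering_accuracy(y_true, y_pred):
    """ACC: best one-to-one match between cluster labels and classes (Hungarian)."""
    classes, clusters = np.unique(y_true), np.unique(y_pred)
    cost = np.zeros((len(clusters), len(classes)))
    for i, c in enumerate(clusters):
        for j, t in enumerate(classes):
            cost[i, j] = -np.sum((y_pred == c) & (y_true == t))
    row, col = linear_sum_assignment(cost)
    return -cost[row, col].sum() / len(y_true)


# Toy two-view data; the real experiments use MSRC-v1, Outdoor Scene, etc.
rng = np.random.default_rng(0)
views = [rng.normal(size=(120, 40)), rng.normal(size=(120, 60))]
y_true = np.repeat(np.arange(3), 40)

# Keep the 10 smoothest features in each view, concatenate, and cluster.
selected = [X[:, np.argsort(laplacian_score(X))[:10]] for X in views]
Z = np.hstack(selected)
y_pred = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Z)
print("NMI:", normalized_mutual_info_score(y_true, y_pred))
print("ACC:", clustering_accuracy(y_true, y_pred))
```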
Keywords/Search Tags:Feature selection, graph regularization, adaptive learning, multi-view, unsupervised