
Research On Feature Learning Method Based On Similarity Learning

Posted on: 2024-12-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Shi
Full Text: PDF
GTID: 1528307328966919
Subject: Computer Science and Technology
Abstract/Summary:
Amid today's rapid technological development, data analysis and modeling face the challenge of processing high-dimensional and diverse data. Such complex data places higher demands on the efficiency and performance of models and has driven the development of low-dimensional feature learning methods. These methods aim to remove redundant information and noise from the data and to obtain low-dimensional features with high information content and strong representation ability, thereby improving the understanding and generalization ability of machine learning models. This thesis focuses on feature learning in semi-supervised and unsupervised settings, exploring how to obtain valuable low-dimensional features from the intrinsic structure of the data when labeled samples are scarce or entirely absent. Graph theory plays an important role here: by constructing correlation relationships and performing spectral embedding, it reveals the inherent structure of the data and guides models to learn discriminative features. This thesis examines feature learning methods based on similarity learning and identifies the following issues: in the semi-supervised setting, insufficient supervision information and the joint use of labeled and unlabeled data; in the unsupervised setting, the mining of discriminative semantic information; the handling of multi-view data and out-of-sample data; and the high computational complexity of similarity learning methods together with their inaccurate modeling of data association relationships. To address these problems, this thesis proposes four research works, whose main contents are as follows:

(1) Semi-supervised feature selection: A semi-supervised feature selection model based on binary label learning is proposed. Since the number of labeled samples significantly affects the performance of semi-supervised feature selection, this thesis combines binary hashing with label learning and introduces them into feature selection. The model imposes a binary hash constraint during spectral embedding and learns binary hash codes as pseudo-labels, thereby increasing the number of labels. In addition, a self-weighted sparse regression module is proposed; it uses both the learned pseudo-labels and the given manual labels, and weights them according to their importance, to guide the feature selection process. Finally, an efficient discrete optimization method based on the alternating direction method of multipliers (ADMM) is developed to iteratively optimize the pseudo-labels and the feature selection matrix. Extensive experiments demonstrate that the model has significant advantages in several respects.

(2) Unsupervised feature selection: An unsupervised adaptive feature selection model based on binary hashing is proposed. Many real-world data, such as images and videos, are annotated with multiple labels. Performing feature selection with no label guidance, or with only single-label guidance, may lead to serious information loss, so that the selected features lack semantic information. To solve this problem, the model learns binary hash codes as weakly supervised multi-labels and uses them to guide feature selection. To exploit discriminative information in unsupervised scenarios, the model learns these labels automatically by applying discrete constraints during spectral embedding. The number of these labels (i.e., the number of "1"s in the binary hash codes) is determined adaptively based on the specific data content. In addition, to enhance the discriminative ability of the binary labels, the model adaptively constructs a dynamic similarity graph to model the internal structure of the data. Finally, the algorithm is extended to the multi-view setting, yielding a multi-view feature selection model based on binary hashing. A binary optimization method based on the augmented Lagrange multiplier method is derived to solve the problem iteratively. Extensive experiments on benchmark datasets show that the model achieves state-of-the-art performance on both single-view and multi-view feature selection tasks.

(3) Unsupervised multi-view feature selection: An adaptive collaborative soft label learning model is proposed for unsupervised multi-view feature selection. Learning methods based on graph theory and hard pseudo-labels suffer from high computational complexity and serious information loss when applied to large-scale real-world scenarios. To solve these problems, the model integrates collaborative soft label learning and multi-view feature selection into a unified framework. The model explores the fuzziness between multi-view data, learns soft pseudo-labels from the features of each view in a simple and effective way, and fuses them into a collaborative soft label matrix using an adaptive weighting strategy. This matrix then guides the feature selection process to identify valuable features. Extensive experiments demonstrate the superiority of the model in both the accuracy and the efficiency of feature selection.

(4) Unsupervised multi-view feature learning: A flexible multi-view feature learning model for data clustering is proposed. Existing methods suffer from unstructured graphs and from the separation of graph learning and feature learning, and they cannot effectively handle out-of-sample multi-view data. In particular, they cannot effectively distinguish the discriminative ability of the features of different views, either during multi-view learning or when processing out-of-sample multi-view data. To solve these problems, the model designs an adaptive learning scheme for structured graph learning, multi-view graph fusion, and out-of-sample extension. The model adaptively exploits the complementarity of different view features to learn fusion graphs with good clustering structures under the guidance of appropriate rank constraints. At the same time, the model handles out-of-sample extension by learning multiple projection matrices and adaptively adjusting the view combination weights according to the specific content of the out-of-sample data. Finally, an alternating optimization strategy is derived to ensure the convergence of the unified learning model. Extensive experiments demonstrate the superiority of this model over traditional multi-view feature learning models.
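
The general pipeline that items (1) and (2) build on (spectral embedding of a similarity graph, binary pseudo-labels, and sparse regression that scores features) can be illustrated with a minimal NumPy sketch. This is not the thesis's model: the discrete ADMM and augmented Lagrangian solvers are replaced here by a simple relaxation (thresholding one eigenvector at its median) and by iteratively reweighted least squares for an l2,1-regularized regression. The function names, the dense Gaussian graph, the bandwidth heuristic, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def gaussian_graph(X):
    """Dense Gaussian similarity graph (assumed bandwidth: median squared distance)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    off_diag = ~np.eye(len(X), dtype=bool)
    sigma2 = np.median(d2[off_diag]) + 1e-12
    S = np.exp(-d2 / sigma2)
    np.fill_diagonal(S, 0.0)  # no self-loops
    return S

def binary_pseudo_labels(S, n_bits=1):
    """Relaxed spectral embedding, thresholded at the median to {0, 1} codes."""
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1) + 1e-12)
    # Symmetric normalized graph Laplacian: I - D^{-1/2} S D^{-1/2}
    L = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    emb = vecs[:, 1:n_bits + 1]          # skip the trivial first eigenvector
    return (emb > np.median(emb, axis=0)).astype(float)

def l21_feature_scores(X, Y, lam=1.0, n_iter=30):
    """Score features by the row norms of W, where W approximately minimizes
    ||XW - Y||_F^2 + lam * ||W||_{2,1} (iteratively reweighted least squares)."""
    Xc = X - X.mean(axis=0)              # centering stands in for an intercept
    Yc = Y - Y.mean(axis=0)
    D = np.eye(X.shape[1])
    for _ in range(n_iter):
        W = np.linalg.solve(Xc.T @ Xc + lam * D, Xc.T @ Yc)
        row_norms = np.sqrt(np.sum(W ** 2, axis=1))
        D = np.diag(1.0 / (2.0 * row_norms + 1e-8))  # reweighting step
    return row_norms

# Synthetic data (illustrative only): two clusters that differ in
# features 0 and 1; the remaining eight features are pure noise.
rng = np.random.default_rng(0)
n_per = 40
cluster = np.repeat([0, 1], n_per)
X = rng.normal(size=(2 * n_per, 10))
X[:, :2] = 0.5 * X[:, :2] + np.where(cluster[:, None] == 0, -3.0, 3.0)

labels = binary_pseudo_labels(gaussian_graph(X), n_bits=1)
scores = l21_feature_scores(X, labels)
top2 = sorted(np.argsort(scores)[-2:].tolist())
print("highest-scoring features:", top2)
```

On this toy data the one-bit pseudo-label should coincide with the cluster split (up to swapping 0 and 1), so the two informative features should receive the highest scores. A single bit is used because there are only two clusters; additional bits taken from higher eigenvectors would mostly encode noise in this example.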
Keywords/Search Tags: similarity learning, feature learning, multi-view learning, adaptive learning, pseudo label